Automated Testing (IT): Automation Performance Test Engineer

Describe the process and nuances of performance testing automation: history, issues, solutions.


Answer.

Historically, software performance was tested only after the main functional testing was complete — developers either executed scripts manually or collected metrics using tools like JMeter. With the mass transition to DevOps and CI/CD, there arose a need to automate these processes and obtain metrics at each stage of delivery.

The problem is complicated by the fact that load automation is not just running ready-made tests, but also fine-tuning load scenarios, reproducing user profiles, emulating real network conditions, and accounting for latency and the limitations of test hardware.

The modern solution involves adopting specialized tools (e.g., Gatling, Locust, k6), creating scenarios with parameterized user profiles, integrating performance testing into CI pipelines, automating metric collection and analysis, and setting alerts for performance degradation.
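Tools like Locust express such scenarios as Python classes; the core idea — weighted user profiles issuing requests with think time in between — can be sketched in plain Python. Everything here is hypothetical: the profile names, weights, and the `fake_request` stub standing in for real HTTP calls.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

# Weighted user profiles with think-time ranges (names and values are made up).
USER_PROFILES = {
    "browser": {"weight": 3, "think_time": (0.001, 0.005)},
    "reporter": {"weight": 1, "think_time": (0.005, 0.010)},
}

def fake_request() -> float:
    """Stand-in for a real HTTP call; returns the measured latency in seconds."""
    start = time.perf_counter()
    time.sleep(random.uniform(0.001, 0.003))  # simulate a network round trip
    return time.perf_counter() - start

def run_user(profile: str, n_requests: int) -> list:
    """Emulate one user of the given profile issuing n_requests calls."""
    lo, hi = USER_PROFILES[profile]["think_time"]
    latencies = []
    for _ in range(n_requests):
        latencies.append(fake_request())
        time.sleep(random.uniform(lo, hi))  # think time between actions
    return latencies

def run_load(users: int = 4, n_requests: int = 5) -> list:
    """Spread users across profiles by weight and run them concurrently."""
    weighted = [name for name, cfg in USER_PROFILES.items()
                for _ in range(cfg["weight"])]
    profiles = [random.choice(weighted) for _ in range(users)]
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = pool.map(run_user, profiles, [n_requests] * users)
    return [lat for lats in per_user for lat in lats]
```

In a real tool the same structure appears as user classes, task weights, and `wait_time` settings; the value of the framework is precisely that these knobs are declarative and repeatable.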

Key features:

  • Correct configuration of load scenarios (repeatable and close to real-world conditions).
  • Metric analysis (distinguishing benchmark, stress, and soak/long-duration tests) and automating their collection.
  • Assessing the impact of testing results on overall delivery quality and SLA compliance.

Tricky Questions.

Is it true that automated load tests may only be run in production?

No. Automated performance and stress tests can be run against a dedicated staging environment to avoid violating production SLAs. Automating them before release is preferable.

If load automated tests pass, can one be confident in the real user experience?

No — automated tests provide only an averaged picture. The behavior of real users may differ due to network conditions, platforms used, and other factors that are difficult to emulate exactly.

Should one focus only on average response-time values?

No. It's extremely important to consider percentiles (e.g., 95th and 99th), as averages can be skewed by outliers, and tail values often impact UX the most.
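A quick standard-library illustration of why averages mislead (the latency values are made up): when 5% of requests are very slow, the mean still looks healthy while the tail exposes the problem.

```python
from statistics import mean, quantiles

# Made-up latency sample: 95% of requests at 100 ms, 5% at 2000 ms.
latencies_ms = [100] * 95 + [2000] * 5

cuts = quantiles(latencies_ms, n=100)  # 99 percentile cut points
p95, p99 = cuts[94], cuts[98]

print(f"mean={mean(latencies_ms):.0f} ms")     # 195 ms looks fine
print(f"p95={p95:.0f} ms, p99={p99:.0f} ms")   # the tail tells the truth
```

Here the mean is 195 ms, while p95 and p99 land near 2000 ms: one user in twenty waits twenty times longer than the "average" suggests.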

Common Mistakes and Anti-Patterns

  • Focusing only on simple scenarios like "login/logout" without emulating real business operations.
  • Ignoring the analysis of worst-case scenarios (tail latency).
  • Insufficient analysis of dependencies (e.g., third-party services left enabled during tests, along with their rate limits).

Real-Life Example

Negative Case

A company automated performance tests only for system login: scripts executed 1,000 logins, the average response time was analyzed, and the team concluded the problem was solved. At the first real launch there were mass timeouts: it turned out that parallel "heavy" business operations had not been taken into account, and the API collapsed under load.

Pros:

  • Quick confirmation of the functionality of trivial scenarios.

Cons:

  • Ignoring critical user journeys led to a production incident.
  • A false sense of stability.

Positive Case

In another team, the entire load profile was built from production monitoring, with separate scripts emulating peak activity from various devices and networks. All results were automatically compared to a baseline; deviations over 5% triggered an alert and paused the release.
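The gating logic described above fits in a few lines. This is a minimal sketch under stated assumptions: the metric names, baseline values, and the "higher is worse" convention are all hypothetical; a real pipeline would load the baseline from earlier runs and treat each metric's direction explicitly.

```python
# Hypothetical baseline captured from previous accepted runs.
BASELINE = {"p95_ms": 420.0, "error_rate": 0.002}
THRESHOLD = 0.05  # pause the release on >5% degradation

def degraded(current: dict, baseline: dict) -> list:
    """Return names of metrics that worsened by more than THRESHOLD.

    Assumes every metric is "higher is worse" (latency, error rate).
    """
    return [
        name
        for name, base in baseline.items()
        if base > 0 and (current[name] - base) / base > THRESHOLD
    ]

current_run = {"p95_ms": 455.0, "error_rate": 0.002}  # ~8% slower p95
regressions = degraded(current_run, BASELINE)
if regressions:
    print("release paused, degraded metrics:", regressions)
```

In CI this check would run after the load stage and fail the job (pausing the release) when the returned list is non-empty.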

Pros:

  • Preventing quality degradation before deployment.
  • Improved monitoring and better communication with business stakeholders regarding SLAs.

Cons:

  • Requires constant updating of load profiles.
  • High resource consumption of test environments.