Test automation historically began with the goal of increasing speed and removing human error from checks, but it quickly became apparent that automated tests often behave differently from run to run. Repeatability and determinism are basic quality requirements for automated tests: the same conditions must yield the same result.
Problems arise from implicit dependencies: unstable test data, unsynchronized environments, parallel processes, and external services. The result is flaky tests, whose outcomes are unpredictable from run to run.
Solutions revolve around strict control of the execution environment, isolation of tests, mock/stub objects, static data, and reproducible scenarios (e.g., clearing and preparing the database before each test).
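The "clear and prepare the database before each test" idea can be sketched with the standard library alone. This is a minimal illustration, not a prescribed implementation: the `users` table and seed row are hypothetical, and an in-memory SQLite database stands in for a real test database.

```python
import sqlite3
import unittest

class UserTests(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database for every test: known schema, known seed data.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.conn.execute("INSERT INTO users (name) VALUES ('alice')")
        self.conn.commit()

    def tearDown(self):
        # Explicit teardown: no state leaks into the next test.
        self.conn.close()

    def test_user_count(self):
        # Deterministic: setUp guarantees exactly one seeded row.
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 1)
```

Because state is rebuilt in `setUp` and discarded in `tearDown`, the tests pass regardless of execution order or what ran before them.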
Common questions and answers:
What to do if a flaky test occurs only in CI, while everything is stable locally?
This is almost always caused by differences between the environments: dependency versions, infrastructure speed, parallelism, OS settings, or test execution order. The fix is to bring the CI environment as close as possible to the dev machine: run tests in Docker, pin identical seed values, and prepare and clean state in setUp/tearDown.
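One cheap, common source of CI-only flakiness is unpinned randomness: locally the same process keeps reusing one seed, while each CI run gets a different one. A minimal sketch of pinning the seed in setUp (the test itself is hypothetical):

```python
import random
import unittest

class SampleTests(unittest.TestCase):
    def setUp(self):
        # Pin the source of randomness so CI and the dev machine
        # generate identical "random" data on every run.
        random.seed(42)

    def test_sample_is_reproducible(self):
        first = random.sample(range(100), 5)
        random.seed(42)  # same seed, same sequence
        second = random.sample(range(100), 5)
        self.assertEqual(first, second)
```

The same principle applies to any nondeterministic input: time, UUIDs, hash ordering, thread scheduling; each one either gets pinned or isolated behind a test double.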
Can parameterized tests be considered completely deterministic if the data is taken from a database?
No. Even if the data nominally matches, the database can change between runs or releases. For true determinism, each test must explicitly prepare and clean its own data.
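A sketch of what "each parameter set prepares its own data" can look like, using `unittest`'s `subTest` in place of a framework-specific parameterization feature (the table schema and cases are hypothetical):

```python
import sqlite3
import unittest

# Parameter sets live in the test code, not in a shared database.
CASES = [("alice", 1), ("bob", 1)]

class ParamTests(unittest.TestCase):
    def test_lookup(self):
        for name, expected in CASES:
            with self.subTest(name=name):
                # Each case builds and tears down its own data,
                # so no case depends on what another one left behind.
                conn = sqlite3.connect(":memory:")
                conn.execute("CREATE TABLE users (name TEXT)")
                conn.execute("INSERT INTO users VALUES (?)", (name,))
                count = conn.execute(
                    "SELECT COUNT(*) FROM users WHERE name = ?", (name,)
                ).fetchone()[0]
                self.assertEqual(count, expected)
                conn.close()
```

The key point is the direction of data flow: the test pushes known data into the store, rather than pulling whatever the store happens to contain.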
Will using sleep to wait for elements to load solve the instability problem in UI tests?
No. Sleep merely masks the problem and slows down the whole suite. Use explicit waits, which poll for a specific condition, rather than pausing for a fixed time.
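The difference between a fixed sleep and an explicit wait can be shown framework-free. UI libraries such as Selenium ship their own version of this (`WebDriverWait`); below is a generic polling helper to make the mechanics visible:

```python
import time

def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` expires.

    Unlike a fixed sleep, this returns as soon as the condition holds,
    and fails loudly (TimeoutError) instead of silently racing.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout}s")

# Example: a condition that becomes true only after a few polls.
flips = iter([False, False, True])
assert wait_until(flips.__next__) is True
```

A fixed `sleep(3)` either wastes time (element appeared after 0.2 s) or still races (element took 3.1 s); the explicit wait does neither, and a timeout becomes a clear failure rather than a mystery.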
In one project, UI tests ran against an environment that nobody cleaned up after manual testing. Every few nights the tests failed with "random" errors that could not be reproduced locally. The team added sleep calls to the tests and ignored the flaky failures.
Pros:
- No immediate effort: the suite keeps "passing" most of the time.
Cons:
- The root cause, a shared dirty environment, remains, so failures keep recurring.
- Sleep calls slow the whole suite and mask real regressions.
After an experienced DevOps engineer joined the team, scripts were implemented to reset and initialize test data, mock services were added for unstable integrations, and tests were run in containers. Flaky tests virtually disappeared.
Pros:
- Runs are deterministic and reproducible; a failure now points to a real defect.
Cons:
- Upfront investment in reset scripts, mocks, and container infrastructure.
- Mocks must be kept in sync with the real services they replace.
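The "mock services for unstable integrations" step can be illustrated with the standard library's `unittest.mock`. The `fetch_exchange_rate` function and the `/rates/USD` endpoint are hypothetical stand-ins for whatever unstable integration the project depends on:

```python
from unittest import mock

# Hypothetical production code that calls a flaky external service.
def fetch_exchange_rate(client):
    return client.get("/rates/USD")["rate"]

def test_fetch_exchange_rate():
    # The real service is slow and occasionally down;
    # a mock pins the response so the test is deterministic.
    client = mock.Mock()
    client.get.return_value = {"rate": 1.07}
    assert fetch_exchange_rate(client) == 1.07
    client.get.assert_called_once_with("/rates/USD")

test_fetch_exchange_rate()
```

The mocked test verifies the code's logic and its contract with the service (the exact call made), while a separate, smaller set of integration tests can exercise the real endpoint outside the fast feedback loop.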