Test automation historically began with the goal of increasing speed and removing human error from checks, but it quickly became apparent that automated tests often behave differently from run to run. Repeatability and determinism are basic quality requirements for automated tests: the same conditions must yield the same result.
Problems arise from implicit dependencies: unstable test data, unsynchronized environments, parallel processes, and external services. The result is flaky tests, whose outcomes are unpredictable from run to run.
Solutions revolve around strict control of the execution environment, isolation of tests, mock/stub objects, static data, and reproducible scenarios (e.g., clearing and preparing the database before each test).
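The "clear and prepare the database before each test" idea can be sketched with the standard library alone. This is a minimal illustration, not a prescribed implementation: the `users` table and seed row are hypothetical, and an in-memory SQLite database stands in for a real test database.

```python
import sqlite3
import unittest

class UserTests(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database for every test: known schema, known seed data.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.conn.execute("INSERT INTO users (name) VALUES ('alice')")
        self.conn.commit()

    def tearDown(self):
        # Explicit teardown: no state leaks into the next test.
        self.conn.close()

    def test_user_count(self):
        # Deterministic: setUp guarantees exactly one seeded row.
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 1)
```

Because state is rebuilt in `setUp` and discarded in `tearDown`, the tests pass regardless of execution order or what ran before them.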
Common questions and answers:
What to do if a flaky test occurs only in CI, while everything is stable locally?
This is almost always caused by differences between the environments: dependency versions, infrastructure speed, parallelism, OS settings, or test execution order. The fix is to bring the CI environment as close as possible to the dev machine: run tests in Docker, pin identical seed values, and prepare and clean state in setUp/tearDown.
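One cheap, common source of CI-only flakiness is unpinned randomness: locally the same process keeps reusing one seed, while each CI run gets a different one. A minimal sketch of pinning the seed in setUp (the test itself is hypothetical):

```python
import random
import unittest

class SampleTests(unittest.TestCase):
    def setUp(self):
        # Pin the source of randomness so CI and the dev machine
        # generate identical "random" data on every run.
        random.seed(42)

    def test_sample_is_reproducible(self):
        first = random.sample(range(100), 5)
        random.seed(42)  # same seed, same sequence
        second = random.sample(range(100), 5)
        self.assertEqual(first, second)
```

The same principle applies to any nondeterministic input: time, UUIDs, hash ordering, thread scheduling; each one either gets pinned or isolated behind a test double.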
Can parameterized tests be considered completely deterministic if the data is taken from a database?
No. Even if the data nominally matches, the database can change between runs or releases. For true determinism, each test must explicitly prepare and clean its own data.
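A sketch of what "each parameter set prepares its own data" can look like, using `unittest`'s `subTest` in place of a framework-specific parameterization feature (the table schema and cases are hypothetical):

```python
import sqlite3
import unittest

# Parameter sets live in the test code, not in a shared database.
CASES = [("alice", 1), ("bob", 1)]

class ParamTests(unittest.TestCase):
    def test_lookup(self):
        for name, expected in CASES:
            with self.subTest(name=name):
                # Each case builds and tears down its own data,
                # so no case depends on what another one left behind.
                conn = sqlite3.connect(":memory:")
                conn.execute("CREATE TABLE users (name TEXT)")
                conn.execute("INSERT INTO users VALUES (?)", (name,))
                count = conn.execute(
                    "SELECT COUNT(*) FROM users WHERE name = ?", (name,)
                ).fetchone()[0]
                self.assertEqual(count, expected)
                conn.close()
```

The key point is the direction of data flow: the test pushes known data into the store, rather than pulling whatever the store happens to contain.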
Will using sleep to wait for elements to load solve the instability problem in UI tests?
No. Sleep merely masks the problem and slows down the whole suite. Use explicit waits, which poll for a specific condition, rather than pausing for a fixed time.
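The difference between a fixed sleep and an explicit wait can be shown framework-free. UI libraries such as Selenium ship their own version of this (`WebDriverWait`); below is a generic polling helper to make the mechanics visible:

```python
import time

def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` expires.

    Unlike a fixed sleep, this returns as soon as the condition holds,
    and fails loudly (TimeoutError) instead of silently racing.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout}s")

# Example: a condition that becomes true only after a few polls.
flips = iter([False, False, True])
assert wait_until(flips.__next__) is True
```

A fixed `sleep(3)` either wastes time (element appeared after 0.2 s) or still races (element took 3.1 s); the explicit wait does neither, and a timeout becomes a clear failure rather than a mystery.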
In one project, UI tests ran against an environment that nobody cleaned up after manual testing. Every few nights the tests failed with "random" errors that could not be reproduced locally. The team added sleep calls to the tests and ignored the flaky failures.
Pros:
- No immediate effort: the suite keeps "passing" most of the time.
Cons:
- The root cause, a shared dirty environment, remains, so failures keep recurring.
- Sleep calls slow the whole suite and mask real regressions.
After an experienced DevOps engineer joined the team, scripts were implemented to reset and initialize test data, mock services were added for unstable integrations, and tests were run in containers. Flaky tests virtually disappeared.
Pros:
- Runs are deterministic and reproducible; a failure now points to a real defect.
Cons:
- Upfront investment in reset scripts, mocks, and container infrastructure.
- Mocks must be kept in sync with the real services they replace.
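The "mock services for unstable integrations" step can be illustrated with the standard library's `unittest.mock`. The `fetch_exchange_rate` function and the `/rates/USD` endpoint are hypothetical stand-ins for whatever unstable integration the project depends on:

```python
from unittest import mock

# Hypothetical production code that calls a flaky external service.
def fetch_exchange_rate(client):
    return client.get("/rates/USD")["rate"]

def test_fetch_exchange_rate():
    # The real service is slow and occasionally down;
    # a mock pins the response so the test is deterministic.
    client = mock.Mock()
    client.get.return_value = {"rate": 1.07}
    assert fetch_exchange_rate(client) == 1.07
    client.get.assert_called_once_with("/rates/USD")

test_fetch_exchange_rate()
```

The mocked test verifies the code's logic and its contract with the service (the exact call made), while a separate, smaller set of integration tests can exercise the real endpoint outside the fast feedback loop.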