History of the question:
Test automation is inseparable from the need to create and maintain predictable, reproducible test data. Manual tests can work with arbitrary data, but automated scripts require precise control over the state of the database and environment. Growing application scale, microservice architectures, and privacy regulations have made test data management even more complex.
The problem:
Without controlled test data, tests become unstable and their results unrepresentative. Typical situations include:
- A test passes in isolation but fails when the whole suite runs.
- One test modifies or deletes data that another test depends on.
- Results change depending on execution order or on which environment the suite runs in.
Moreover, using real data can violate security or privacy policies.
Solutions:
Modern approaches include:
- Synthetic test data generation instead of copies of production data.
- Fixtures and factory functions that build exactly the records a test needs.
- Database migration scripts that bring every environment to a known schema and state.
- Isolated or ephemeral environments (for example, Docker-based database copies) per test or per team.
Key features:
- Isolation: tests do not share mutable state.
- Reproducibility: the same input data produces the same results on every run.
- Automation: data setup and teardown happen in code, not by hand.
- Privacy compliance: no real personal data reaches test environments.
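As an illustration of synthetic test data generation (the record shape and field names below are invented for this sketch), a seeded random generator keeps the data both artificial and reproducible:

```python
import random
import string

def make_user(seed=None):
    """Generate a synthetic, reproducible user record.

    Seeding the RNG makes the data deterministic across runs,
    which keeps test assertions stable.
    """
    rng = random.Random(seed)
    name = "".join(rng.choices(string.ascii_lowercase, k=8))
    return {
        "name": name,
        "email": f"{name}@example.test",
        "age": rng.randint(18, 90),
    }

# The same seed always yields the same record:
assert make_user(seed=42) == make_user(seed=42)
```

Deterministic generation gives the variety of "realistic" data without copying anything from production.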
Can real data from the production environment be used for automated tests?
No. This can lead to data leaks, regulatory violations, and flaky tests caused by the constantly changing state of the production system.
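When production-like data shapes are genuinely needed, a common mitigation (an assumption of this sketch, not something the text above prescribes) is to anonymize sensitive fields before the data reaches any test environment. A minimal example with invented field names:

```python
import hashlib

def mask_record(record, sensitive_fields=("name", "email", "phone")):
    """Return a copy of `record` with sensitive fields replaced by
    deterministic pseudonyms.

    Hashing keeps relationships between rows intact (the same input
    always maps to the same pseudonym) while removing the real
    personal data.
    """
    masked = dict(record)
    for field in sensitive_fields:
        if field in masked and masked[field] is not None:
            digest = hashlib.sha256(str(masked[field]).encode()).hexdigest()[:12]
            masked[field] = f"{field}_{digest}"
    return masked

# Hypothetical record for illustration:
user = {"id": 7, "name": "Alice", "email": "alice@corp.example"}
print(mask_record(user))
```

Because the masking is deterministic, joins and foreign-key relationships in a masked dataset still line up across tables.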
Will simply clearing all data between tests guarantee test stability?
No. Data must not only be cleared but also brought into the exact state each test requires. Moreover, mass deletion can affect concurrently running tests or services that share the database.
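One common way to get both a prepared state and reliable cleanup, sketched here with SQLite purely for illustration, is to run each test inside a transaction that is always rolled back:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def isolated_test_data(conn):
    """Run a test body inside a transaction that is always rolled back.

    Rollback restores the database to its pre-test state, so no
    hand-written cleanup is needed and leftover data cannot leak
    into the next test.
    """
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        conn.rollback()

# isolation_level=None puts sqlite3 in autocommit mode, so the
# explicit BEGIN above fully controls transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

with isolated_test_data(conn) as c:
    c.execute("INSERT INTO users (name) VALUES ('test-user')")
    assert c.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1

# The insert was rolled back; the table is empty again.
assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
```

Unlike mass deletion, the rollback touches only this connection's own changes, so concurrent tests on other connections are unaffected.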
Is having one test environment for all teams sufficient?
No. This leads to collisions and conflicts between tests from different teams. Isolated environments or containerization (Docker-based test suites, ephemeral environments) are the better option.
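The ephemeral-environment idea can be shown in miniature with SQLite standing in for a containerized database (the schema and seed rows are invented for this sketch): each test builds its own fresh copy, so nothing leaks between tests.

```python
import sqlite3

SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)"
SEED = [("paid",), ("pending",)]

def fresh_database():
    """Create a brand-new in-memory database with schema and seed data.

    Each call is the miniature equivalent of spinning up an ephemeral
    database container: tests never share state.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO orders (status) VALUES (?)", SEED)
    conn.commit()
    return conn

def test_mark_paid():
    conn = fresh_database()
    conn.execute("UPDATE orders SET status = 'paid' WHERE status = 'pending'")
    paid = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status = 'paid'").fetchone()[0]
    assert paid == 2

def test_pending_count_is_unaffected_by_other_tests():
    conn = fresh_database()  # the previous test's UPDATE is invisible here
    pending = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status = 'pending'").fetchone()[0]
    assert pending == 1

test_mark_paid()
test_pending_count_is_unaffected_by_other_tests()
```

The second test passes regardless of what the first one did, which is exactly the property a shared environment cannot guarantee.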
The testing team used a single test database where both automated and manual tests ran. Automated tests frequently failed because manually deleted or modified data broke their assumptions, leading to long debugging sessions and lost time.
Pros:
- Minimal infrastructure: one database to set up and maintain.
- Manual and automated testers work with the same, familiar data set.
Cons:
- No isolation: manual changes silently break automated runs.
- Failures are hard to reproduce, so debugging is slow and expensive.
The company implemented an infrastructure of ephemeral environments: each test ran on a separate copy of the database, deployed via Docker. Fixtures were automatically loaded using migration scripts.
Pros:
- Full isolation: every test gets its own database copy, so runs are stable and reproducible.
- Migration scripts and automatically loaded fixtures guarantee a known starting state.
Cons:
- More infrastructure to build and maintain (Docker images, orchestration, migration tooling).
- Spinning up a fresh environment for each test adds execution time and resource cost.
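The preparation step in this case study, replaying migrations and then loading fixtures into a fresh database copy, can be sketched as follows, with SQLite as a lightweight stand-in for the containerized database (the table names, migrations, and fixture rows are invented for illustration):

```python
import sqlite3

# Hypothetical ordered migration scripts; in a real project these
# would live in versioned .sql files applied in sequence.
MIGRATIONS = [
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE customers ADD COLUMN email TEXT",
]

# Hypothetical fixture rows loaded after the schema is in place.
FIXTURES = [
    ("Acme Corp", "contact@acme.test"),
    ("Globex", "info@globex.test"),
]

def deploy_test_database():
    """Build a throwaway database by replaying migrations in order,
    then loading a known fixture set.

    This mirrors how each ephemeral environment is prepared before
    a test run: same migrations, same fixtures, same starting state.
    """
    conn = sqlite3.connect(":memory:")
    for statement in MIGRATIONS:
        conn.execute(statement)
    conn.executemany(
        "INSERT INTO customers (name, email) VALUES (?, ?)", FIXTURES)
    conn.commit()
    return conn

conn = deploy_test_database()
rows = conn.execute("SELECT name, email FROM customers ORDER BY id").fetchall()
print(rows)  # the two fixture rows, in a predictable order
```

Because the schema always comes from the same ordered migrations, every environment copy starts identical, which is what makes the test results reproducible.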