Automated Testing (IT): Automation QA Engineer

How to properly automate smoke tests: what are the specifics, what challenges arise during implementation, and how to solve them?


Answer:

Background:

Smoke tests originally emerged as a quick way to verify that the most basic functionality of a system works after a deployment or code change. Their guiding principle is: "if something critical is broken, there is no point in running detailed tests". In automation, the first smoke tests were small scripts that checked application launch, access to the login screen, and a few basic actions.

Problem:

The main challenges in automating smoke tests are isolating a truly minimal set of scenarios, keeping execution fast, minimizing dependence on unstable components (e.g., third-party services), and keeping the suite lightweight and transparent in both its reports and its code. If this is not done, smoke automation either becomes too heavy or frequently raises false alarms and requires significant maintenance.

Solution:

  • Minimize the number of smoke tests: they should only include checks for the most critical "entry points" (e.g., authorization, launching the main module, database availability).

  • Move unstable steps and external dependencies out of smoke scenarios, or stabilize the environment with stubs and mocks.

  • Use tagging (@smoke, Suite('smoke'), etc.) and a dedicated stage in the CI/CD pipeline so that smoke tests run first.
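The tagging idea above can be sketched in plain Python: a decorator registers tests under a "smoke" tag so the runner can pick them out and execute them first. In a real suite this role is played by a framework mechanism such as a pytest marker (`@pytest.mark.smoke`) or a TestNG group; all names below are illustrative.

```python
# Registry of tests tagged as smoke; the decorator fills it at import time.
SMOKE_TESTS = []

def smoke(fn):
    """Tag a test function as part of the smoke suite."""
    SMOKE_TESTS.append(fn)
    return fn

@smoke
def test_login_page_reachable():
    # Stubbed check: a real suite would hit the login endpoint here.
    assert True

def test_full_checkout_flow():
    # Not tagged: belongs to the regression stage, which runs after smoke.
    assert True

def run_smoke():
    """Run only the tagged tests and return their names, in order."""
    for test in SMOKE_TESTS:
        test()
    return [t.__name__ for t in SMOKE_TESTS]
```

With pytest the equivalent selection is a single command line flag (`pytest -m smoke`), which is what the CI smoke stage would invoke.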

Key features:

  • Smoke scenarios should execute quickly and only use the most stable part of the infrastructure.
  • Smoke-level automated tests should not cover UX details or complex workflows.
  • Smoke automation requires strict control over dependencies and minimal support code.

Trick questions.

Can edge-case logic checks be added to the smoke suite?

No, the smoke suite is intended solely for checking that the core of the system is alive and reachable; edge cases do not belong here, as they slow down execution and complicate maintenance.

Is multi-level error handling and recovery needed in smoke tests?

It is often mistakenly believed that complex recovery mechanisms are needed in smoke tests. In reality, if a smoke test fails, it signals a critical issue that needs to be fixed, not "worked around" in the test.
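The fail-fast principle can be contrasted with the retry anti-pattern in a short sketch. `ping_service` is a hypothetical stub standing in for a real health check; in the examples below it is forced to report an unhealthy system.

```python
def ping_service(healthy=True):
    """Hypothetical health probe; returns whether the core system responds."""
    return healthy

def smoke_check_fail_fast():
    # Correct: one attempt. A failure IS the signal; surface it immediately.
    if not ping_service(healthy=False):
        raise AssertionError("critical path down: stop the pipeline")

def smoke_check_with_retries(attempts=3):
    # Anti-pattern for smoke tests: retry loops mask instability and
    # delay the red signal that the pipeline needs right away.
    for _ in range(attempts):
        if ping_service(healthy=False):
            return True
    return False
```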

Should smoke tests depend on data left by other tests?

No, smoke tests should not depend on any external test data, let alone artifacts from other tests. This is one of the key principles of their reliability.
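Data self-sufficiency can be sketched as arrange/act/clean-up inside one test: the smoke test creates the record it needs and removes it afterwards, instead of relying on artifacts left by other tests. The in-memory dictionary is a stand-in for a real data store.

```python
def smoke_test_user_lookup():
    db = {}  # stand-in for the real data store
    # Arrange: the test creates its OWN data rather than assuming it exists.
    db["smoke-user"] = {"name": "Smoke User"}
    try:
        # Act + assert: the core read path is alive.
        assert db["smoke-user"]["name"] == "Smoke User"
    finally:
        # Clean up so nothing leaks into later tests.
        db.pop("smoke-user", None)
    return db
```

In a framework-based suite the same shape is usually expressed as a fixture (e.g., pytest's setup/teardown), but the principle is identical: the test owns its data end to end.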

Common mistakes and anti-patterns

  • Overloading smoke tests: too many scenarios, turning them into regression checks.
  • Code duplication between smoke and regular automated tests.
  • Implicit dependencies: the test uses "dirty" data/artifacts from other scenarios.

Real-life example

Negative case

We added 30 different checks to the smoke suite, some of which exercised not only system startup but also complex algorithms and edge-case conditions. Smoke runs began to take 30 minutes, and some checks failed intermittently due to backend instability.

Pros:

  • Easy to identify bottlenecks in the system.
  • High test coverage immediately after deployment.

Cons:

  • The essence of smoke testing was lost: a "green" build could only be achieved after a long wait and after fixing issues that were not critical for rolling the system out to production.
  • The tests were hard to maintain, and genuinely critical failures were hard to separate from noise.

Positive case

We stripped the smoke suite down to a strict minimum: login, opening the main page, a database query, and a basic API handshake. The smoke suite runs independently of the main test matrix and always executes first in the CI/CD pipeline. Results are posted to a separate chat channel for quick diagnostics.
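The "smoke first" ordering described above can be sketched as a gated pipeline: the heavy regression stage starts only if every smoke check passes. The stage functions are hypothetical stand-ins for real CI jobs (e.g., separate GitLab stages or Jenkins steps).

```python
def run_pipeline(smoke_stage, regression_stage):
    """Run smoke checks first; abort before regression on any failure."""
    executed = []
    for test in smoke_stage:
        executed.append(test.__name__)
        if not test():
            # A failed smoke test stops the run: regression never starts.
            return executed, "failed"
    for test in regression_stage:
        executed.append(test.__name__)
        if not test():
            return executed, "failed"
    return executed, "passed"

def smoke_login():         return True  # login page reachable
def smoke_db_query():      return True  # database answers a trivial query
def regression_reports():  return True  # heavy check, gated behind smoke

order, status = run_pipeline(
    [smoke_login, smoke_db_query],
    [regression_reports],
)
```

This is exactly the 2-3 minute feedback loop described above: a red smoke stage aborts the pipeline within minutes instead of after a full regression run.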

Pros:

  • Quick (2-3 minutes) feedback on system viability.
  • Minimal false positives thanks to isolation and simplicity of the tests.

Cons:

  • Bugs outside the basic paths can be missed if smoke tests are the only checks that run.