History of the Issue
With the transition to microservices architectures and distributed systems, the likelihood of errors occurring during interaction between services, as well as the complexity of handling them, has sharply increased. Early approaches often overlooked the instability of network interaction, resulting in large-scale incidents in production.
The Problem
The key issue is that complex failure scenarios, service degradations, and integration errors are insufficiently formalized in requirements. As a result, developers are forced to make decisions about error handling at their discretion, leading to a diversity of cases and difficulties in testing them.
The Solution
Effective error handling descriptions should include:
Key Features:
Is it mandatory to describe technical error handling in requirements — isn’t this the developer's task?
It is mandatory. An unreflected error-handling policy often leads to operational errors and misunderstandings. A system analyst must discuss the behavior in case of errors.
Should cases that occur very rarely (e.g., partial loss of communication between services) be described?
Yes, because rarely occurring errors lead to the most complex incidents. Their consequences can be critical for the business.
Is it required to coordinate with the business the messages displayed to users in case of errors?
Yes. Correct, informative, but not excessive or alarming messages should be coordinated with the business; otherwise, the user experience and loyalty suffer.
Negative case: The project did not describe timeout handling scenarios between services. As a result of an unstable network, services "hung" without a response. Pros: Fast execution of main scenarios. Cons: Massive failures in production, negative feedback from clients, "manual" closure of incidents.
Positive case: The analyst described degradation and restart scenarios, retries, and correct messages. Pros: High service stability during failures, reduced number of incidents. Cons: More time spent on scenario architecture development.