How should the interaction layer between microservices be organized when transactional behavior must be supported?

Answer.

When designing interaction between microservices, the question of distributed transactions comes up quickly. A monolith can usually rely on the transaction mechanisms of its single database, but microservices complicate matters: each service may use its own database, technology stack, and data exchange format. The main rule is to avoid long-running distributed transactions, since they scale poorly and risk degrading performance.

Instead of a classical ACID transaction spanning several services, the SAGA pattern is recommended. A saga is a chain of local transactions: each microservice commits its own changes, and if a later step fails, the steps already completed are undone by compensating operations, each executed by the service that owns the data.

A simplified example of saga orchestration in Node.js over REST:

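// Saga orchestrator using Express.
// serviceA and serviceB are placeholder clients for the participating services.
const express = require('express');
const app = express();

app.post('/saga', async (req, res) => {
  let stepADone = false;
  try {
    await serviceA.transaction();
    stepADone = true;
    await serviceB.transaction();
    res.send('ok');
  } catch (err) {
    // Compensate only the steps that actually committed.
    if (stepADone) await serviceA.rollback();
    res.status(500).send('rollback performed');
  }
});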

Key features:

  • SAGA avoids holding locks across services for the duration of the whole operation
  • Extensibility: new steps and their compensating actions can be added
  • Suitable for architectures where each service has its own storage
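
The extensibility point can be made concrete with a generic runner. The following is a minimal sketch, assuming each step is an object with a name, an action, and a compensate function. The runSaga helper, the step names, and the in-memory log are hypothetical and stand in for real service calls.

```javascript
// Hypothetical helper: runs saga steps in order and, on failure,
// compensates the already completed steps in reverse order.
async function runSaga(steps) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.action();
      completed.push(step);
    } catch (err) {
      // Undo committed local transactions, newest first.
      for (const done of completed.reverse()) {
        await done.compensate();
      }
      return { ok: false, failedStep: step.name, error: err };
    }
  }
  return { ok: true };
}

// In-memory stand-ins for real service calls (assumptions for illustration).
const log = [];
const steps = [
  {
    name: 'reserveStock',
    action: async () => { log.push('stock reserved'); },
    compensate: async () => { log.push('stock released'); },
  },
  {
    name: 'chargePayment',
    action: async () => { throw new Error('card declined'); },
    compensate: async () => { log.push('payment refunded'); },
  },
];
```

Calling runSaga(steps) here fails at chargePayment and releases the reserved stock. Adding a step means appending one more object to the array, and the runner itself stays unchanged. The sketch assumes that compensations succeed.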

Tricky questions.

Can classical two-phase commits (2PC) be used between microservices to support transactions?

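Yes, technically, but it is not recommended for microservice architectures. 2PC is a blocking protocol: every participant holds its locks until the coordinator announces the outcome, which limits throughput and makes the coordinator a single point of failure. It also couples services technologically and organizationally, since all of them must support the same commit protocol.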
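
To show where the coupling comes from, here is a minimal in-memory sketch of a 2PC coordinator. The twoPhaseCommit and makeParticipant names are hypothetical. A real implementation would also need durable logging and timeout handling, which are omitted.

```javascript
// Hypothetical 2PC coordinator: every participant must expose
// prepare/commit/abort, which is the coupling the answer refers to.
async function twoPhaseCommit(participants) {
  const prepared = [];
  // Phase 1: each participant votes; a "yes" voter keeps its locks.
  for (const p of participants) {
    let vote = false;
    try {
      vote = await p.prepare();
    } catch {
      vote = false;
    }
    if (!vote) {
      for (const q of prepared) await q.abort();
      return 'aborted';
    }
    prepared.push(p);
  }
  // Phase 2: participants stay locked until the coordinator reaches them.
  for (const p of participants) await p.commit();
  return 'committed';
}

// In-memory participant used only for illustration.
function makeParticipant(name, canCommit) {
  return {
    name,
    locked: false,
    state: 'idle',
    async prepare() {
      if (!canCommit) { this.state = 'aborted'; return false; }
      this.locked = true;
      this.state = 'prepared';
      return true;
    },
    async commit() { this.state = 'committed'; this.locked = false; },
    async abort() { this.state = 'aborted'; this.locked = false; },
  };
}
```

Between a successful prepare and the later commit or abort, a participant keeps locked set to true. If the coordinator stalls in that window, every prepared service waits with it.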

Are all SAGA scenarios suitable for financial transactions?

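No. A saga does not provide isolation: intermediate states are visible to other operations before the saga completes. This makes it harder to implement correctly where strict ACID guarantees are critical, for example when recalculating the balance of a single account. Assess the risks and choose a compromise deliberately.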
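
The balance example can be reproduced with a small simulation. This is a sketch under stated assumptions: the account object and the withdraw and demo functions are hypothetical, and the awaited promise stands in for the gap between two local transactions of the same saga.

```javascript
// In-memory stand-in for an account service (an assumption for illustration).
const account = { balance: 100 };

// A withdrawal split into two local transactions: read, then write.
async function withdraw(amount) {
  const seen = account.balance;     // local transaction 1: read the balance
  await Promise.resolve();          // other sagas may interleave here
  account.balance = seen - amount;  // local transaction 2: write from a stale read
}

// Runs two withdrawals concurrently and returns the final balance.
async function demo() {
  await Promise.all([withdraw(30), withdraw(50)]);
  return account.balance; // 50: the withdrawal of 30 is lost (20 was expected)
}
```

Both withdrawals read a balance of 100 before either writes, so the second write overwrites the first. A single ACID transaction on one database would prevent this. In a saga it has to be handled explicitly.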

How is the failure of an intermediate step handled if a microservice cannot roll back?

In such cases, a manual or automated compensation protocol is needed, possibly backed by a separate event log that records which steps still have to be undone.
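
One possible shape of such a protocol is sketched below: retry the compensating operation a few times, record every attempt in an event log, and flag the step for manual handling when retries run out. The compensateWithRetry function, the eventLog array, and the status names are hypothetical. A real system would use durable storage and a delay between attempts.

```javascript
// Hypothetical in-memory event log; a real system would use durable storage.
const eventLog = [];

// Retries a step's compensation and records every attempt in the log.
async function compensateWithRetry(step, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await step.compensate();
      eventLog.push({ step: step.name, status: 'compensated', attempt });
      return true;
    } catch (err) {
      eventLog.push({
        step: step.name,
        status: 'compensation_failed',
        attempt,
        reason: err.message,
      });
    }
  }
  // Automated recovery is exhausted: flag the step for manual handling.
  eventLog.push({ step: step.name, status: 'needs_manual_intervention' });
  return false;
}
```

An operator or a background job can then scan the log for needs_manual_intervention entries and resolve them outside the normal request path.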