Designing an SLA (Service Level Agreement) at the architecture level means defining controlled, measurable, and monitorable quality indicators for each service. At the architectural design stage, the key SLA parameters and the technical mechanisms for measuring them are determined.
Basic steps:
Example of defining SLA for a web service:
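A minimal sketch of such a definition in Python. The `SlaTarget` class, the `percentile` and `evaluate` helpers, and any target values passed to them are illustrative assumptions, not figures from a real agreement. The idea is only to show SLA parameters expressed as data and checked against measured values.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTarget:
    """Hypothetical SLA targets for one web service."""
    availability_pct: float  # minimum share of successful requests, in percent
    p95_latency_ms: float    # maximum allowed 95th-percentile latency, in ms


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def evaluate(target, latencies_ms, error_count):
    """Compare measured values for one reporting window against the targets."""
    total = len(latencies_ms)
    if total == 0:
        raise ValueError("no requests in the reporting window")
    availability = 100.0 * (total - error_count) / total
    p95 = percentile(latencies_ms, 95)
    return {
        "availability_pct": availability,
        "p95_latency_ms": p95,
        "availability_ok": availability >= target.availability_pct,
        "latency_ok": p95 <= target.p95_latency_ms,
    }
```

In practice the latency samples and error counts would come from the monitoring system rather than from in-memory lists, and the reporting window (for example, a calendar month) would be part of the agreement itself.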
Key features:
Can an SLA be built solely on technical metrics (e.g., error rate and response time)?
Answer: No. Business metrics (e.g., the success rate of business operations) must also be considered so that the SLA meets business expectations.
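As a rough illustration, the sketch below treats the SLA as met only when both a technical indicator and a business indicator reach their thresholds. The function name `check_sla`, its parameters, and the default thresholds are hypothetical.

```python
def check_sla(http_ok, http_total, ops_ok, ops_total,
              min_http_rate=0.999, min_ops_rate=0.98):
    """Combine a technical and a business success rate into one verdict.

    http_ok / http_total: requests answered without a server error.
    ops_ok / ops_total: business operations (e.g. payments) that completed.
    The default thresholds are assumptions chosen for illustration only.
    """
    if http_total == 0 or ops_total == 0:
        raise ValueError("totals must be greater than zero")
    technical_ok = http_ok / http_total >= min_http_rate
    business_ok = ops_ok / ops_total >= min_ops_rate
    return {
        "technical_ok": technical_ok,
        "business_ok": business_ok,
        "sla_met": technical_ok and business_ok,
    }
```

For instance, `check_sla(10000, 10000, 900, 1000)` reports the technical indicator as satisfied while the overall SLA is not met, because only 90% of business operations completed.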
Is meeting an SLA a static process that requires no adjustments after the system is launched?
Answer: No. An SLA is revised as the business changes, load grows, and new requirements appear.
Can SLA monitoring rely solely on results from external systems (ping, HTTP checks) without agents inside the services?
Answer: Not recommended. External monitoring is important, but internal collection (agents gathering internal metrics) makes it possible to detect hidden issues before they become noticeable externally.
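The difference can be sketched as follows. The `external_check` function and the `InternalMetrics` class are hypothetical stand-ins for an HTTP probe and an in-service agent, and a steadily growing queue is just one assumed example of a hidden issue.

```python
from collections import deque


def external_check(status_code):
    """What an outside probe sees: only whether the endpoint answered 200."""
    return status_code == 200


class InternalMetrics:
    """In-service collector that keeps the last `window` queue-depth samples."""

    def __init__(self, window=5):
        self.window = window
        self.queue_depths = deque(maxlen=window)

    def record_queue_depth(self, depth):
        self.queue_depths.append(depth)

    def is_degrading(self):
        """True when the window is full and every sample exceeds the previous one."""
        samples = list(self.queue_depths)
        if len(samples) < self.window:
            return False
        return all(a < b for a, b in zip(samples, samples[1:]))


# The endpoint still answers 200 while the internal backlog keeps growing.
metrics = InternalMetrics(window=3)
for depth in (10, 20, 40):
    metrics.record_queue_depth(depth)
probe_healthy = external_check(200)
early_warning = metrics.is_degrading()
```

Here the external probe reports a healthy service while the internal collector already signals a growing backlog, which is the kind of early warning the answer above refers to.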