SLI / SLO / SLA Definitions

Term	Definition	Example
SLI (Indicator)	The measurable metric that indicates service health	Request success rate, latency p99, error rate
SLO (Objective)	Target value for the SLI over a time window	99.9% availability over 30 days
SLA (Agreement)	Contractual commitment with consequences for missing the agreed target	99.9% uptime; 10% service credit if below
Error Budget	1 - SLO = allowable downtime/errors	99.9% SLO over 30 days = 43.2 min budget
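
The relationship between an SLI, its SLO, and the remaining error budget can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names and the event counts are hypothetical, and treating an empty window as fully healthy is an assumption you may want to change.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that were good.

    Assumption: a window with no traffic counts as fully healthy.
    """
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(sli: float, slo: float) -> float:
    """Fraction of the error budget still unspent (negative if the SLO is missed)."""
    allowed = 1.0 - slo   # error budget, e.g. 0.001 for a 99.9% SLO
    spent = 1.0 - sli     # observed bad fraction
    return 1.0 - spent / allowed


if __name__ == "__main__":
    # Hypothetical counts for one 30-day window.
    sli = availability_sli(good_events=999_500, total_events=1_000_000)
    print(f"SLI: {sli:.4%}")
    print(f"SLO met: {sli >= 0.999}")
    print(f"Budget remaining: {error_budget_remaining(sli, 0.999):.0%}")
```

The SLA does not appear in the code on purpose: it is a contract layered on top of the SLO, usually with a looser threshold than the internal objective.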

Common SLIs

Service Type	Key SLIs
Request/Response (API)	Availability (successful/total requests, e.g. 2xx/total), latency p99, error rate
Data Pipeline	Freshness (time since last successful run), correctness
Storage	Durability (data loss rate), read/write availability, latency
Batch Processing	Throughput, completion rate, success rate
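
As a rough sketch of how two of these SLIs might be computed from raw measurements, the snippet below derives a latency percentile and a pipeline freshness value. The helper names are hypothetical, and the nearest-rank percentile method is an assumption; monitoring systems often use interpolated or histogram-based estimates instead.

```python
import math
from datetime import datetime, timedelta, timezone


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty collection of numbers."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("percentile() needs at least one sample")
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def freshness_seconds(last_success: datetime, now: datetime) -> float:
    """Seconds elapsed since the last successful pipeline run."""
    return (now - last_success).total_seconds()


if __name__ == "__main__":
    latencies_ms = [12, 15, 11, 240, 14, 13, 18, 16, 12, 900]  # made-up samples
    print("latency p99 (ms):", percentile(latencies_ms, 99))

    now = datetime.now(timezone.utc)
    last_run = now - timedelta(minutes=15)  # hypothetical last successful run
    print("freshness (s):", freshness_seconds(last_run, now))
```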

Error Budget Calculation

# SLO: 99.9% availability over 30 days
Error Budget = (1 - 0.999) × 30 × 24 × 60 = 43.2 minutes

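# Current burn rate (1 = spending the budget exactly as fast as the SLO allows)
Burn Rate = Error Rate / (1 - SLO)

# Share of the budget consumed during an alert window
Budget Consumed = Burn Rate × (alert window / SLO window)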

# Alert: fast burn (last 1h burning 2% of monthly budget)
Fast Burn Alert: burn_rate > 14.4 for 1h
  → page on-call

# Alert: slow burn (last 6h burning 5% of monthly budget)
Slow Burn Alert: burn_rate > 6 for 6h
  → create ticket
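
The arithmetic above can be checked with a short Python sketch. It assumes a 30-day SLO window and reuses the 14.4 and 6 thresholds from the alerts; the function names and the `classify` helper are hypothetical and only illustrate how the two alerts might be combined.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for the SLO window."""
    return (1 - slo) * window_days * 24 * 60


def burn_rate(error_rate: float, slo: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    return error_rate / (1 - slo)


def budget_consumed(rate: float, alert_window_hours: float,
                    slo_window_days: int = 30) -> float:
    """Fraction of the total budget spent if `rate` holds for the alert window."""
    return rate * alert_window_hours / (slo_window_days * 24)


def classify(rate_1h: float, rate_6h: float) -> str:
    """Map burn rates over the 1h and 6h windows to the actions listed above."""
    if rate_1h > 14.4:
        return "page"
    if rate_6h > 6:
        return "ticket"
    return "ok"


if __name__ == "__main__":
    print(f"budget: {error_budget_minutes(0.999):.1f} min")
    print(f"fast burn spends {budget_consumed(14.4, 1):.0%} of the budget in 1h")
    print(f"slow burn spends {budget_consumed(6, 6):.0%} of the budget in 6h")
    print("action:", classify(rate_1h=20.0, rate_6h=3.0))
```

A burn rate of 14.4 sustained for one hour spends 14.4 × 1/720 = 2% of a 30-day budget, which is where the fast-burn threshold comes from.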

Availability Numbers

Availability	Downtime/Year	Downtime/Month	Downtime/Week
99% (two nines)	3.65 days	7.31 hours	1.68 hours
99.9% (three nines)	8.77 hours	43.8 min	10.1 min
99.95%	4.38 hours	21.9 min	5.04 min
99.99% (four nines)	52.6 min	4.38 min	1.01 min
99.999% (five nines)	5.26 min	26.3 sec	6.05 sec
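
The figures in this table can be reproduced with a small Python sketch. It assumes a 365.25-day year and an average month of one twelfth of that year, which is why 99.9% gives about 43.8 min per month here rather than the 43.2 min of a fixed 30-day window. The names `PERIODS`, `downtime_seconds`, and `format_duration` are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Period lengths in seconds; "month" is an average month (assumption).
PERIODS = {
    "year": SECONDS_PER_YEAR,
    "month": SECONDS_PER_YEAR / 12,
    "week": 7 * 24 * 3600,
}


def downtime_seconds(availability_pct: float, period_seconds: float) -> float:
    """Allowed downtime in seconds for a given availability percentage."""
    return (1 - availability_pct / 100) * period_seconds


def format_duration(seconds: float) -> str:
    """Render a duration in the largest unit that keeps the value >= 1."""
    for unit, size in (("days", 86400), ("hours", 3600), ("min", 60)):
        if seconds >= size:
            return f"{seconds / size:.3g} {unit}"
    return f"{seconds:.3g} sec"


if __name__ == "__main__":
    for pct in (99, 99.9, 99.95, 99.99, 99.999):
        cells = [format_duration(downtime_seconds(pct, p)) for p in PERIODS.values()]
        print(f"{pct}%", *cells, sep="\t")
```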
