Calculator
Choose a method, set your measurement period, and include the reliability options you track.
Formula used
Availability (%) = (T − D) / T × 100
T is the denominator window, and D is effective downtime.
How to use this calculator
Example data table
A compact reference table for common SLA levels and their typical allowed downtime.
| SLA (%) | Allowed downtime per year | Allowed downtime per 30-day month | Allowed downtime per week |
|---|---|---|---|
| 99.000 | 3 days, 15 hours, 36 minutes | 7 hours, 12 minutes | 1 hour, 41 minutes |
| 99.500 | 1 day, 19 hours, 48 minutes | 3 hours, 36 minutes | 50 minutes |
| 99.900 | 8 hours, 46 minutes | 43 minutes | 10 minutes |
| 99.950 | 4 hours, 23 minutes | 22 minutes | 5 minutes |
| 99.990 | 53 minutes | 4 minutes | 1 minute |
| 99.999 | 5 minutes | 26 seconds | 6 seconds |

Example calculation:
| Window | Unplanned downtime | Planned maintenance | Maintenance excluded | Computed availability |
|---|---|---|---|---|
| 30-day month | 42 minutes (impact 100%) | 60 minutes | Yes | 99.90264% |
Availability, Uptime, and the Denominator Window
Availability is calculated from a time window T and effective downtime D. For a 30‑day month, T is typically 43,200 minutes. If your SLA excludes planned maintenance, subtract maintenance from T so you do not penalize approved change windows. This calculator shows both the total window and the denominator window, which helps align reliability reports with contract language and on‑call expectations.
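The denominator adjustment can be sketched in a few lines of Python. This is a minimal illustration, assuming availability is (T − D) / T; the function name, its parameters, and the choice to count non-excluded maintenance as downtime are assumptions for the example, not the calculator's actual implementation.

```python
def availability_pct(window_min, downtime_min, maintenance_min=0.0,
                     exclude_maintenance=True):
    """Availability in percent, assuming A = (T - D) / T."""
    if exclude_maintenance:
        # Maintenance is removed from the denominator window T.
        denominator = window_min - maintenance_min
        downtime = downtime_min
    else:
        # Assumption: maintenance that is not excluded counts as downtime.
        denominator = window_min
        downtime = downtime_min + maintenance_min
    if denominator <= 0:
        raise ValueError("denominator window must be positive")
    return 100.0 * (denominator - downtime) / denominator


total_window = 30 * 24 * 60  # 43,200 minutes in a 30-day month
for excluded in (True, False):
    pct = availability_pct(total_window, downtime_min=30, maintenance_min=120,
                           exclude_maintenance=excluded)
    print(f"maintenance excluded={excluded}: {pct:.5f}%")
```

Running both cases side by side shows how much the reported figure depends on the exclusion rule in the contract.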
Impact Weighting for Partial Outages
Not every incident is a full outage. A degraded API, regional brownout, or capacity drop can be modeled with an impact percentage. Effective downtime becomes D_effective = downtime × impact. For example, 40 minutes at 60% impact counts as 24 effective minutes. Using impact keeps the availability number honest while still reflecting customer experience when some traffic succeeds or service remains partially usable.
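A short Python sketch of impact weighting follows. The function name and the idea of summing several incidents in one window are assumptions for illustration; the source only defines the per-incident product downtime × impact.

```python
def effective_downtime(incidents):
    """Sum downtime x impact over (minutes, impact_fraction) pairs."""
    total = 0.0
    for minutes, impact in incidents:
        if minutes < 0 or not 0.0 <= impact <= 1.0:
            raise ValueError("minutes must be >= 0 and impact within [0, 1]")
        total += minutes * impact
    return total


# The example from the text: 40 minutes at 60% impact.
print(effective_downtime([(40, 0.60)]))

# Hypothetical month with one full outage and one partial brownout.
print(effective_downtime([(12, 1.0), (40, 0.60)]))
```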
Reliability Inputs: MTBF, MTTD, MTTR
When you have engineering reliability data, use the MTBF/MTTD/MTTR method. Node availability is A = MTBF / (MTBF + MTTD + MTTR). A minute removed from detection improves availability exactly as much as a minute removed from repair, because MTTD and MTTR add together in the denominator. Track MTTR for both automated recovery and human intervention, then compare modeled results to observed downtime to validate assumptions.
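The node availability formula translates directly into code. The sketch below uses a hypothetical function name and made-up input values in hours; any consistent time unit works.

```python
def node_availability(mtbf, mttd, mttr):
    """A = MTBF / (MTBF + MTTD + MTTR), all in the same time unit."""
    if mtbf <= 0 or mttd < 0 or mttr < 0:
        raise ValueError("MTBF must be positive; MTTD and MTTR non-negative")
    return mtbf / (mtbf + mttd + mttr)


# Hypothetical inputs: a failure every 720 hours on average.
baseline = node_availability(mtbf=720, mttd=0.5, mttr=1.5)
faster_detection = node_availability(mtbf=720, mttd=0.25, mttr=1.5)
faster_repair = node_availability(mtbf=720, mttd=0.5, mttr=1.25)

print(f"baseline:         {baseline:.6f}")
print(f"faster detection: {faster_detection:.6f}")
print(f"faster repair:    {faster_repair:.6f}")
```

The last two results match, since cutting 15 minutes from either detection or repair shortens the same denominator by the same amount.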
Redundancy and k‑of‑n Service Requirements
Many platforms survive single failures through redundancy. The k‑of‑n model estimates system availability when at least k of n identical components, each with availability A_node, must be healthy. Active‑active pairs commonly map to k=1, n=2, while quorum systems may be k=2, n=3. If A_node is 99.0%, a k=1, n=2 pool yields about 99.99% availability, because the service is down only when both components fail at the same time (1 − 0.01² = 0.9999), assuming failures are independent.
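One common way to compute k‑of‑n availability is a binomial sum over the number of healthy components. The sketch below assumes identical components with independent failures; the function name is hypothetical.

```python
from math import comb


def k_of_n_availability(a_node, k, n):
    """Probability that at least k of n independent, identical nodes are up."""
    if not 0.0 <= a_node <= 1.0:
        raise ValueError("a_node must be within [0, 1]")
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    return sum(
        comb(n, up) * a_node**up * (1.0 - a_node) ** (n - up)
        for up in range(k, n + 1)
    )


print(f"1-of-2: {k_of_n_availability(0.99, 1, 2):.6f}")  # active-active pair
print(f"2-of-3: {k_of_n_availability(0.99, 2, 3):.6f}")  # quorum of three
```

Correlated failures (a shared power feed, a bad deploy to every node) break the independence assumption, so treat the result as an upper bound in those cases.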
SLA Targets and Error Budgets for Engineering Decisions
SLA targets convert quickly into an “error budget” that guides release velocity and risk. Allowed downtime is (1 − SLA) × T, and remaining budget is allowed minus effective downtime. If a team consumes budget early in a month, freeze risky changes, reduce blast radius, and improve monitoring. If budget stays healthy, you can ship faster while still meeting contractual expectations. Use the downtime equivalents table to translate a percentage into minutes per week, month, and year for stakeholder communication and planning.
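The budget arithmetic above can be sketched as follows. The function name and the sample numbers are illustrative assumptions; a negative remaining value means the budget is overspent.

```python
def error_budget(sla_pct, window_min, effective_downtime_min):
    """Return (allowed, remaining) downtime in minutes for one window."""
    if not 0.0 <= sla_pct <= 100.0:
        raise ValueError("sla_pct must be within [0, 100]")
    allowed = (1.0 - sla_pct / 100.0) * window_min
    remaining = allowed - effective_downtime_min
    return allowed, remaining


# 99.9% target over a 30-day month with 24 effective minutes already used.
allowed, remaining = error_budget(99.9, 30 * 24 * 60, 24)
print(f"allowed: {allowed:.1f} min, remaining: {remaining:.1f} min")
```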
FAQs
Uptime is time the system is running. Availability is the fraction of the agreed window where users can successfully use the service, including partial outages and SLA exclusions.
Count planned maintenance as downtime only if your contract or SLO includes it. If maintenance is excluded, subtract planned minutes from the denominator so change windows do not reduce the reported availability.
Use the downtime method when you have measured outage minutes from monitoring or incident reviews. Use the reliability method when you want a forward-looking estimate from MTBF, detection time, and repair time, especially during design or capacity planning.
Impact scales downtime to represent degraded service. A 50% impact means each minute counts as half a minute of effective downtime, which changes both availability and remaining error budget.
The "nines" value converts availability into a shorthand using −log10(1−A), so A = 0.999 gives 3 nines. It is approximate because it ignores traffic shape, correlated failures, and distribution tails, but it helps compare systems with different uptime percentages quickly.
Multi-region and multi-node deployments can be modeled with k-of-n: set n to the number of regions or nodes and k to the minimum that must be healthy to serve traffic. For asymmetric regions, run scenarios per region and compare results using the export outputs.
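The −log10(1−A) shorthand mentioned in the FAQs, commonly read as the number of "nines", can be computed as in this minimal sketch. The function name is hypothetical, and A is taken as a fraction rather than a percentage.

```python
from math import log10


def nines(availability_fraction):
    """Return -log10(1 - A) for an availability A in [0, 1)."""
    if not 0.0 <= availability_fraction < 1.0:
        raise ValueError("availability must be within [0, 1)")
    return -log10(1.0 - availability_fraction)


for a in (0.99, 0.999, 0.9995, 0.99999):
    print(f"{a * 100:.3f}% -> {nines(a):.2f} nines")
```

Fractional results are expected: 99.95% sits between three and four nines.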