What It Does
DORA Metrics are four quantitative measures of software delivery performance derived from the DevOps Research and Assessment (DORA) program, now part of Google Cloud. The program has surveyed 33,000+ practitioners over 10+ years, establishing statistical correlations between these metrics and organizational outcomes (revenue, customer satisfaction, profitability).
The four metrics:
- Deployment Frequency: How often an organization successfully releases to production (Elite: multiple times per day; Low: less than once per six months).
- Lead Time for Changes: Time from a code commit to that commit running in production (Elite: less than one hour; Low: six months to one year).
- Change Failure Rate: Percentage of deployments causing a service impairment or requiring rollback (Elite: 0–5%; Low: 46–60%).
- Time to Restore Service (MTTR): How quickly a service can be restored after an incident (Elite: less than one hour; Low: one to six months).
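As a rough illustration, all four metrics can be computed from deployment and incident timestamps. The records and field names below are invented for the sketch; real data would come from your CI/CD and incident-management tools.

```python
from datetime import datetime, timedelta

# Hypothetical deployment and incident records; field names are invented.
deploys = [
    {"commit_at": "2024-05-01T09:00", "deployed_at": "2024-05-01T10:30", "failed": False},
    {"commit_at": "2024-05-02T11:00", "deployed_at": "2024-05-02T11:45", "failed": True},
    {"commit_at": "2024-05-03T08:00", "deployed_at": "2024-05-03T09:00", "failed": False},
    {"commit_at": "2024-05-03T13:00", "deployed_at": "2024-05-03T13:40", "failed": False},
]
incidents = [{"detected_at": "2024-05-02T12:00", "resolved_at": "2024-05-02T13:30"}]

ts = datetime.fromisoformat

# Deployment Frequency: deploys per day over the observed window.
window_days = max((ts(deploys[-1]["deployed_at"]) - ts(deploys[0]["deployed_at"])).days, 1)
deployment_frequency = len(deploys) / window_days

# Lead Time for Changes: median commit-to-production time.
lead_times = sorted(ts(d["deployed_at"]) - ts(d["commit_at"]) for d in deploys)
median_lead_time = lead_times[len(lead_times) // 2]

# Change Failure Rate: share of deploys causing impairment or rollback.
change_failure_rate = sum(d["failed"] for d in deploys) / len(deploys)

# Time to Restore Service: mean detection-to-resolution time.
restore_times = [ts(i["resolved_at"]) - ts(i["detected_at"]) for i in incidents]
mean_time_to_restore = sum(restore_times, timedelta()) / len(restore_times)

print(deployment_frequency)   # 2.0 deploys/day
print(median_lead_time)       # 1:00:00
print(change_failure_rate)    # 0.25
print(mean_time_to_restore)   # 1:30:00
```

Note the choice baked into `commit_at`: which timestamp starts the lead-time clock is exactly the measurement ambiguity discussed under Notes & Caveats.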
A fifth metric, Reliability (the degree to which teams meet their SLO targets), was added in 2021. The SPACE framework (2021, from Nicole Forsgren and others) extends DORA with satisfaction, performance, activity, communication, and efficiency dimensions, providing a more holistic picture of developer productivity.
Key Features
- Evidence-based benchmarking: Teams can compare their metrics against DORA performance tiers (Elite, High, Medium, Low) to identify improvement areas relative to industry peers.
- Outcome correlation: The 2019 State of DevOps Report found that Elite performers deploy 208x more frequently, have 106x faster lead time and 7x lower change failure rate, and recover from incidents 2604x faster than Low performers.
- Toolchain-agnostic measurement: Metrics can be derived from any CI/CD system, incident management tool, or version control system — not tied to any specific vendor.
- Actionable directives: Each metric maps to specific technical practices (continuous integration, trunk-based development, feature flags, automated testing, incident management) that demonstrably improve the metric.
- SPACE framework extension: Adds qualitative dimensions (developer satisfaction, activity proxies) beyond pure delivery throughput to capture developer experience.
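Toolchain-agnostic collection usually means normalizing events from different systems into one common deployment schema before computing any metric. A minimal sketch, with payload shapes invented for illustration (they do not match any real vendor's webhook format):

```python
# Invented payloads standing in for two different CI systems' deploy events;
# these do NOT match any real vendor's webhook schema.
system_a_event = {"sha": "abc123", "finished_at": "2024-05-01T10:30:00", "conclusion": "success"}
system_b_event = {"commit_id": "def456", "deploy_time": "2024-05-02T11:45:00", "state": "failed"}

COMMON_FIELDS = {"commit", "deployed_at", "failed"}

def from_system_a(e: dict) -> dict:
    # Map system A's field names onto the common schema.
    return {"commit": e["sha"], "deployed_at": e["finished_at"],
            "failed": e["conclusion"] != "success"}

def from_system_b(e: dict) -> dict:
    # Map system B's field names onto the same schema.
    return {"commit": e["commit_id"], "deployed_at": e["deploy_time"],
            "failed": e["state"] == "failed"}

# Downstream metric code only ever sees the common schema.
deployments = [from_system_a(system_a_event), from_system_b(system_b_event)]
assert all(set(d) == COMMON_FIELDS for d in deployments)
```

Commercial DORA tools perform essentially this normalization across many integrations; the trade-off is the vendor lock-in noted under Notes & Caveats.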
Use Cases
- Engineering leadership benchmarking: CTOs and VPs Engineering using DORA tiers to set quarterly improvement targets and track progress toward Elite performance.
- Platform team ROI justification: Platform engineering teams demonstrating that investing in CI/CD automation, GitOps, and internal tooling improves deployment frequency and lead time — justifying headcount.
- Post-incident analysis framing: SRE and reliability teams using Time to Restore as a structured metric for incident retrospectives and on-call tooling investment.
- Acquisition or due diligence: Engineering due diligence processes using DORA metrics as a proxy for delivery organization health.
Adoption Level Analysis
Small teams (<20 engineers): Low-to-moderate fit. DORA metrics are most meaningful with enough deployment volume to separate signal from noise; a 5-person team deploying weekly generates too few data points. Manual tracking in a spreadsheet is often sufficient at this scale. Focus on deployment frequency and lead time; defer MTTR until you have formal incident management.
Medium orgs (20–200 engineers): Good fit. Multiple teams with regular deployments produce meaningful data. Integration with CI/CD and incident management tools (PagerDuty, Jira) enables automated tracking. Tools like LinearB, Swarmia, Jellyfish, or Harness SEI can automate collection. Risk: treating the metric as the goal rather than the outcome (Goodhart’s Law).
Enterprise (200+ engineers): Strong fit. Enterprises with engineering leadership accountability structures use DORA metrics for team-level performance reviews, portfolio investment decisions, and acquisition integration benchmarks. At this scale, automated tooling (Harness SEI, LinearB, Jellyfish) is necessary — manual collection is impractical.
Alternatives
| Alternative | Key Difference | Prefer when… |
|---|---|---|
| SPACE Framework | Broader developer productivity view including satisfaction and communication | Want a richer picture beyond delivery throughput; Microsoft/GitHub research-backed |
| Flow Framework (Mik Kersten) | Business value flow view: features, defects, risk, debt percentages | Connecting engineering metrics to business outcomes at product portfolio level |
| Accelerate Book metrics | Same as DORA but original academic framing (Forsgren/Humble/Kim) | Academic or research context; prefer the book-form framework |
| Custom engineering dashboards | Bespoke metrics tailored to org-specific goals | Standard metrics do not capture the organization’s unique constraints |
Evidence & Sources
- DORA: Official site and annual State of DevOps Report (dora.dev)
- 2025 DORA Report: State of AI-assisted Software Development (Google Cloud)
- Accelerate: The Science of Lean Software and DevOps (Nicole Forsgren, Jez Humble, Gene Kim, 2018)
- SPACE Framework: The SPACE of Developer Productivity (ACM Queue, 2021)
Notes & Caveats
- Goodhart’s Law risk: When DORA metrics become management targets, teams optimize the metric rather than the underlying practice. Deployment frequency can be gamed by splitting trivial commits; MTTR can be gamed by closing incidents prematurely. Metrics must be paired with qualitative review.
- Lead time measurement ambiguity: Different tools measure lead time differently — first commit, PR open, PR merge, or deploy trigger. Without a consistent definition across teams, benchmarks are incomparable.
- 2025 DORA Report AI finding: The 2025 DORA Report found that AI coding tools amplify existing team capability — strong teams benefit, struggling teams deteriorate further. This suggests DORA metrics alone are insufficient for assessing AI tooling ROI.
- Tooling vendor lock-in: Commercial DORA tools (Harness SEI, LinearB, Jellyfish, Swarmia) collect and normalize metrics across toolchains but build proprietary data warehouses. Migrating between vendors requires re-connecting integrations and loses historical data.
- MTTR vs MTTD distinction: Time to Restore measures from incident detection to resolution. Many teams conflate MTTR with Mean Time to Detection (MTTD). MTTD is often longer and more actionable for observability investment decisions but is not a core DORA metric.
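To make the lead-time measurement ambiguity concrete, here is the same change measured from three common start points (all timestamps hypothetical):

```python
from datetime import datetime

# One change, with a hypothetical timestamp for each candidate "start of lead time".
change = {
    "first_commit": "2024-05-01T09:00:00",
    "pr_opened":    "2024-05-02T14:00:00",
    "pr_merged":    "2024-05-03T10:00:00",
    "deployed":     "2024-05-03T16:00:00",
}

deployed = datetime.fromisoformat(change["deployed"])
for start in ("first_commit", "pr_opened", "pr_merged"):
    print(start, deployed - datetime.fromisoformat(change[start]))
# first_commit 2 days, 7:00:00
# pr_opened 1 day, 2:00:00
# pr_merged 6:00:00
```

The same deployment reads as a two-day or a six-hour lead time depending on the definition, which is why one org-wide definition must be fixed before any cross-team benchmarking.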