
Pilot Your AI-Augmented Team

The metrics that actually matter when AI writes 70% of your code.


When Velocity Lies

When AI accelerates delivery, old benchmarks break. PRs per day go up, which looks like progress but can mask quality and skill problems underneath. You need different sensors.

+67% PRs/day (Anthropic internal, Jan 2026)
70–90% AI-assisted code (Anthropic research, Aug 2025)
DORA tiers abandoned (DORA 2025 report)

4 Metric Categories

Each one covers a blind spot the others miss. Together they give you a complete picture.

Delivery Health

DORA

The baseline. Automate these first.

Deployment Frequency: how often you ship to production
Lead Time for Changes: elapsed time from commit to production
Change Failure Rate: % of deploys causing incidents
MTTR: mean time to restore after failure

Source: DORA Research
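As a rough illustration of how the four delivery metrics could be automated, here is a minimal sketch. The `Deploy` record and its field names are assumptions made for this example, not a schema from DORA or from any particular tool; in practice the data would come from your deploy pipeline and incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, median
from typing import Optional


@dataclass
class Deploy:
    # Hypothetical record shape; adapt the fields to your own pipeline data.
    commit_at: datetime                     # when the change was committed
    deployed_at: datetime                   # when it reached production
    caused_incident: bool = False           # did this deploy trigger an incident?
    restored_at: Optional[datetime] = None  # when service was restored, if known


def dora_summary(deploys: list[Deploy], period_days: int) -> dict[str, float]:
    """Summarize the four delivery metrics over a window of `period_days`."""
    if not deploys or period_days <= 0:
        raise ValueError("need at least one deploy and a positive period")

    lead_times_h = [
        (d.deployed_at - d.commit_at).total_seconds() / 3600 for d in deploys
    ]
    failures = [d for d in deploys if d.caused_incident]
    restore_h = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time_h": median(lead_times_h),
        "change_failure_rate": len(failures) / len(deploys),
        # Reported as 0.0 when no failed deploy has a restore time yet.
        "mttr_h": mean(restore_h) if restore_h else 0.0,
    }
```

The sketch uses the median for lead time so that a single slow change does not dominate the number; that is a design choice for this example, not a requirement.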

Quality Signal

Where AI hides its mistakes.

Bug Escape Rate: bugs found in prod per sprint
PR Review Comprehension: % of PRs reviewed with genuine understanding
CI Speed P50/P90: pipeline latency at median and 90th percentile
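For the CI speed metric, a minimal sketch of the P50/P90 calculation using only the Python standard library might look like this. The function name and the choice of minutes as the unit are assumptions for illustration; the durations would come from your CI provider's run history.

```python
from statistics import median, quantiles


def ci_latency(durations_min: list[float]) -> tuple[float, float]:
    """Return (P50, P90) pipeline duration from a list of run durations."""
    if len(durations_min) < 2:
        raise ValueError("need at least two pipeline runs")
    p50 = median(durations_min)
    # quantiles(n=10) returns the 9 decile cut points; index 8 is the 90th percentile.
    p90 = quantiles(durations_min, n=10, method="inclusive")[8]
    return p50, p90
```

Watching P90 alongside P50 surfaces the slow tail of runs, which a median alone would hide.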

Product Impact

The layer most teams skip.

Time-to-Value: days from feature start to first user value
Feature Adoption (14-day): % of target users activating a feature within 2 weeks of release
CSAT on Key Features: user satisfaction score on high-investment areas
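The 14-day adoption figure above can be sketched as follows. The inputs (a set of target user IDs and a mapping from user ID to first-use date) are hypothetical shapes chosen for this example; your product analytics tool will expose this data differently.

```python
from datetime import date, timedelta


def adoption_rate(
    release: date,
    target_users: set[str],
    first_use: dict[str, date],
    window_days: int = 14,
) -> float:
    """Share of target users who first used the feature within the window."""
    if not target_users:
        return 0.0
    cutoff = release + timedelta(days=window_days)
    activated = {
        user
        for user in target_users
        if user in first_use and release <= first_use[user] <= cutoff
    }
    return len(activated) / len(target_users)
```

Users outside the target set are ignored, so early adopters the feature was not built for do not inflate the rate.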

Human Health

SPACE

What DORA doesn't see.

Developer Satisfaction: quarterly CSAT survey (5 questions, anonymous)
PR Review Time: avg hours from PR open to first review
Burnout Signals: after-hours commits, PTO utilization, qualitative check-ins

Source: SPACE Framework (ACM)
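Two of the human-health signals above lend themselves to simple automation. The sketch below assumes PR and commit timestamps are already available in the team's local time, and it treats "after hours" as weekends or anything outside 09:00 to 18:00; both the function names and that working-hours window are assumptions to adjust for your team.

```python
from datetime import datetime
from statistics import mean


def avg_hours_to_first_review(prs: list[tuple[datetime, datetime]]) -> float:
    """Average hours between PR open and first review, from (opened, first_review) pairs."""
    return mean(
        (first_review - opened).total_seconds() / 3600
        for opened, first_review in prs
    )


def after_hours_share(
    commit_times: list[datetime], start_hour: int = 9, end_hour: int = 18
) -> float:
    """Fraction of commits made on weekends or outside the assumed working hours."""
    if not commit_times:
        return 0.0

    def is_after_hours(t: datetime) -> bool:
        return t.weekday() >= 5 or not (start_hour <= t.hour < end_hour)

    return sum(is_after_hours(t) for t in commit_times) / len(commit_times)
```

A rising after-hours share is a prompt for a conversation, not a verdict; it is best read together with the qualitative check-ins.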

Start Small, Scale Right

The right metrics depend on your team size. Don't track what you can't act on.

5-person team

Deployment Frequency
Cycle Time
Time-to-Value
Bugs in prod/month
Quarterly satisfaction

Recommended tooling: GitHub Insights + spreadsheet


25-person team

All 4 DORA metrics (automated)
Cycle Time per squad
Bug Escape Rate
AI contribution %
Quarterly satisfaction

Recommended tooling: LinearB or Faros.ai


The 4-Question Test

For any metric you're considering tracking, run it through this checklist. Fewer than 3 "yes" answers? Drop it. It's noise, not signal.

1. Can you act on it in <2 weeks?
2. Does it explain WHY, not just WHAT?
3. Is it correlated to a business outcome?
4. Can it be measured automatically?

Tracking too much costs attention, the one resource AI can't augment.
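The checklist can be expressed as a small helper, shown here as a sketch. The class and field names are made up for this example; the only logic taken from the text is the threshold of at least 3 "yes" answers out of 4.

```python
from dataclasses import dataclass


@dataclass
class MetricCheck:
    # One boolean per question in the 4-Question Test; names are illustrative.
    name: str
    actionable_within_2_weeks: bool
    explains_why: bool
    tied_to_business_outcome: bool
    measured_automatically: bool

    def yes_count(self) -> int:
        return sum(
            [
                self.actionable_within_2_weeks,
                self.explains_why,
                self.tied_to_business_outcome,
                self.measured_automatically,
            ]
        )

    def worth_tracking(self) -> bool:
        # Fewer than 3 "yes" answers means the metric is dropped.
        return self.yes_count() >= 3
```

Running every candidate metric through `worth_tracking()` before adding it to a dashboard keeps the list short by default.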

Sources & References

Every data point on this page is traceable. Here are the primary sources.

DORA 2025 Report. 7 archetypes, 8 dimensions: the new measurement model
DORA Metrics Guide. Official definitions for the 4 core delivery metrics
SPACE Framework (ACM Queue). Forsgren et al., 5 dimensions of developer productivity
Anthropic Contribution Metrics. +67% PRs/day, Anthropic internal data, Jan 2026
Anthropic: How AI Transforms Work. Internal study, 70–90% AI-assisted code across teams, Aug 2025

Read the full framework in the guide

Covers implementation playbook, tooling comparison, anti-patterns, and worked examples for squads already using Claude Code daily.
