
Development Methodologies Reference

Confidence: Tier 2 — Validated by multiple production reports and official documentation.

Last updated: February 2026

This is a quick reference for 15 structured development methodologies that have emerged for AI-assisted development in 2025-2026. For hands-on practical workflows, see workflows/.


  1. Decision Tree
  2. The 15 Methodologies
  3. SDD Tools Reference
  4. Writing Effective Specs
  5. Combination Patterns
  6. Sources

┌─ "I want quality code" ────────────→ workflows/tdd-with-claude.md
├─ "I want to spec before code" ─────→ workflows/spec-first.md
├─ "I need to plan architecture" ────→ workflows/plan-driven.md
├─ "I'm iterating on something" ─────→ workflows/iterative-refinement.md
└─ "I need methodology theory" ──────→ Continue reading below

Organized in a 6-tier pyramid from strategic orchestration down to optimization techniques.

| Name | What | Best For | Claude Fit |
| --- | --- | --- | --- |
| BMAD | Multi-agent governance with constitution as guardrail | Enterprise 10+ teams, long-term projects | ⭐⭐ Niche but powerful |
| GSD | Meta-prompting 6-phase workflow with fresh contexts per task | Solo devs, Claude Code CLI | ⭐⭐ Similar to patterns in guide |

BMAD (Breakthrough Method for Agile AI-Driven Development) inverts the traditional paradigm: documentation becomes the source of truth, not code. Uses specialized agents (Analyst, PM, Architect, Developer, QA) orchestrated with strict governance. Note: BMAD’s role-based agent naming reflects their methodology; see §9.17 Agent Anti-Patterns for scope-focused alternatives.

  • Key concept: Constitution.md as strategic guardrail
  • When to use: Complex enterprise projects needing governance
  • When to avoid: Small teams, MVPs, rapid prototyping

GSD (Get Shit Done) addresses context rot through systematic 6-phase workflow (Initialize → Discuss → Plan → Execute → Verify → Complete) with fresh 200k-token contexts per task. Core concepts (multi-agent orchestration, fresh context management) overlap significantly with existing patterns like Ralph Loop, Gas Town, and BMAD. See resource evaluation for detailed comparison.

Emerging: Ralph Inferno implements autonomous multi-persona workflows (Analyst→PM→UX→Architect→Business) with VM-based execution and self-correcting E2E loops. Experimental but interesting for “vibe coding at scale”.


Foundational Discipline: Plan-First Workflow


“Once the plan is good, the code is good.” — Boris Cherny, creator of Claude Code

Not just a feature (/plan command) — a systematic discipline.

Context Engineering: Thoughtworks designates this broader approach “Context Engineering” in their Technology Radar (Nov 2025)¹ — the systematic design of information provided to LLMs during inference. Three core techniques: context setup (minimal system prompts, few-shot examples), context management for long-horizon tasks (summarization, external memories, sub-agent architectures), and dynamic information retrieval (JIT context loading). Related patterns in Claude Code: AGENTS.md, MCP Context7, Plan Mode.

The Mental Model:

Planning isn’t optional for complex tasks. It’s the difference between:

  • ❌ 8 iterations of “try → fix → retry → fix again”
  • ✅ 1 iteration of “plan → validate → execute cleanly”

When to plan first:

| Task Complexity | Plan First? | Why |
| --- | --- | --- |
| >3 files modified | ✅ Yes | Cross-file dependencies need architecture |
| >50 lines changed | ✅ Yes | Enough complexity for mistakes |
| Architectural changes | ✅ Yes | Impact analysis required |
| Unfamiliar codebase | ✅ Yes | Need exploration before action |
| Typo/obvious fix | ❌ No | Planning overhead > task time |
| Single-line change | ❌ No | Just do it |

How plan-first works:

  1. Exploration phase (Plan Mode via Shift+Tab):

    • Claude reads files, explores architecture
    • No edits allowed → forces thinking before action
    • Proposes approach with trade-offs
  2. Validation phase (you review):

    • Plan exposes assumptions and gaps
    • Easier to correct direction now vs after 100 lines written
    • Plan becomes contract for execution
  3. Execution phase (toggle back to Normal Mode with Shift+Tab):

    • Plan → code becomes mechanical translation
    • Fewer surprises, cleaner implementation
    • Faster overall despite “slower” start

Boris Cherny workflow:

“I run many sessions, start in plan mode, then switch into execution once the plan looks right. The signature upgrade is verification—giving Claude a way to test and confirm its own output.”

Benefits over “just start coding”:

  • Fewer correction iterations: Plan catches issues before they become code
  • Better architecture: Forced to think about structure first
  • Clearer communication: Plan is shared understanding with team/Claude
  • Reduced cost: One clean iteration < multiple messy iterations (even if plan phase costs tokens)

Integration with CLAUDE.md:

Document your team’s plan-first triggers:

```md
## Planning Policy

- ALWAYS plan first: API changes, database migrations, new features
- OPTIONAL planning: Bug fixes <10 lines, test additions
- NEVER skip: Changes affecting >2 modules
```

See also: Plan Mode documentation for /plan command usage.

Advanced pattern: For an iterative annotation-based approach to plan-driven development, see Custom Markdown Plans (Boris Tane Pattern).


| Name | What | Best For | Claude Fit |
| --- | --- | --- | --- |
| SDD | Specs before code | APIs, contracts | ⭐⭐⭐ Core pattern |
| Doc-Driven | Docs = source of truth | Cross-team alignment | ⭐⭐⭐ CLAUDE.md native |
| Req-Driven | Rich artifact context (20+ artifacts) | Complex requirements | ⭐⭐ Heavy setup |
| DDD | Domain language first | Business logic | ⭐⭐ Design-time |

SDD (Spec-Driven Development) — Specifications BEFORE code. One well-structured iteration equals 8 unstructured ones. CLAUDE.md IS your spec file.

Doc-Driven Development — Living documentation versioned in git becomes the single source of truth. Changes to specs trigger implementation.

Requirements-Driven Development — Uses CLAUDE.md as comprehensive implementation guide with 20+ structured artifacts.

DDD (Domain-Driven Design) — Aligns software with business language through:

  • Ubiquitous Language: Shared vocabulary in code
  • Bounded Contexts: Isolated domain boundaries
  • Domain Distillation: Core vs Support vs Generic domains
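As a minimal sketch of the Ubiquitous Language idea, the code below speaks the business vocabulary directly (the Ordering context, `Product`, `Order`, and `OutOfStockError` are hypothetical illustrations, not part of any project described here):

```python
from dataclasses import dataclass

# Names mirror the business language ("Order", "place", "out of stock")
# rather than technical jargon -- this is the Ubiquitous Language at work.
@dataclass(frozen=True)
class Product:
    sku: str
    stock: int

class OutOfStockError(Exception):
    """Raised inside the Ordering bounded context when stock is exhausted."""

class Order:
    def __init__(self) -> None:
        self.lines: list[str] = []

    def place(self, product: Product) -> None:
        # Business rule expressed in domain terms, isolated in this context.
        if product.stock <= 0:
            raise OutOfStockError(f"{product.sku} is out of stock")
        self.lines.append(product.sku)
```

Because the vocabulary matches what domain experts say, a spec like “a customer cannot place an order for an out-of-stock product” maps one-to-one onto the code.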

| Name | What | Best For | Claude Fit |
| --- | --- | --- | --- |
| BDD | Given-When-Then scenarios | Stakeholder collaboration | ⭐⭐⭐ Tests & specs |
| ATDD | Acceptance criteria first | Compliance, regulated | ⭐⭐ Process-heavy |
| CDD | API contracts as interface | Microservices | ⭐⭐⭐ OpenAPI native |

BDD (Behavior-Driven Development) — Beyond testing: a collaboration process.

  1. Discovery: Involve devs and business experts
  2. Formulation: Write Given-When-Then examples
  3. Automation: Convert to executable tests (Gherkin/Cucumber)

```gherkin
Feature: Order Management
  Scenario: Cannot buy without stock
    Given product with 0 stock
    When customer attempts purchase
    Then system refuses with error message
```

ATDD (Acceptance Test-Driven Development) — Acceptance criteria defined BEFORE coding, collaboratively (“Three Amigos”: Business, Dev, Test).

In agentic development, ATDD is particularly effective because agents need unambiguous success conditions. The flow maps cleanly to agent tasks:

  1. Define acceptance criteria in Gherkin (human-readable, machine-executable)
  2. Agent writes failing tests based on scenarios (not implementation)
  3. Agent implements until tests pass

```gherkin
Feature: Password Reset
  Scenario: User resets via email
    Given a registered user with email "user@example.com"
    When they request a password reset
    Then they receive a reset email within 60 seconds
    And the reset link expires after 24 hours
```

This Gherkin scenario is the contract between intent and implementation. The agent cannot misinterpret scope because done is defined before a line of code is written.

Applied to agents: Pass the Gherkin file to Claude Code before implementing. “Write failing tests for this feature file, then implement until they pass.” The scenario writer role (human or agent) forces explicit scope before execution starts.

CDD (Contract-Driven Development) — API contracts (OpenAPI specs) as executable interface between teams. Patterns: Contract as Test, Contract as Stub.


| Name | What | Best For | Claude Fit |
| --- | --- | --- | --- |
| FDD | Feature-by-feature delivery | Large teams 10+ | ⭐⭐ Structure |
| Context Eng. | Context as first-class design | Long sessions | ⭐⭐⭐ Fundamental |

FDD (Feature-Driven Development) — Five processes:

  1. Develop Overall Model
  2. Build Features List
  3. Plan by Feature
  4. Design by Feature
  5. Build by Feature

Strict iteration: 2 weeks max per feature.

Context Engineering — Treat context as design element:

  • Progressive Disclosure: Let agent discover incrementally
  • Memory Management: Conversation vs persistent memory
  • Dynamic Refresh: Rewrite TODO list before response

| Name | What | Best For | Claude Fit |
| --- | --- | --- | --- |
| TDD | Red-Green-Refactor | Quality code | ⭐⭐⭐ Core workflow |
| Eval-Driven | Evals for LLM outputs | AI products | ⭐⭐⭐ Agents |
| Multi-Agent | Orchestrate sub-agents | Complex tasks | ⭐⭐⭐ Task tool |

TDD (Test-Driven Development) — The classic cycle:

  1. Red: Write failing test
  2. Green: Minimal code to pass
  3. Refactor: Clean up, tests stay green

With Claude: Be explicit. “Write FAILING tests that don’t exist yet.”
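The cycle can be sketched end to end in a few lines; `slugify` and its test are hypothetical illustrations invented for this example:

```python
import re

# Step 1 (Red): write the failing test first -- slugify() does not exist yet.
def test_slugify() -> None:
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  Already--slugged  ") == "already-slugged"

# Step 2 (Green): minimal implementation that makes the test pass.
def slugify(text: str) -> str:
    text = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return text.strip("-")

# Step 3 (Refactor): clean up while the test stays green.
test_slugify()
```

The same three steps apply when Claude drives the loop: the failing test is the contract, and the implementation only has to satisfy it.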

Verification Loops — A formalized pattern for autonomous iteration (broader than TDD):

Core principle: Give Claude a mechanism to verify its own output.

Code generated → Verification tool → Feedback loop → Improvement

Why it works (Boris Cherny): “An agent that can ‘see’ what it has done produces better results.”

Verification mechanisms by domain:

| Domain | Verification Tool | What Claude “Sees” |
| --- | --- | --- |
| Frontend | Browser preview (live reload) | Visual rendering, layout, interactions |
| Backend | Tests (unit/integration) | Pass/fail status, error messages |
| Types | TypeScript compiler | Type errors, incompatibilities |
| Style | Linters (ESLint, Prettier) | Style violations, formatting issues |
| Performance | Profilers, benchmarks | Execution time, memory usage |
| Accessibility | axe-core, screen readers | WCAG violations, navigation issues |
| Security | Static analyzers (Semgrep) | Vulnerability patterns |
| UX | User testing, recordings | Usability problems, confusion points |

TDD as canonical example:

  1. Claude writes tests for the feature
  2. Claude iterates code until tests pass
  3. Continue until explicit completion criteria met

Official guidance: “Tell Claude to keep going until all tests pass. It will usually take a few iterations.” (Anthropic Best Practices)

Implementation patterns:

  • Hooks: PostToolUse hook runs verification after each edit
  • Browser extension: Claude in Chrome sees rendered output
  • Test watchers: Jest/Vitest watch mode provides instant feedback
  • CI/CD gates: GitHub Actions runs full validation suite
  • Multi-Claude verification: One Claude codes, another reviews

Anti-pattern: Blind iteration without feedback. Without verification mechanism, Claude can’t converge toward correct solution—it guesses.
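The generate → verify → feedback pipeline can be sketched generically; `generate` and `verify` are stand-ins for real components (an LLM call and a test runner, say), not actual APIs:

```python
from typing import Callable

def verification_loop(
    generate: Callable[[str], str],      # stand-in for an LLM call
    verify: Callable[[str], list[str]],  # returns a list of failure messages
    task: str,
    max_iterations: int = 5,
) -> str:
    """Iterate generate -> verify -> feed failures back until verification passes."""
    feedback = ""
    for _ in range(max_iterations):
        output = generate(task + feedback)
        failures = verify(output)
        if not failures:
            return output  # converged: the verifier reports no issues
        # This is what lets the agent "see" its own output: concrete failures,
        # not blind retries.
        feedback = "\nFix these failures:\n" + "\n".join(failures)
    raise RuntimeError("did not converge within iteration budget")
```

Swapping `verify` for a test suite, a type checker, or a linter gives each row of the table above, with no change to the loop itself.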

Eval-Driven Development — TDD for LLMs. Test agent behaviors via evals:

  • Code-based: output == golden_answer
  • LLM-based: Another Claude evaluates
  • Human grading: Reference, slow

Eval Harness — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.

See Anthropic’s comprehensive guide: Demystifying Evals for AI Agents
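A minimal code-based eval, with the simplest possible harness around it, might look like this sketch (`run_agent` is a stand-in for a real agent call, and the case format is hypothetical):

```python
def run_eval(run_agent, cases: list[dict]) -> float:
    """Run each case, grade by exact match against the golden answer,
    and aggregate a pass rate -- a code-based eval in miniature."""
    passed = sum(
        1 for case in cases
        if run_agent(case["input"]) == case["golden_answer"]  # code-based grading
    )
    return passed / len(cases)
```

A real harness adds concurrency, step recording, and LLM-based or human grading on top, but the shape — run, grade, aggregate — stays the same.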

Multi-Agent Orchestration — From single assistant to orchestrated team:

Meta-Agent (Orchestrator)
├── Analyst (requirements)
├── Architect (design)
├── Developer (code)
└── Reviewer (validation)

Pattern: Write plain English ADRs → Feed to implement-adr skill → Execute natively

Architecture Decision Records (ADRs) combined with Claude Code skills create a workflow where architectural decisions drive implementation directly.

Workflow Steps:

  1. Document decision in ADR format (context, decision, consequences)
  2. Create implementation skill (generic or implement-adr specialized)
  3. Feed ADR as prompt to skill with clear acceptance criteria
  4. Claude executes based on architectural guidance in ADR

Example ADR Template:

```md
# ADR-001: Database Migration Strategy

## Context
Legacy MySQL schema needs migration to PostgreSQL for better JSON support.

## Decision
Use incremental dual-write pattern with feature flags.

## Consequences
- Positive: Zero-downtime migration
- Negative: Temporary code complexity during transition
```

Implementation Workflow:

```sh
# 1. Write ADR (plain English)
vim docs/adr/001-database-migration.md

# 2. Feed to implementation skill
/implement-adr docs/adr/001-database-migration.md

# 3. Claude executes based on ADR guidance
#    → Creates migration scripts
#    → Updates ORM configuration
#    → Adds feature flags
#    → Implements dual-write logic
```

Benefits:

  • Documentation-driven: Architecture and code stay synchronized
  • Native execution: No external frameworks needed
  • Traceable decisions: Clear audit trail from decision to implementation
  • Team alignment: ADRs communicate intent to both humans and AI

Source: Gur Sannikov embedded engineering workflow


| Name | What | Best For | Claude Fit |
| --- | --- | --- | --- |
| Iterative Loops | Autonomous refinement | Optimization | ⭐⭐⭐ Core |
| Fresh Context | Reset per task, state in files | Long autonomous sessions | ⭐⭐⭐ Power users |
| Prompt Engineering | Technique foundation | Everything | ⭐⭐⭐ Prerequisite |

Iterative Refinement Loops — Autonomous convergence:

  1. Execute prompt
  2. Observe result
  3. If result ≠ “DONE” → refine and repeat
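The three steps above can be sketched with a completion sentinel; `execute` is a stand-in for running a prompt, and the “DONE” convention is illustrative:

```python
def refine_until_done(execute, task: str, max_rounds: int = 10) -> str:
    """Execute -> observe -> refine until the result ends with the DONE sentinel."""
    prompt = task
    for _ in range(max_rounds):
        result = execute(prompt)
        if result.rstrip().endswith("DONE"):
            return result  # convergence signal observed
        # Feed the observed result back as context for the next attempt.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{result}\n"
            "Not complete. Refine and continue."
        )
    raise RuntimeError("no convergence within round budget")
```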

Prompt Engineering — Foundations for ALL Claude usage:

  • Zero-Shot Chain of Thought: “Think step by step”
  • Few-Shot Learning: 2-3 examples of expected pattern
  • Structured Prompts: XML tags for organization
  • Position Matters: For long docs, place question at end
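The four techniques combine naturally in a single prompt builder; this is a hedged sketch, and the tag names (`<document>`, `<example>`, `<question>`) are illustrative choices, not a required schema:

```python
def build_prompt(document: str, examples: list[tuple[str, str]], question: str) -> str:
    """Combine structured XML tags, few-shot examples, a chain-of-thought
    trigger, and question-last positioning for long documents."""
    shots = "\n".join(
        f"<example><input>{inp}</input><output>{out}</output></example>"
        for inp, out in examples  # few-shot: 2-3 examples of the expected pattern
    )
    return (
        f"<document>\n{document}\n</document>\n"
        f"<examples>\n{shots}\n</examples>\n"
        "Think step by step.\n"              # zero-shot chain of thought
        f"<question>{question}</question>"   # position matters: question at the end
    )
```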

Fresh Context Pattern (Ralph Loop) — Solves context rot by spawning fresh agent instances per task. State persists in git + progress files, not chat history. Ideal for long autonomous sessions (migrations, overnight runs). See Ultimate Guide - Fresh Context Pattern for implementation.


Three tools have emerged to formalize Spec-Driven Development:

| Tool | Use Case | Official Docs | Claude Integration |
| --- | --- | --- | --- |
| Spec Kit | Greenfield, governance | github.blog/spec-kit | /speckit.constitution, /speckit.specify, /speckit.plan |
| OpenSpec | Brownfield, changes | github.com/Fission-AI/OpenSpec | /openspec:proposal, /openspec:apply, /openspec:archive |
| Specmatic | API contract testing | specmatic.io | MCP agent available |
| Spec-to-Code Factory | Greenfield, tooled enforcement | github.com/SylvainChabaud/spec-to-code-factory | Multi-agent reference implementation (BREAK→MODEL→ACT→DEBRIEF) |

5-phase workflow:

  1. Constitution: /speckit.constitution → guardrails
  2. Specify: /speckit.specify → requirements
  3. Plan: /speckit.plan → architecture
  4. Tasks: /speckit.tasks → decomposition
  5. Implement: /speckit.implement → code

Two-folder architecture:

openspec/
├── specs/ ← Current truth (stable)
└── changes/ ← Proposals (temporary)

Workflow: Proposal → Review → Apply → Archive

  • Contract as Test: Auto-generates 1000s of tests from OpenAPI spec
  • Contract as Stub: Mock server for parallel development
  • Backward Compatibility: Detects breaking changes
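The Contract as Test idea can be sketched without any specific tool: derive checks from the contract and run them against real responses. The contract fragment below is a deliberately simplified, hypothetical stand-in, not Specmatic’s actual format or OpenAPI syntax:

```python
# Hypothetical, simplified contract fragment for a single endpoint.
contract = {
    "path": "/orders/{id}",
    "response": {"status": 200, "schema": {"id": "integer", "total": "number"}},
}

def check_against_contract(response: dict) -> list[str]:
    """Return a list of contract violations found in a response body."""
    errors = []
    for field, kind in contract["response"]["schema"].items():
        value = response.get(field)
        ok = isinstance(value, int) if kind == "integer" else isinstance(value, (int, float))
        if not ok:
            errors.append(f"{field}: expected {kind}, got {value!r}")
    return errors
```

Real tools generate thousands of such checks from a full OpenAPI spec; the principle — the contract, not the code, defines correctness — is the same.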

Based on analysis of 2,500+ agent configuration files. Source: Addy Osmani

| Component | What to Include | Example |
| --- | --- | --- |
| Commands | Executable with flags | npm test -- --coverage |
| Testing | Framework, coverage, locations | vitest, 80%, tests/ |
| Project structure | Explicit directories | src/, lib/, tests/ |
| Code style | One example > paragraphs | Show a real function |
| Git workflow | Branch, commit, PR format | feat/name, conventional commits |
| Boundaries | Permission tiers | See below |

| Tier | Symbol | Use For |
| --- | --- | --- |
| Always do | ✅ | Safe actions, no approval (lint, format) |
| Ask first | ⚠️ | High-impact changes (delete, publish) |
| Never do | 🚫 | Hard stops (commit secrets, force push main) |

⚠️ Research shows that the more instructions a spec contains, the worse the adherence to each individual one.

Solution: Feed only relevant spec sections per task, not the entire document.
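One way to implement this is a small section filter that selects only the spec sections matching the current task’s keywords; the helper below is a hypothetical sketch, not part of any tool named here:

```python
def relevant_sections(spec_lines: list[str], keywords: list[str]) -> list[str]:
    """Keep only the '## '-headed sections whose heading mentions a task keyword,
    so each task sees a small slice of the spec instead of the whole document."""
    keep: list[str] = []
    current: list[str] = []
    wanted = False
    for line in spec_lines + ["## <end>"]:  # sentinel flushes the last section
        if line.startswith("## "):
            if wanted:
                keep.extend(current)
            current = [line]
            wanted = any(k.lower() in line.lower() for k in keywords)
        else:
            current.append(line)
    return keep
```

For larger projects the same idea scales up to routing whole domains to dedicated sub-agents rather than slicing one file.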

| Project Size | Approach |
| --- | --- |
| Small (<10 files) | Single spec file |
| Medium (10-50 files) | Sectioned spec, feed per task |
| Large (50+ files) | Sub-agent routing by domain |

Recommended stacks by situation:

| Situation | Recommended Stack | Notes |
| --- | --- | --- |
| Solo MVP | SDD + TDD | Minimal overhead, quality focus |
| Team 5-10, greenfield | Spec Kit + TDD + BDD | Governance + quality + collaboration |
| Microservices | CDD + Specmatic | Contract-first, parallel dev |
| Existing SaaS (100+ features) | OpenSpec + BDD | Change tracking, no spec drift |
| Enterprise 10+ | BMAD + Spec Kit + Specmatic | Full governance + contracts |
| LLM-native product | Eval-Driven + Multi-Agent | Self-improving systems |

| Methodology | Level | Primary Focus | Team Size | Learning Curve |
| --- | --- | --- | --- | --- |
| BMAD | Orchestration | Governance | 10+ | High |
| SDD | Specification | Contracts | Any | Medium |
| Doc-Driven | Specification | Alignment | Any | Low |
| Req-Driven | Specification | Context | 5+ | Medium |
| DDD | Specification | Domain | 5+ | Very High |
| BDD | Behavior | Collaboration | 5+ | Medium |
| ATDD | Behavior | Compliance | 5+ | Medium |
| CDD | Behavior | APIs | 5+ | Medium |
| FDD | Delivery | Features | 10+ | Medium |
| Context Eng. | Delivery | AI sessions | Any | Low |
| TDD | Implementation | Quality | Any | Low |
| Eval-Driven | Implementation | AI outputs | Any | Medium |
| Multi-Agent | Implementation | Complexity | Any | Medium |
| Iterative | Optimization | Refinement | Any | Low |
| Prompt Eng. | Optimization | Foundation | Any | Very Low |

  • SDD & Spec-First
  • BMAD
  • TDD with AI
  • BDD & DDD
  • Context Engineering
  • Eval-Driven & Multi-Agent


  1. Thoughtworks Technology Radar Vol 33, Nov 2025. PDF. See also: Macro trends blog post.