
2. Core Workflow

What you’ll learn: The mental model and critical workflows for Claude Code mastery.

  • Interaction Loop: Describe → Analyze → Review → Accept/Reject cycle
  • Context Management 🔴 CRITICAL: Watch Ctx(u): — /compact at 70%, /clear at 90%
  • Plan Mode: Read-only exploration before making changes
  • Rewind: Undo with Esc×2 or /rewind
  • Mental Model: Claude = expert pair programmer, not autocomplete

Always check context % before starting complex tasks. High context = degraded quality.

Read this section if: You want to avoid the #1 mistake (context overflow).
Skip if: You just need a quick command reference (go to Section 10).


Reading time: 20 minutes

Skill level: Day 1-3

Goal: Understand how Claude Code thinks

Every Claude Code interaction follows this pattern:

┌───────────────────────────────────────────────┐
│               INTERACTION LOOP                │
├───────────────────────────────────────────────┤
│ 1. DESCRIBE ──→ You explain what you need     │
│        ▼                                      │
│ 2. ANALYZE ──→ Claude explores the codebase   │
│        ▼                                      │
│ 3. PROPOSE ──→ Claude suggests changes (diff) │
│        ▼                                      │
│ 4. REVIEW ──→ You read and evaluate           │
│        ▼                                      │
│ 5. DECIDE ──→ Accept / Reject / Modify        │
│        ▼                                      │
│ 6. VERIFY ──→ Run tests, check behavior       │
│        ▼                                      │
│ 7. COMMIT ──→ Save changes (optional)         │
└───────────────────────────────────────────────┘

The loop is designed so that you remain in control. Claude proposes, you decide.

🔴 This is the most important concept in Claude Code.

The zones:

  • 🟢 0-50%: Work freely
  • 🟡 50-75%: Be selective
  • 🔴 75-90%: /compact now
  • ⚫ 90%+: /clear required

When context is high:

  1. /compact (saves context, frees space)
  2. /clear (fresh start, loses history)

Prevention: Load only needed files, compact regularly, commit frequently


Context is Claude’s “working memory” for your conversation. It includes:

  • All messages in the conversation
  • Files Claude has read
  • Command outputs
  • Tool results

Claude has a 200,000 token context window. Think of it like RAM - when it fills up, things slow down or fail.
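A rough way to gauge how much of that window a file will consume is the common ~4 characters per token heuristic (an approximation only; real tokenization varies, and src/auth.ts is just an example path):

```shell
# Estimate a file's context footprint (~4 chars/token is an assumption).
file="${FILE:-src/auth.ts}"            # any file you plan to load
if [ -f "$file" ]; then
  chars=$(wc -c < "$file")
  echo "~$((chars / 4)) tokens (~$((chars / 4 * 100 / 200000))% of a 200K window)"
fi
```

A 2,000-line file easily lands in the thousands of tokens, which is why the cost table later in this section flags large-file reads as expensive.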

The statusline shows your context usage:

Claude Code │ Ctx(u): 45% │ Cost: $0.23 │ Session: 1h 23m
| Metric | Meaning |
| --- | --- |
| Ctx(u): 45% | You’ve used 45% of context |
| Cost: $0.23 | API cost so far |
| Session: 1h 23m | Time elapsed |

The default statusline can be enhanced with more detailed information like git branch, model name, and file changes.

Option 1: ccstatusline (recommended)

Add to ~/.claude/settings.json:

{
  "statusLine": {
    "type": "command",
    "command": "npx -y ccstatusline@latest",
    "padding": 0
  }
}

This displays: Model: Sonnet 4.6 | Ctx: 0 | ⎇ main | (+0,-0) | Cost: $0.27 | Session: 0m | Ctx(u): 0.0%

Option 2: Custom script

Create your own script that:

  1. Reads JSON data from stdin (model, context, cost, git info)
  2. Outputs a single formatted line to stdout
  3. Supports ANSI colors for styling
{
  "statusLine": {
    "type": "command",
    "command": "/path/to/your/statusline-script.sh",
    "padding": 0
  }
}
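A minimal sketch of such a script (requires jq; the JSON field names .model.display_name and .workspace.current_dir are assumptions — verify them for your Claude Code version):

```shell
#!/bin/sh
# Minimal statusline sketch: parse the JSON payload and print one line.
# Field names (.model.display_name, .workspace.current_dir) are assumptions.
statusline() {
  input=$(cat)                                   # JSON from Claude Code on stdin
  model=$(printf '%s' "$input" | jq -r '.model.display_name // "Claude"')
  dir=$(printf '%s' "$input" | jq -r '.workspace.current_dir // "?"')
  printf '%s | %s\n' "$model" "${dir##*/}"
}
echo '{}' | statusline    # demo input; a real script ends with plain `statusline`
```

In production the last line would be just `statusline`, so the function reads the payload Claude Code pipes in.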

Use the /statusline command in Claude Code to auto-generate a starter script.

| Zone | Usage | Action |
| --- | --- | --- |
| 🟢 Green | 0-50% | Work freely |
| 🟡 Yellow | 50-75% | Start being selective |
| 🔴 Red | 75-90% | Use /compact or /clear |
| ⚫ Critical | 90%+ | Must clear or risk errors |

When context gets high:

Option 1: Compact (/compact)

  • Summarizes the conversation
  • Preserves key context
  • Reduces usage by ~50%

Option 2: Clear (/clear)

  • Starts fresh
  • Loses all context
  • Use when changing topics

Option 3: Summarize from here (v2.1.32+)

  • Use /rewind (or Esc + Esc) to open the checkpoint list
  • Select a checkpoint and choose “Summarize from here”
  • Claude summarizes everything from that point forward, keeping earlier context intact
  • Frees space while keeping critical context
  • More precise than full /compact

Option 4: Targeted Approach

  • Be specific in queries
  • Avoid “read the entire file”
  • Use symbol references: “read the calculateTotal function”

When approaching the red zone (75%+), /compact alone may not be enough. You need to actively decide what information to preserve before compacting.

Priority: Keep

| Keep | Why |
| --- | --- |
| CLAUDE.md content | Core instructions must persist |
| Files being actively edited | Current work context |
| Tests for the current component | Validation context |
| Critical decisions made | Architectural choices |
| Error messages being debugged | Problem context |

Priority: Evacuate

| Evacuate | Why |
| --- | --- |
| Files read but no longer relevant | One-time lookups |
| Debug output from resolved issues | Historical clutter |
| Long conversation history | Summarized by /compact |
| Files from completed tasks | No longer needed |
| Large config files | Can be re-read if needed |

Pre-Compact Checklist:

  1. Document critical decisions in CLAUDE.md or a session note
  2. Commit pending changes to git (creates restore point)
  3. Note the current task explicitly (“We’re implementing X”)
  4. Run /compact to summarize and free space
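The first three steps can be scripted; a sketch (the note file path and commit message are examples, not a Claude Code convention):

```shell
# Pre-compact checkpoint: note the current task, then create a git restore point.
mkdir -p claudedocs
echo "$(date +%F): implementing refresh-token flow (Strategy A for auth)" \
  >> claudedocs/session-notes.md
git add -A 2>/dev/null || true
git commit -m "checkpoint before /compact" --quiet 2>/dev/null || true
```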

Pro tip: If you know you’ll need specific information post-compact, tell Claude explicitly: “Before we compact, remember that we decided to use Strategy A for authentication because of X.” Claude will include this in the summary.

Claude Code has three distinct memory systems. Understanding the difference is crucial for effective long-term work:

| Aspect | Session Memory | Auto-Memory (native) | Persistent Memory (Serena) |
| --- | --- | --- | --- |
| Scope | Current conversation only | Across sessions, per-project | Across all sessions |
| Managed by | /compact, /clear | /memory command (automatic) | write_memory() via Serena MCP |
| Lost when | Session ends or /clear | Explicitly deleted via /memory | Explicitly deleted from Serena |
| Requires | Nothing | Nothing (v2.1.59+) | Serena MCP server |
| Use case | Immediate working context | Key decisions, context snippets | Architectural decisions, patterns |

Session Memory (short-term):

  • Everything in your current conversation
  • Files Claude has read, commands run, decisions made
  • Managed with /compact (compress) and /clear (reset)
  • Disappears when you close Claude Code

Auto-Memory (native, v2.1.59+):

  • Built into Claude Code — no MCP server or configuration required
  • Claude automatically saves useful context (decisions, patterns, preferences) to MEMORY.md files
  • Organized per-project: .claude/memory/MEMORY.md or ~/.claude/projects/<path>/memory/MEMORY.md
  • Managed with /memory: view, edit, or delete what’s been saved
  • Survives across sessions automatically

Persistent Memory (long-term, Serena MCP):

  • Requires Serena MCP server installed
  • Explicitly saved with write_memory("key", "value")
  • Survives across sessions
  • Ideal for: architectural decisions, API patterns, coding conventions

Pattern: End-of-Session Save

# Before ending a productive session:
"Save our authentication decision to memory:
- Chose JWT over sessions for scalability
- Token expiry: 15min access, 7d refresh
- Store refresh tokens in httpOnly cookies"
# Claude calls: write_memory("auth_decisions", "...")
# Next session:
"What did we decide about authentication?"
# Claude calls: read_memory("auth_decisions")

When to use which:

  • Session memory: Active problem-solving, debugging, exploration
  • Auto-memory: Decisions and context you want Claude to rediscover next session without manual effort (v2.1.59+)
  • Persistent memory (Serena): Structured key-value store for architectural decisions across many projects
  • CLAUDE.md: Team conventions, project structure (versioned with git)

Research shows LLM performance degrades significantly with accumulated context:

  • 20-30% performance gap between focused and polluted prompts (Chroma, 2025)
  • Degradation starts at ~16K tokens for Claude models
  • Failed attempts, error traces, and iteration history dilute attention

Instead of managing context within a session, you can restart with a fresh session per task while persisting state externally.

Terminal window
# Canonical "Ralph Loop" (Geoffrey Huntley)
while :; do cat TASK.md PROGRESS.md | claude -p ; done

State persists via:

  • TASK.md — Current task definition with acceptance criteria
  • PROGRESS.md — Learnings, completed tasks, blockers
  • Git commits — Each iteration commits atomically

Variant: tasks/lessons.md

A lightweight alternative for interactive sessions (no loop required): after each user correction, Claude updates tasks/lessons.md with the rule to avoid the same mistake. Reviewed at the start of each new session.

tasks/
├── todo.md # Current plan (checkable items)
└── lessons.md # Rules accumulated from corrections

The difference from PROGRESS.md: lessons.md captures behavioral rules (“always diff before marking done”, “never mock without asking”) rather than task state. It compounds over time — the mistake rate drops as the ruleset grows.
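Maintaining the file is deliberately low-tech: one appended rule per correction (the rule text below is an example):

```shell
# Record a behavioral rule after a user correction (tasks/ layout from above).
mkdir -p tasks
echo '- Always diff before marking a task done' >> tasks/lessons.md
tail -n 1 tasks/lessons.md
```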

| Traditional | Fresh Context |
| --- | --- |
| Accumulate in chat history | Reset per task |
| /compact to compress | State in files + git |
| Context bleeds across tasks | Each task gets full attention |

| Situation | Use |
| --- | --- |
| Context 70-90%, staying interactive | /compact |
| Context 90%+, need fresh start | /clear then continue |
| Long autonomous run, task-based | Fresh Context Pattern |
| Overnight/AFK execution | Fresh Context Pattern |

Good fit:

  • Autonomous sessions >1 hour
  • Migrations, large refactorings
  • Tasks with clear success criteria (tests pass, build succeeds)

Poor fit:

  • Interactive exploration
  • Design without clear spec
  • Tasks with slow/ambiguous feedback loops

Variant: Session-per-Concern Pipeline

Instead of looping the same task, dedicate a fresh session to each quality dimension:

  1. Plan session — Architecture, scope, acceptance criteria
  2. Test session — Write unit, integration, and E2E tests first (TDD)
  3. Implement session — Code until all linters and tests pass
  4. Review sessions — Separate sessions for security audit, performance, code review
  5. Repeat — Iterate with scope adjustments as needed

This combines Fresh Context (clean 200K per phase) with OpusPlan (Opus for review/strategy sessions, Sonnet for implementation). Each session generates progress artifacts that feed the next.

Option 1: Manual loop

Terminal window
# Simple fresh-context loop
for i in {1..10}; do
  echo "=== Iteration $i ==="
  claude -p "$(cat TASK.md PROGRESS.md)"
  git diff --stat              # Check progress
  read -p "Continue? (y/n) " -n 1 -r
  [[ ! $REPLY =~ ^[Yy]$ ]] && break
done

Option 2: Script (see examples/scripts/fresh-context-loop.sh)

Terminal window
./fresh-context-loop.sh 10 TASK.md PROGRESS.md

Option 3: External orchestrators

  • AFK CLI — Zero-config orchestration across task sources
TASK.md
## Current Focus
[Single atomic task with clear deliverable]
## Acceptance Criteria
- [ ] Tests pass
- [ ] Build succeeds
- [ ] [Specific verification]
## Context
- Related files: [paths]
- Constraints: [rules]
## Do NOT
- Start other tasks
- Refactor unrelated code

/compact preserves conversation flow. Fresh context maximizes per-task attention at the cost of continuity.

Sources: Chroma Research - Context Rot | Ralph Loop Origin | METR - Long Task Capability | Anthropic - Context Engineering

| Action | Context Cost |
| --- | --- |
| Reading a small file | Low (~500 tokens) |
| Reading a large file | High (~5K+ tokens) |
| Running commands | Medium (~1K tokens) |
| Multi-file search | High (~3K+ tokens) |
| Long conversations | Accumulates |

Learn to recognize when context is running out:

| Symptom | Severity | Action |
| --- | --- | --- |
| Shorter responses than usual | 🟡 Warning | Continue with caution |
| Forgetting CLAUDE.md instructions | 🟠 Serious | Document state, prepare checkpoint |
| Inconsistencies with earlier conversation | 🔴 Critical | New session needed |
| Errors on code already discussed | 🔴 Critical | New session needed |
| "I can't access that file" (when it was read) | 🔴 Critical | New session immediately |

Check your context usage in detail:

/context

Example output:

┌──────────────────────────────────────────────────────────┐
│ CONTEXT USAGE                                   62% used │
├──────────────────────────────────────────────────────────┤
│ System Prompt         ████████░░░░░░░░░░░░░░░░ 12,450 tk │
│ System Tools          ██░░░░░░░░░░░░░░░░░░░░░░  3,200 tk │
│ MCP Tools (5 servers) ████████████░░░░░░░░░░░░ 18,600 tk │
│ Conversation          ████████████████████░░░░ 89,200 tk │
├──────────────────────────────────────────────────────────┤
│ TOTAL                                         123,450 tk │
│ REMAINING                                      76,550 tk │
└──────────────────────────────────────────────────────────┘

💡 The Last 20% Rule: Reserve ~20% of context for:

  • Multi-file operations at end of session
  • Last-minute corrections
  • Generating summary/checkpoint

Claude Code isn’t free - you’re using API credits. Understanding costs helps optimize usage.

The default model depends on your subscription: Max/Team Premium subscribers get Opus 4.6 by default, while Pro/Team Standard subscribers get Sonnet 4.6. If Opus usage hits the plan threshold, Claude Code automatically falls back to Sonnet.

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
| --- | --- | --- | --- | --- |
| Sonnet 4.6 | $3.00 | $15.00 | 200K tokens | Default model (Feb 2026) |
| Sonnet 4.5 | $3.00 | $15.00 | 200K tokens | Legacy (same price) |
| Opus 4.6 (standard) | $5.00 | $25.00 | 200K tokens | Released Feb 2026 |
| Opus 4.6 (1M context beta) | $10.00 | $37.50 | 1M tokens | Requests >200K context |
| Opus 4.6 (fast mode) | $30.00 | $150.00 | 200K tokens | 2.5x faster, 6x price |
| Haiku 4.5 | $0.80 | $4.00 | 200K tokens | Budget option |

Reality check: A typical 1-hour session costs $0.10 - $0.50 depending on usage patterns.

Model deprecations (Feb 2026): claude-3-haiku-20240307 (Claude 3 Haiku) was deprecated on February 19, 2026 with retirement scheduled for April 20, 2026. If your CLAUDE.md, agent definitions, or scripts hardcode this model ID, migrate to claude-haiku-4-5-20251001 (Haiku 4.5) before April 2026. Source: platform.claude.com/docs/model-deprecations
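A quick way to audit a repository for the retired ID (a sketch; extend the --include list to whatever file types your configs use):

```shell
# Scan for the deprecated model ID named in the deprecation notice above.
if grep -rn "claude-3-haiku-20240307" . \
    --include='*.md' --include='*.json' --include='*.sh' --include='*.yml'; then
  echo "Found hardcoded references: migrate to claude-haiku-4-5-20251001"
else
  echo "No deprecated model IDs found"
fi
```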

200K vs 1M Context: Performance, Cost & Use Cases


The 1M context window (beta, API + usage tier 4 required) is a significant capability jump — but community feedback consistently frames it as a niche premium tool, not a default.

Retrieval accuracy at scale (MRCR v2 8-needle 1M variant)

| Model | 256K accuracy | 1M accuracy | Source |
| --- | --- | --- | --- |
| Opus 4.6 | 93% | 76% | Anthropic blog + independent analysis (Feb 2026) |
| Sonnet 4.5 | | 18.5% | Anthropic blog (Feb 2026) |
| Sonnet 4.6 | Not yet published | Not yet published | |
The benchmark is the “8-needle 1M variant” — finding 8 specific facts in a 1M-token document. Opus 4.6 drops from 93% to 76% when scaling from 256K to 1M; Sonnet 4.5 collapses to 18.5%. Community validation: a developer loaded ~733K tokens (4 Harry Potter books) and Opus 4.6 retrieved 49/50 documented spells in a single prompt (HN, Feb 2026). Sonnet 4.6 MRCR not yet published, but community reports suggest it “struggles with following specific instructions and retrieving precise information” at full 1M context.

Cost per session (approximate)

Above 200K input tokens, all tokens in the request are charged at premium rates — not just the excess. Applies to both Sonnet 4.6 and Opus 4.6.

| Session type | ~Tokens in | ~Tokens out | Sonnet 4.6 | Opus 4.6 |
| --- | --- | --- | --- | --- |
| Bug fix / PR review (≤200K) | 50K | 5K | ~$0.23 | ~$0.38 |
| Module refactoring (≤200K) | 150K | 20K | ~$0.75 | ~$1.25 |
| Full service analysis (>200K, 1M beta) | 500K | 50K | ~$4.13 | ~$6.88 |
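These figures follow directly from the premium per-MTok rates quoted in this section (the arithmetic is the point, not the exact prices):

```shell
# Above 200K input, every token in the request is billed at premium rates.
awk 'BEGIN {
  in_tok = 500000; out_tok = 50000               # "full service analysis" session
  sonnet = in_tok/1e6 * 6.00  + out_tok/1e6 * 22.50
  opus   = in_tok/1e6 * 10.00 + out_tok/1e6 * 37.50
  printf "Sonnet 4.6: $%.3f  Opus 4.6: $%.3f\n", sonnet, opus
}'
# → Sonnet 4.6: $4.125  Opus 4.6: $6.875
```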

For comparison: Gemini 1.5 Pro offers a 2M context window at $3.50/$10.50/MTok — significantly cheaper for pure long-context RAG. Community advice: use Gemini for large-document RAG, Claude for reasoning quality and agentic workflows.

When to use which

| Scenario | Recommendation |
| --- | --- |
| Bug fix, PR review, daily coding | Sonnet 4.6 @ 200K — fast and cheap |
| Full-repo audit, entire codebase load | Opus 4.6 @ 1M — worth the cost for precision |
| Cross-module refactoring | Sonnet 4.6 @ 1M — but weigh cost vs. chunking + RAG |
| Architecture analysis, Agent Teams | Opus 4.6 @ 1M — strongest retrieval at scale |
| Large-document RAG (PDFs, legal, books) | Consider Gemini 1.5 Pro — cheaper at this scale |

Key facts

  • Opus 4.6 max output: 128K tokens; Sonnet 4.6 max output: 64K tokens
  • 1M context ≈ 30,000 lines of code / 750,000 words
  • 1M context is beta — requires anthropic-beta: context-1m-2025-08-07 header, usage tier 4 or custom rate limits
  • Above 200K input tokens: Sonnet 4.6 doubles to $6/$22.50/MTok; Opus 4.6 doubles to $10/$37.50/MTok
  • If input stays ≤200K, standard pricing applies even with the beta flag enabled
  • Practical workaround: check context at ~70% and open a new session rather than hitting compaction (HN pattern)
  • Community consensus: 200K + RAG is the default; 1M Opus is reserved for cases where loading everything at once is genuinely necessary
| Action | Tokens Consumed | Estimated Cost |
| --- | --- | --- |
| Read a 100-line file | ~500 | $0.0015 |
| Read 10 files (1000 lines) | ~5,000 | $0.015 |
| Long conversation (20 messages) | ~30,000 | $0.090 |
| MCP tool call (Serena, Context7) | ~2,000 | $0.006 |
| Running tests (with output) | ~3,000-10,000 | $0.009-$0.030 |
| Code generation (100 lines) | ~2,000 output | $0.030 |

The expensive operations:

  1. Reading entire large files - 2000+ line files add up fast
  2. Multiple MCP server calls - Each server adds ~2K tokens overhead
  3. Long conversations without /compact - Context accumulates
  4. Repeated trial and error - Each iteration costs

Strategy 1: Be specific in queries

Terminal window
# ❌ Expensive - reads entire file
"Check auth.ts for issues"
# ~5K tokens if file is large
# ✅ Cheaper - targets specific location
"Check the login function in auth.ts:45-60"
# ~500 tokens

Strategy 2: Use /compact proactively

Terminal window
# Without /compact - conversation grows
Context: 10% → 30% → 50% → 70% → 90%
# Cost per message increases as context grows

# With /compact at 70%
Context: 10% → 30% → 50% → 70% → [/compact] → 30% → 50%
# Frees significant context space for subsequent messages

Strategy 3: Choose the right model

Terminal window
# Use Haiku for simple tasks (4x cheaper input, 3.75x cheaper output)
claude --model haiku "Fix this typo in README.md"
# Use Sonnet (default) for standard work
claude "Refactor this module"
# Use Opus only for critical/complex tasks
claude --model opus "Design the entire authentication system"

Strategy 4: Limit MCP servers

// ❌ Expensive - 5 MCP servers loaded
{
  "mcpServers": {
    "serena": {...},
    "context7": {...},
    "sequential": {...},
    "playwright": {...},
    "postgres": {...}
  }
}
// ~10K tokens overhead per session

// ✅ Cheaper - load only what you need
{
  "mcpServers": {
    "serena": {...} // Only for this project
  }
}
// ~2K tokens overhead

Strategy 5: Batch operations

Terminal window
# ❌ Expensive - 5 separate prompts
"Read file1.ts"
"Read file2.ts"
"Read file3.ts"
"Read file4.ts"
"Read file5.ts"
# ✅ Cheaper - single batched request
"Read file1.ts, file2.ts, file3.ts, file4.ts, file5.ts and analyze them together"
# Shared context, single response

Strategy 6: Use prompt caching for repeated context (API)

If you call the Anthropic API directly (e.g., for custom agents or pipelines), prompt caching cuts costs by up to 90% on repeated prefixes.

# Mark stable sections with cache_control
response = client.messages.create(
model="claude-sonnet-4-6-20250514",
max_tokens=1024,
system=[
{
"type": "text",
"text": "<your large system prompt / codebase context>",
"cache_control": {"type": "ephemeral"} # Cache this prefix
}
],
messages=[{"role": "user", "content": "Fix the bug in auth.ts"}]
)

Prompt caching economics:

| Operation | Cost multiplier | TTL |
| --- | --- | --- |
| Cache write | 1.25x base price | 5 minutes (default) |
| Cache write (extended) | 2x base price | 1 hour |
| Cache read (hit) | 0.1x base price | |
| Latency reduction | Up to 85% for long prompts | |

Break-even: 2 cache hits with 5-minute TTL. After that, pure savings.
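The break-even claim can be checked with those multipliers (base price = 1.0x):

```shell
# Relative cost of sending the same prefix n times, with vs. without caching.
awk 'BEGIN {
  n = 3                                # 1 cache write + 2 cache hits
  no_cache = n * 1.0                   # every send at base price
  cached   = 1.25 + (n - 1) * 0.1      # one write, then hits at 0.1x
  printf "no-cache: %.2fx  cached: %.2fx\n", no_cache, cached
}'
# → no-cache: 3.00x  cached: 1.45x
```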

Rules:

  • Max 4 cache breakpoints per request
  • Cache key = exact prefix match (single character change = cache miss)
  • Place breakpoints after large stable sections: system prompt, tool definitions, codebase context
  • For Claude Code itself: caching is handled automatically by the CLI — this applies to API-based workflows you build on top of Claude

Docs: prompt caching

Real-time tracking:

The status line shows current session cost:

Claude Code │ Ctx(u): 45% │ Cost: $0.23 │ Session: 1h 23m
                             ↑ Current session cost

Advanced tracking with ccusage:

The ccusage CLI tool provides detailed cost analytics beyond the /cost command:

Terminal window
ccusage # Overview all periods
ccusage --today # Today's costs
ccusage --month # Current month
ccusage --session # Active session breakdown
ccusage --model-breakdown # Cost by model (Sonnet/Opus/Haiku)

Example output:

┌──────────────────────────────────────────────────────┐
│ USAGE SUMMARY - January 2026                         │
├──────────────────────────────────────────────────────┤
│ Today          $2.34   (12 sessions)                 │
│ This week      $8.91   (47 sessions)                 │
│ This month    $23.45  (156 sessions)                 │
├──────────────────────────────────────────────────────┤
│ MODEL BREAKDOWN                                      │
│ Sonnet 3.5    85%    $19.93                          │
│ Opus 4.6      12%     $2.81                          │
│ Haiku 3.5      3%     $0.71                          │
└──────────────────────────────────────────────────────┘

Why use ccusage over /cost?

  • Historical trends: Track usage patterns over days/weeks/months
  • Model breakdown: See which model tier drives costs
  • Budget planning: Set monthly spending targets
  • Team analytics: Aggregate costs across developers

For a full inventory of community cost trackers, session viewers, config managers, and alternative UIs, see Third-Party Tools.

Monthly tracking:

Check your Anthropic Console for detailed usage.

Cost budgeting:

Terminal window
# Set a mental budget per session
- Quick task (5-10 min): $0.05-$0.10
- Feature work (1-2 hours): $0.20-$0.50
- Deep refactor (half day): $1.00-$2.00
# If you're consistently over budget:
1. Use /compact more often
2. Be more specific in queries
3. Consider using Haiku for simpler tasks
4. Reduce MCP servers

Perspective on costs: If Claude Code saves you meaningful time on a task, the API cost is usually negligible compared to your hourly rate. Don’t over-optimize for token costs at the expense of productivity.

When to optimize:

  • ✅ You’re on a tight budget (student, hobbyist)
  • ✅ High-volume usage (>4 hours/day)
  • ✅ Team usage (5+ developers)

When NOT to optimize:

  • ❌ Your time is more expensive than API costs
  • ❌ You’re spending more time optimizing than the savings
  • ❌ Optimization hurts productivity (being too restrictive)

For solo developers on a budget:

1. Start with Haiku for exploration/planning
2. Switch to Sonnet for implementation
3. Use /compact aggressively (every 50-60% context)
4. Limit to 1-2 MCP servers
5. Be specific in all queries
6. Batch operations when possible
Monthly cost estimate: $5-$15 for 20-30 hours

For professional developers:

1. Use Sonnet as default (optimal balance)
2. Use /compact when needed (70%+ context)
3. Use full MCP setup (productivity matters)
4. Don't micro-optimize queries
5. Use Opus for critical architectural decisions
Monthly cost estimate: $20-$50 for 40-80 hours

For teams:

1. Shared MCP infrastructure (Context7, Serena)
2. Standardized CLAUDE.md to avoid repeated explanations
3. Agent library to avoid rebuilding patterns
4. CI/CD integration for automation
5. Track costs per developer in Anthropic Console
Monthly cost estimate: $50-$200 for 5-10 developers
| Indicator | Cause | Fix |
| --- | --- | --- |
| Sessions consistently >$1 | Not using /compact | Set reminder at 70% context |
| Cost per message >$0.05 | Context bloat | Start fresh with /clear |
| >$5/day for hobby project | Over-using or inefficient queries | Review query specificity |
| Haiku failing simple tasks | Using wrong model tier | Use Sonnet for anything non-trivial |

Note: Anthropic’s plans evolve frequently. Always verify current pricing and limits at claude.com/pricing.

How Subscription Limits Work

Unlike API usage (pay-per-token), subscriptions use a hybrid model that’s deliberately opaque:

| Concept | Description |
| --- | --- |
| 5-hour rolling window | Primary limit; resets when you send the next message after 5 hours lapse |
| Weekly aggregate cap | Secondary limit; resets every 7 days. Both apply simultaneously |
| Hybrid counting | Advertised as “messages” but actual capacity is token-based, varying by code complexity, file size, and context |
| Model weighting | Opus consumes 8-10× more quota than Sonnet for equivalent work |

Approximate Token Budgets by Plan (Jan 2026, community-verified)

| Plan | 5-Hour Token Budget | Claude Code prompts/5h | Weekly Sonnet Hours | Weekly Opus Hours | Claude Code Access |
| --- | --- | --- | --- | --- | --- |
| Free | 0 | 0 | 0 | 0 | ❌ None |
| Pro ($20/mo) | ~44,000 tokens | ~10-40 prompts | 40-80 hours | N/A (Sonnet only) | ✅ Limited |
| Max 5x ($100/mo) | ~88,000-220,000 tokens | ~50-200 prompts | 140-280 hours | 15-35 hours | ✅ Full |
| Max 20x ($200/mo) | ~220,000+ tokens | ~200-800 prompts | 240-480 hours | 24-40 hours | ✅ Full |

Warning: These are community-measured estimates. Anthropic does not publish exact token limits, and limits have been reduced without announcement (notably Oct 2025). The 8-10× Opus/Sonnet ratio means Max 20x users get only ~24-40 Opus hours weekly despite paying $200/month. “Prompts/5h” is a rough practical translation of the token budget — actual capacity varies significantly with task complexity, context size, and sub-agent usage. Monthly cap: ~50 active 5-hour windows across all plans.

Why “Hours” Are Misleading

The term “hours of Sonnet 4” refers to elapsed wall-clock time during active processing, not calendar hours. This is not directly convertible to tokens without knowing:

  • Code complexity (larger files = higher per-token overhead)
  • Tool usage (Bash execution adds ~245 input tokens per call; text editor adds ~700)
  • Context re-reads and caching misses

Tier-Specific Strategies

| If you have… | Recommended approach |
| --- | --- |
| Pro plan | Sonnet only; batch sessions, avoid context bloat |
| Limited Opus quota | OpusPlan essential: Opus for planning, Sonnet for execution |
| Max 5x | Sonnet default, Opus only for architecture/complex debugging |
| Max 20x | More Opus freedom, but still monitor weekly usage (24-40h goes fast) |

The Pro User Pattern (validated by community):

1. Opus → Create detailed plan (high-quality thinking)
2. Sonnet/Haiku → Execute the plan (cost-effective implementation)
3. Result: Best reasoning where it matters, lower cost overall

This is exactly what OpusPlan mode does automatically (see Section 2.3).

Monitoring Your Usage

Terminal window
/status # Shows current session: cost, context %, model

Anthropic provides no in-app real-time usage metrics. Community tools like ccusage help track token consumption across sessions.

For subscription usage history: Check your Anthropic Console or Claude.ai settings.

Historical Note: In October 2025, users reported significant undocumented limit reductions coinciding with Sonnet 4.5’s release. Pro users who previously sustained 40-80 Sonnet hours weekly reported hitting limits after only 6-8 hours. Anthropic acknowledged the limits but did not explain the discrepancy.

Definition: When information from one task contaminates another.

Pattern 1: Style Bleeding

Task 1: "Create a blue button"
Claude: [Creates blue button]
Task 2: "Create a form"
Claude: [Creates form... with all buttons blue!]
↑ The "blue" bled into the new task
Solution: Use explicit boundaries
"---NEW TASK---
Create a form. Use default design system colors."

Pattern 2: Instruction Contamination

Instruction 1: "Always use arrow functions"
Instruction 2: "Follow project conventions" (which uses function)
Claude: [Paralyzed, alternating between styles]
Solution: Clarify priority
"In case of conflict, project conventions take precedence over my preferences."

Pattern 3: Temporal Confusion

Early session: "auth.ts contains login logic"
... 2h of work ...
You renamed auth.ts to authentication.ts
Claude: "I'll modify auth.ts..."
↑ Using outdated info
Solution: Explicit updates
"Note: auth.ts was renamed to authentication.ts"

Context Hygiene Checklist:

  • New tasks = explicit markdown boundaries
  • Structural changes = inform Claude explicitly
  • Contradictory instructions = clarify priority
  • Long session (>2h) = consider /clear or new session
  • Erratic behavior = check with /context

Verify that Claude has loaded your configuration correctly.

Simple Method:

  1. Add at the top of CLAUDE.md:
# My name is [Your Name]
# Project: [Project Name]
# Stack: [Your tech stack]
  2. Ask Claude: “What is my name? What project am I working on?”

  3. If correct → Configuration loaded properly

Advanced: Multiple Checkpoints

# === CHECKPOINT 1 === Project: MyApp ===
[... 500 lines of instructions ...]
# === CHECKPOINT 2 === Stack: Next.js ===
[... 500 lines of instructions ...]
# === CHECKPOINT 3 === Owner: [Name] ===

Ask “What is checkpoint 2?” to verify Claude read that far.

| Failure Symptom | Probable Cause | Solution |
| --- | --- | --- |
| Doesn’t know your name | CLAUDE.md not loaded | Check file location |
| Inconsistent answers | Typo in filename | Must be CLAUDE.md (not clause.md) |
| Partial knowledge | Context exhausted | /clear or new session |

When ending a session or switching contexts, create a handoff document to maintain continuity.

Purpose: Bridge the gap between sessions by documenting state, decisions, and next steps.

Template:

# Session Handoff - [Date] [Time]
## What Was Accomplished
- [Key task 1 completed]
- [Key task 2 completed]
- [Files modified: list]
## Current State
- [What's working]
- [What's partially done]
- [Known issues or blockers]
## Decisions Made
- [Architectural choice 1: why]
- [Technology selection: rationale]
- [Trade-offs accepted]
## Next Steps
1. [Immediate next task]
2. [Dependent task]
3. [Follow-up validation]
## Context for Next Session
- Branch: [branch-name]
- Key files: [list 3-5 most relevant]
- Dependencies: [external factors]

When to create handoff documents:

| Scenario | Why |
| --- | --- |
| End of work day | Resume seamlessly tomorrow |
| Before context limit | Preserve state before /clear |
| Switching focus areas | Different task requires fresh context |
| Interruption expected | Emergency or meeting disrupts work |
| Complex debugging | Document hypotheses and tests tried |

Storage location: claudedocs/handoffs/handoff-YYYY-MM-DD.md
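Creating the file at that path can be scripted (a sketch; trim the skeleton to the sections you actually use):

```shell
# Scaffold today's handoff file following the claudedocs/handoffs convention.
mkdir -p claudedocs/handoffs
f="claudedocs/handoffs/handoff-$(date +%F).md"
if [ ! -f "$f" ]; then
  printf '# Session Handoff - %s\n\n## What Was Accomplished\n\n## Next Steps\n' \
    "$(date +%F)" > "$f"
fi
echo "Handoff file: $f"
```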

Pro tip: Ask Claude to generate the handoff:

You: "Create a session handoff document for what we accomplished today"

Claude will analyze git status, conversation history, and generate a structured handoff.

Plan Mode is Claude Code’s “look but don’t touch” mode.

/plan

Or ask Claude directly:

You: Let's plan this feature before implementing
In Plan Mode, Claude can:

  • ✅ Read files
  • ✅ Search the codebase
  • ✅ Analyze architecture
  • ✅ Propose approaches
  • ✅ Write to a plan file

Claude cannot:

  • ❌ Edit files
  • ❌ Run commands that modify state
  • ❌ Create new files
  • ❌ Make commits
| Situation | Use Plan Mode? |
| --- | --- |
| Exploring unfamiliar codebase | ✅ Yes |
| Investigating a bug | ✅ Yes |
| Planning a new feature | ✅ Yes |
| Fixing a typo | ❌ No |
| Quick edit to known file | ❌ No |

Recommended frequency: Boris Cherny (Head of Claude Code at Anthropic) starts approximately 80% of tasks in Plan Mode — letting Claude plan before writing a single line of code. Once the plan is approved, execution is almost always correct on the first try. — Lenny’s Newsletter, February 19, 2026

Press Shift+Tab to toggle back to Normal Mode (Act Mode). You can also type a message and Claude will ask: “Ready to implement this plan?”

Note: Shift+Tab toggles between Plan Mode and Normal Mode during a session. Use Shift+Tab twice from Normal Mode to enter Plan Mode, once from Plan Mode to return.

Concept: Automatically trigger planning mode before any risky operation.

Configuration File (~/.claude/auto-plan-mode.txt):

Before executing ANY tool (Read, Write, Edit, Bash, Grep, Glob, WebSearch), you MUST:
1. FIRST: Use exit_plan_mode tool to present your plan
2. WAIT: For explicit user approval before proceeding
3. ONLY THEN: Execute the planned actions
Each new user request requires a fresh plan - previous approvals don't carry over.

Launch with Auto Plan Mode:

macOS/Linux:

Terminal window
# Direct
claude --append-system-prompt "Before executing ANY tool..."
# Via file (recommended)
claude --append-system-prompt "$(cat ~/.claude/auto-plan-mode.txt)"
# Alias in .zshrc/.bashrc
alias claude-safe='claude --append-system-prompt "$(cat ~/.claude/auto-plan-mode.txt)"'

Windows (PowerShell):

Terminal window
# Create the config file at %USERPROFILE%\.claude\auto-plan-mode.txt with the same content
# Direct
claude --append-system-prompt "Before executing ANY tool..."
# Via file (add to $PROFILE)
function claude-safe {
    $planPrompt = Get-Content "$env:USERPROFILE\.claude\auto-plan-mode.txt" -Raw
    claude --append-system-prompt $planPrompt $args
}
}

Resulting Workflow:

User: "Add an email field to the User model"
Claude (Auto Plan Mode active):
┌─────────────────────────────────────────────────────────────┐
│ 📋 PROPOSED PLAN │
│ │
│ 1. Read schema.prisma to understand current model │
│ 2. Add field email: String? @unique │
│ 3. Generate Prisma migration │
│ 4. Update TypeScript types │
│ 5. Add Zod validation in routers │
│ │
│ ⚠️ Impact: 3 files modified, 1 migration created │
│ │
│ Approve this plan? (y/n) │
└─────────────────────────────────────────────────────────────┘
User: "y"
Claude: [Executes the plan]

Result: 76% fewer tokens with better results because the plan is validated before execution.

Claude Code supports six model aliases via /model (each always resolves to the latest version):

| Alias | Resolves To | Use Case |
| --- | --- | --- |
| default | Latest model for your plan tier | Standard usage |
| sonnet | Claude Sonnet 4.6 | Fast, cost-efficient |
| opus | Claude Opus 4.6 | Deep reasoning |
| haiku | Claude Haiku 4.5 | Budget, high-volume |
| sonnet[1m] | Sonnet with 1M context | Large codebases |
| opusplan | Opus (plan) + Sonnet (act) | Hybrid intelligence |

Model can also be set via claude --model <alias>, ANTHROPIC_MODEL env var, or "model" in settings.json. Priority: /model > --model flag > ANTHROPIC_MODEL > settings.json.

Concept: Use Opus for planning (superior reasoning) and Sonnet for implementation (cost-efficient).

Why OpusPlan?

  • Cost optimization: Opus tokens cost more than Sonnet
  • Best of both worlds: Opus-quality planning + Sonnet-speed execution
  • Token savings: Planning is typically shorter than implementation

Activation:

/model opusplan

Or in ~/.claude/settings.json:

{
  "model": "opusplan"
}

How It Works:

  1. In Plan Mode (/plan or Shift+Tab twice) → Uses Opus
  2. In Act Mode (normal execution) → Uses Sonnet
  3. Automatic switching based on mode

Recommended Workflow:

1. /model opusplan → Enable OpusPlan
2. Shift+Tab × 2 → Enter Plan Mode (Opus)
3. Describe your task → Get Opus-quality planning
4. Shift+Tab → Exit to Act Mode (Sonnet)
5. Execute the plan → Sonnet implements efficiently

Alternative Approach with Subagents:

You can also control model usage per agent:

.claude/agents/planner.md:

---
name: planner
model: opus
tools: Read, Grep, Glob
---
# Strategic Planning Agent

.claude/agents/implementer.md:

---
name: implementer
model: haiku
tools: Write, Edit, Bash
---
# Fast Implementation Agent

Pro Users Note: OpusPlan is particularly valuable for Pro subscribers with limited Opus tokens. It lets you leverage Opus reasoning for critical planning while preserving tokens for more sessions.

Budget Variant: SonnetPlan (Community Hack)

opusplan is hardcoded to Opus+Sonnet — there’s no native sonnetplan alias. But you can remap what the opus and sonnet aliases resolve to via environment variables, effectively creating a Sonnet→Haiku hybrid:

# Add to ~/.zshrc
sonnetplan() {
  ANTHROPIC_DEFAULT_OPUS_MODEL=claude-sonnet-4-6 \
  ANTHROPIC_DEFAULT_SONNET_MODEL=claude-haiku-4-5-20251001 \
  claude "$@"
}

Inside a session launched via the sonnetplan wrapper, /model opusplan routes:

  • Plan Mode → Sonnet 4.6 (via remapped opus alias)
  • Act Mode → Haiku 4.5 (via remapped sonnet alias)

Caveat: The model’s self-report (what model are you?) is unreliable — models don’t always know their own identity. Trust the status bar (Model: Sonnet 4.6 in plan mode) or verify via billing dashboard. GitHub issue #9749 tracks native support.

Concept: Run multiple rounds of planning and deep thinking before executing. Like warming up an engine before driving.

Standard workflow: think → plan → execute. Rev the Engine: think → plan → think harder → refine plan → think hardest → finalize → execute.

When to use:

  • Critical architectural decisions (irreversible, high-impact)
  • Complex migrations affecting 10+ files
  • Unfamiliar domain where first instincts are often wrong

Pattern:

## Round 1: Initial analysis
User: /plan
User: Analyze the current auth system. What are the key components,
dependencies, and potential risks of migrating to OAuth2?
Claude: [Initial analysis]
## Round 2: Deep challenge
User: Now use extended thinking. Challenge your own analysis:
- What assumptions did you make?
- What failure modes did you miss?
- What would a senior security engineer flag?
Claude: [Deeper analysis with self-correction]
## Round 3: Final plan
User: Based on both rounds, write the definitive migration plan.
Include rollback strategy and risk mitigation for each step.
Claude: [Refined plan incorporating both rounds]
## Execute
User: /execute
User: Implement the plan from round 3.

Why it works: Each round forces Claude to reconsider assumptions. Round 2 typically catches 30-40% of issues that round 1 missed. Round 3 synthesizes into a more robust plan.

📊 Empirical backing — Anthropic AI Fluency Index (Feb 2026)

An Anthropic study analyzing 9,830 Claude conversations quantifies exactly why plan review works: users who iterate and question the AI’s reasoning are 5.6× more likely to catch missing context and errors compared to users who accept the first output. A second round of review makes you 4× more likely to identify what was left out.

The Rev the Engine pattern operationalizes this finding: each round of deep challenge triggers the questioning behavior that produces measurably better plans.

Source: Swanson et al., “The AI Fluency Index”, Anthropic (2026-02-23) — anthropic.com/research/AI-fluency-index

Concept: Layer multiple Claude Code mechanisms for maximum intelligence on critical decisions.

Layer 1: Plan Mode → Safe exploration, no side effects
Layer 2: Extended Thinking → Deep reasoning with thinking tokens
Layer 3: Rev the Engine → Multi-round refinement
Layer 4: Split-Role Agents → Multi-perspective analysis
Layer 5: Permutation → Systematic variation testing

You don’t need all layers for every task. Match the stack depth to the decision’s impact:

| Decision Impact | Stack Depth | Example |
| --- | --- | --- |
| Low (fix typo) | 0 layers | Just do it |
| Medium (add feature) | 1-2 layers | Plan Mode + Extended Thinking |
| High (architecture) | 3-4 layers | Rev the Engine + Split-Role |
| Critical (migration) | 4-5 layers | Full stack |

Anti-pattern: Stacking on trivial decisions. If the change is reversible and low-risk, just execute. Over-planning is as wasteful as under-planning.


Rewind is Claude Code’s undo mechanism.

Access via Esc + Esc (double-tap Escape) or the /rewind command. This opens a scrollable checkpoint list.

Rewind provides four distinct actions from the checkpoint list:

| Action | Effect |
| --- | --- |
| Restore code and conversation | Revert both file changes and conversation to selected point |
| Restore conversation | Keep current code, rewind conversation only |
| Restore code | Revert file changes, keep conversation |
| Summarize from here | Compress conversation from selected point forward (frees space without reverting) |

Key distinction: Restore = undo (reverts state). Summarize = compress (frees space without reverting). Checkpoints persist across sessions (30-day cleanup).

Limitations:

  • Only works on Claude’s changes (not manual edits)
  • Works within the current session
  • Git commits are NOT automatically reverted

Before a risky operation:

You: Let's commit what we have before trying this experimental approach

This creates a git checkpoint you can always return to.

When things go wrong, you have multiple recovery options. Use the lightest-weight approach that solves your problem:

┌─────────────────────────────────────────────────────────┐
│ RECOVERY LADDER │
├─────────────────────────────────────────────────────────┤
│ │
│ Level 3: Git Restore (nuclear option) │
│ ───────────────────────────────────── │
│ • git checkout -- <file> (discard uncommitted) │
│ • git stash (save for later) │
│ • git reset --hard HEAD~1 (undo last commit) │
│ • Works for: Manual edits, multiple sessions │
│ │
│ Level 2: /rewind (session undo) │
│ ───────────────────────────── │
│ • Reverts Claude's recent file changes │
│ • Works within current session only │
│ • Doesn't touch git commits │
│ • Works for: Bad code generation, wrong direction │
│ │
│ Level 1: Reject Change (inline) │
│ ──────────────────────────── │
│ • Press 'n' when reviewing diff │
│ • Change never applied │
│ • Works for: Catching issues before they happen │
│ │
└─────────────────────────────────────────────────────────┘

When to use each level:

| Scenario | Recovery Level | Command |
| --- | --- | --- |
| Claude proposed bad code | Level 1 | Press `n` |
| Claude made changes, want to undo | Level 2 | `/rewind` |
| Changes committed, need full rollback | Level 3 | `git reset` |
| Experimental branch went wrong | Level 3 | `git checkout main` |
| Context corrupted, strange behavior | Fresh start | `/clear` + restate goal |

Pro tip: The /rewind command shows a list of changes to undo. You can selectively revert specific files rather than all changes.

For systematic experimentation, use the checkpoint pattern to create safe restore points:

┌─────────────────────────────────────────────────────────┐
│ CHECKPOINT WORKFLOW │
├─────────────────────────────────────────────────────────┤
│ │
│ 1. Create checkpoint │
│ ────────────────── │
│ git stash push -u -m "checkpoint-before-refactor" │
│ (saves all changes including untracked files) │
│ │
│ 2. Experiment freely │
│ ────────────────── │
│ Try risky refactoring, architectural changes, etc. │
│ If it works → commit normally │
│ If it fails → restore checkpoint │
│ │
│ 3. Restore checkpoint │
│ ────────────────── │
│ git stash list # find your checkpoint │
│ git stash apply stash@{0} # restore without delete │
│ # or │
│ git stash pop stash@{0} # restore and delete │
│ │
└─────────────────────────────────────────────────────────┘

Automated checkpoint: Create a Stop hook to auto-checkpoint on session end:

.claude/hooks/auto-checkpoint.sh
# See: examples/hooks/bash/auto-checkpoint.sh
# Automatically creates git stash on session end
# Naming: claude-checkpoint-{branch}-{timestamp}
# Logs to: ~/.claude/logs/checkpoints.log
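A minimal sketch of what such a hook's body might look like, following the naming pattern and log path described in the stub above. Everything here is illustrative, not the actual example script; in particular, the `CLAUDE_LOG_DIR` override is our invention for easy testing:

```shell
# Hypothetical body for .claude/hooks/auto-checkpoint.sh; adapt freely.
# Stashes all changes (including untracked) under a named label, then
# immediately re-applies them so the working tree is left exactly as it was.
auto_checkpoint() {
  git rev-parse --git-dir >/dev/null 2>&1 || return 0   # not in a repo: no-op
  [ -n "$(git status --porcelain)" ] || return 0        # nothing to checkpoint
  branch=$(git rev-parse --abbrev-ref HEAD)
  label="claude-checkpoint-${branch}-$(date +%Y%m%d-%H%M%S)"
  git stash push -u -m "$label" >/dev/null || return 1
  git stash apply --quiet >/dev/null 2>&1               # keep working tree unchanged
  logdir="${CLAUDE_LOG_DIR:-$HOME/.claude/logs}"        # env override is illustrative
  mkdir -p "$logdir"
  printf '%s %s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$label" >> "$logdir/checkpoints.log"
}
```

Register the script as a Stop hook in your settings (see §7). Restoring later is just `git stash apply` on the labeled entry from `git stash list`.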

Common workflows:

| Scenario | Workflow |
| --- | --- |
| Risky refactor | Checkpoint → Try → Commit or restore |
| A/B testing approaches | Checkpoint → Try A → Restore → Try B → Compare |
| Incremental migration | Checkpoint → Migrate piece → Test → Repeat |
| Prototype exploration | Checkpoint → Experiment → Discard cleanly |

Benefits over branching:

  • Faster than creating feature branches
  • Preserves uncommitted changes
  • Lightweight for quick experiments
  • Works across multiple files

Choosing the right model for each task is the fastest ROI improvement most Claude Code users can make. One decision per task — no overthinking.

Quick jump: Decision Table · Effort Levels · Model per Agent · When Thinking Helps

Cross-references: OpusPlan Mode · Rev the Engine · Cost Awareness


| Task | Model | Effort | Est. cost/task |
| --- | --- | --- | --- |
| Rename, format, boilerplate | Haiku | low | ~$0.02 |
| Generate unit tests | Haiku | low | ~$0.03 |
| CI/CD PR review (volume) | Haiku | low | ~$0.02 |
| Feature dev, standard debug | Sonnet | medium | ~$0.23 |
| Module refactoring | Sonnet | high | ~$0.75 |
| System architecture | Opus | high | ~$1.25 |
| Critical security audit | Opus | max | ~$2+ |
| Multi-agent orchestration | Sonnet + Haiku | mixed | variable |

Note on costs: Estimates based on API pricing (Haiku $0.80/$4.00 per MTok, Sonnet $3/$15, Opus $5/$25). Pro/Max subscribers pay a flat rate, so prioritize quality over cost. See Section 2.2 for full pricing breakdown.

Budget modifier (Teams Standard/Pro): downgrade one tier per phase — use Sonnet where the table says Opus, Haiku where it says Sonnet for mechanical implementation tasks. Community pattern: Sonnet for Plan → Haiku for Implementation on a $25/mo Teams Standard plan.


The effort parameter (Opus 4.6 API) controls the model’s overall computational budget — not just thinking tokens, but tool calls, verbosity, and analysis depth. Low effort = fewer tool calls, no preamble. High effort = more explanations, detailed analysis.

Calibrated gradient — one real prompt per level:

  • low — Mechanical, no design decisions needed

    "Rename getUserById to findUserById across src/" — Find-replace scope, zero reasoning required.

  • medium — Clear pattern, defined scope, one concern

    "Convert fetchUser() in api/users.ts from callbacks to async/await" — Pattern is known, scope bounded.

  • high — Design decisions, edge cases, multiple concerns

    "Redesign error handling in the payment module: add retry logic, partial failure recovery, and idempotency guarantees" — Architectural choices, not just pattern application.

  • max (Opus 4.6 only — returns error on other models) — Cross-system reasoning, irreversible decisions

    "Analyze the microservices event pipeline for race conditions across order-service, inventory-service, and notification-service" — Multi-service hypothesis testing, adversarial thinking.


Assign models to agents based on role, not importance:

Planner (examples/agents/planner.md) — Strategy, read-only exploration

---
name: planner
description: Strategic planning agent — read-only. Use before implementation.
model: opus
tools: Read, Grep, Glob
---

Implementer (examples/agents/implementer.md) — Mechanical execution, bounded scope

---
name: implementer
description: Mechanical execution agent. Scope must be defined explicitly in the task.
model: haiku
tools: Write, Edit, Bash, Read, Grep, Glob
---

Note: Haiku is for mechanical tasks only. If the implementation requires design decisions or complex business logic, use Sonnet — state this in the task prompt.

Architecture Reviewer (examples/agents/architecture-reviewer.md) — Critical design review

---
name: architecture-reviewer
description: Architecture and design review — read-only. Never modifies code.
model: opus
tools: Read, Grep, Glob
---

Pro tip: Add a model reminder to your CLAUDE.md:

# Model reminder
Default: Sonnet. Haiku for mechanical tasks. Opus for architecture and security audits.

| Scenario | Thinking | Reason |
| --- | --- | --- |
| Rename 50 files | OFF | Zero reasoning — pure mechanics |
| Bug spanning 3+ services | ON (high) | Multi-layer hypothesis testing |
| Boilerplate / test generation | OFF | Repetitive pattern, no decisions |
| Architecture migration | ON (max) | Irreversible decisions |
| Direct factual questions | OFF (low) | Immediate answer sufficient |
| Security code review | ON (high) | Adversarial reasoning needed |

Toggle: Alt+T (current session) · /config (permanent)


Understanding how Claude Code “thinks” makes you more effective.

┌─────────────────────────────────────────────────────────┐
│ YOUR PROJECT │
├─────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌───────────┐ │
│ │ Files │ │ Git │ │ Config │ │
│ │ (.ts,.py) │ │ History │ │ Files │ │
│ └─────────────┘ └─────────────┘ └───────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Claude's Understanding │ │
│ │ - File structure & relationships │ │
│ │ - Code patterns & conventions │ │
│ │ - Recent changes (from git) │ │
│ │ - Project rules (from CLAUDE.md) │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
What Claude can see:

  1. File Structure: Claude can navigate and search your files
  2. Code Content: Claude can read and understand code
  3. Git State: Claude sees branches, commits, changes
  4. Project Rules: Claude reads CLAUDE.md for conventions

What Claude can’t see:

  1. Runtime State: Claude can’t see running processes
  2. External Services: Claude can’t access your databases directly
  3. Your Intent: Claude needs clear instructions
  4. Hidden Files: Claude respects .gitignore by default

⚠️ Pattern Amplification: Claude mirrors the patterns it finds. In well-structured codebases, it produces consistent, idiomatic code. In messy codebases without clear abstractions, it perpetuates the mess. If your code lacks good patterns, provide them explicitly in CLAUDE.md or use semantic anchors (Section 2.9).

Think of yourself as a CPU scheduler. Claude Code instances are worker threads. You don’t write the code—you orchestrate the work.

┌─────────────────────────────────────────┐
│ YOU (Main Thread) │
│ ┌────────────────────────────────────┐ │
│ │ Responsibilities: │ │
│ │ • Define tasks and priorities │ │
│ │ • Allocate context budgets │ │
│ │ • Review outputs │ │
│ │ • Make architectural decisions │ │
│ │ • Handle exceptions/escalations │ │
│ └────────────────────────────────────┘ │
│ │ │ │ │
│ ┌────▼───┐ ┌────▼───┐ ┌────▼───┐ │
│ │Worker 1│ │Worker 2│ │Worker 3│ │
│ │(Claude)│ │(Claude)│ │(Claude)│ │
│ │Feature │ │Tests │ │Review │ │
│ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────┘

Implications:

  • Don’t write code when Claude can. Your time is for decisions, not keystrokes.
  • Don’t micromanage. Give clear instructions, then review results.
  • Context-switch deliberately. Like a scheduler, batch similar tasks.
  • Escalate to yourself. When Claude is stuck, step in—then hand back.

This mental model scales: one developer can orchestrate 2-5 Claude instances on independent tasks (see §9.17 Scaling Patterns).

The most common mistake is treating Claude Code like a chatbot — typing ad-hoc requests and hoping for good output. What separates casual usage from production workflows is a shift in thinking:

Chatbot mode: You write good prompts. Context system: You build structured context that makes every prompt better.

“Stop treating it like a chatbot. Give it structured context. CLAUDE.md, hooks, skills, project memory. Changes everything.”Robin Lorenz, AI Engineer (comment)

Claude Code has four layers of persistent context that compound over time:

| Layer | What It Does | Section | When to Set Up |
| --- | --- | --- | --- |
| CLAUDE.md | Persistent rules, conventions, project knowledge | §3.1 | Week 1 |
| Skills | Reusable knowledge modules for consistent workflows | §5 | Week 2 |
| Hooks | Automated guardrails (lint, security, formatting) | §7 | Week 2-3 |
| Project memory | Cross-session decisions and architectural context | §3.1 | Ongoing |

These are not independent features. They are layers of the same system:

  • CLAUDE.md teaches Claude what your project needs (conventions, stack, patterns)
  • Skills teach Claude how to perform specific workflows (review, deploy, test)
  • Hooks enforce guardrails automatically (block secrets, auto-format, run linting)
  • Memory preserves decisions across sessions (architectural choices, resolved tradeoffs)

Before (chatbot mode):

“Use pnpm, not npm. And remember our naming convention is…” (Every session. Every time. Copy-pasting context.)

After (context system):

CLAUDE.md loads conventions automatically. Skills ensure consistent workflows. Hooks enforce quality with zero manual effort. Memory carries decisions forward.

The shift is not about prompting better. It is about building a system where Claude starts every session already knowing what you need.

See also: §9.10 Continuous Improvement Mindset for evolving this system over time. Ready to choose the right mechanism? §2.7 Configuration Decision Guide maps all seven mechanisms with a decision tree.

Good prompt:

The login function in src/auth/login.ts isn't validating email addresses properly.
Plus signs should be allowed but they're being rejected.

Weak prompt:

Login is broken

The more context you provide, the better Claude can help.

Seven configuration mechanisms power Claude Code — knowing which one to reach for saves hours of trial-and-error. This guide gives you the mental shortcuts.

Detailed coverage: §3 Memory & Settings · §4 Agents · §5 Skills · §6 Commands · §7 Hooks · §8 MCP Servers

| Role | Mechanism | One-liner |
| --- | --- | --- |
| What Claude always knows | CLAUDE.md + rules/*.md | Permanent context, loaded every session |
| How Claude executes workflows | Commands (.claude/commands/) | Step-by-step SOPs invoked on demand |
| What Claude can’t bypass | Hooks (.claude/hooks/) | Automatic guardrails, zero token cost |
| What Claude delegates | Agents (.claude/agents/) | Isolated parallel workers with scoped context |
| Shared domain knowledge | Skills (.claude/skills/) | Reusable modules inherited by agents |
| External system access | MCP Servers | APIs, databases, tools via protocol |
| Mechanism | When Loaded | Best For | Token Cost | Reliability |
| --- | --- | --- | --- | --- |
| CLAUDE.md | Every session | Core conventions, identity | Always paid | 100% |
| rules/*.md | Every session | Supplementary standing rules | Always paid | 100% |
| Commands | On invocation | Repeatable multi-step workflows | Low (template) | 100% when invoked |
| Hooks | On events | Guardrails, automation, enforcement | Zero | 100% (shell scripts) |
| Agents | On spawn | Isolated / parallel analysis | High (full context) | 100% when spawned |
| Skills | On invocation | Domain knowledge for agents | Medium | ~56% auto-invocation |
| MCP Servers | Session start | External APIs and tools | Connection overhead | 100% when connected |
Is this needed every session, for every task?
├─ Yes → CLAUDE.md (core) or rules/*.md (supplementary)
└─ No → Should it trigger automatically without user action?
   ├─ Yes → HOOK (event-driven, shell script)
   └─ No → Does it need external system access (API, DB, tool)?
      ├─ Yes → MCP SERVER
      └─ No → Is it a repeatable workflow with defined steps?
         ├─ Yes → COMMAND (.claude/commands/)
         └─ No → Does it need isolated context or parallel work?
            ├─ Yes → AGENT (.claude/agents/)
            └─ No → Is it shared knowledge for multiple agents?
               ├─ Yes → SKILL (.claude/skills/)
               └─ No → Add to CLAUDE.md
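If you like executable summaries, the decision tree above can be encoded as a tiny helper function. This is a toy illustration (not part of Claude Code): answer each question in order with `y` or `n`, stopping at the first `y`.

```shell
# Toy encoding of the decision tree. Args, in order:
#   every-session? auto-trigger? external-access? repeatable? isolated? shared?
pick_mechanism() {
  case "$1$2$3$4$5$6" in
    y*)      echo "CLAUDE.md (core) or rules/*.md" ;;  # needed every session
    ny*)     echo "Hook" ;;                            # fires automatically
    nny*)    echo "MCP server" ;;                      # external API/DB/tool
    nnny*)   echo "Command" ;;                         # repeatable workflow
    nnnny*)  echo "Agent" ;;                           # isolated/parallel work
    nnnnny*) echo "Skill" ;;                           # shared agent knowledge
    *)       echo "Add to CLAUDE.md" ;;                # fallback
  esac
}
```

For example, `pick_mechanism n n y` answers "no, no, yes" and prints `MCP server`.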

Skills are invoked on demand — and agents don’t always invoke them. One evaluation found agents triggered skills in only 56% of cases (Gao, 2026).

Practical implications:

  • Never put critical instructions only in skills — they may be silently skipped
  • Safe pattern: CLAUDE.md states what (always loaded), skill provides how in detail (on demand)
  • For agent workflows, prefer explicit skill invocation in agent frontmatter’s skills: field

See also: §3.4 Precedence Rules for load order and §5.1 Understanding Skills for the full skills decision tree.

| Mistake | Why It Fails | Fix |
| --- | --- | --- |
| Critical rules only in skills | 44% chance of being skipped | Move to CLAUDE.md or rules/*.md |
| Everything in CLAUDE.md | Context window bloat every session | Split: permanent → CLAUDE.md, workflows → commands |
| Hooks for complex logic | Hooks are shell scripts, not Claude | Use hooks for enforcement, commands for multi-step workflows |
| MCP for simple file ops | Unnecessary overhead | Use built-in file tools; MCP for external systems |

XML-structured prompts provide semantic organization for complex requests, helping Claude distinguish between different aspects of your task for clearer understanding and better results.

XML tags act as labeled containers that explicitly separate instruction types, context, examples, constraints, and expected output format.

Basic syntax:

<instruction>
Your main task description here
</instruction>
<context>
Background information, project details, or relevant state
</context>
<code_example>
Reference code or examples to follow
</code_example>
<constraints>
- Limitation 1
- Limitation 2
- Requirement 3
</constraints>
<output>
Expected format or structure of the response
</output>
| Benefit | Description |
| --- | --- |
| Separation of concerns | Different aspects of the task are clearly delineated |
| Reduced ambiguity | Claude knows which information serves what purpose |
| Better context handling | Helps Claude prioritize main instructions over background info |
| Consistent formatting | Easier to template complex requests |
| Multi-faceted requests | Complex tasks with multiple requirements stay organized |

Core Instruction Tags:

<instruction>Main task</instruction> <!-- Primary directive -->
<task>Specific subtask</task> <!-- Individual action item -->
<question>What should I do about X?</question> <!-- Explicit inquiry -->
<goal>Achieve state Y</goal> <!-- Desired outcome -->

Context and Information Tags:

<context>Project uses Next.js 14</context> <!-- Background info -->
<problem>Users report slow page loads</problem> <!-- Issue description -->
<background>Migration from Pages Router</background> <!-- Historical context -->
<state>Currently on feature-branch</state> <!-- Current situation -->

Code and Example Tags:

<code_example>
// Existing pattern to follow
const user = await getUser(id);
</code_example>
<current_code>
// Code that needs modification
</current_code>
<expected_output>
// What the result should look like
</expected_output>

Constraint and Rule Tags:

<constraints>
- Must maintain backward compatibility
- No breaking changes to public API
- Maximum 100ms response time
</constraints>
<requirements>
- TypeScript strict mode
- 100% test coverage
- Accessible (WCAG 2.1 AA)
</requirements>
<avoid>
- Don't use any for types
- Don't modify the database schema
</avoid>

Example 1: Code Review with Context

<instruction>
Review this authentication middleware for security vulnerabilities
</instruction>
<context>
This middleware is used in a financial application handling sensitive user data.
We follow OWASP Top 10 guidelines and need PCI DSS compliance.
</context>
<code_example>
async function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'No token' });
  const decoded = jwt.verify(token, process.env.JWT_SECRET);
  req.user = decoded;
  next();
}
</code_example>
<constraints>
- Point out any security risks
- Suggest PCI DSS compliant alternatives
- Consider timing attacks and token leakage
</constraints>
<output>
Provide:
1. List of security issues found
2. Severity rating for each (Critical/High/Medium/Low)
3. Specific code fixes with examples
4. Additional security hardening recommendations
</output>

Example 2: Feature Implementation with Examples

<instruction>
Add a rate limiting system to our API endpoints
</instruction>
<context>
Current stack: Express.js + Redis
No rate limiting currently exists
Experiencing API abuse from specific IPs
</context>
<requirements>
- 100 requests per minute per IP for authenticated users
- 20 requests per minute per IP for unauthenticated
- Custom limits for premium users (stored in database)
- Return 429 status with Retry-After header
</requirements>
<code_example>
// Existing middleware pattern we use
app.use(authenticate);
app.use(authorize(['admin', 'user']));
</code_example>
<constraints>
- Must not impact existing API performance
- Redis connection should be reused
- Handle Redis connection failures gracefully
</constraints>
<output>
Provide:
1. Rate limiter middleware implementation
2. Redis configuration
3. Unit tests
4. Documentation for the team
</output>

Example 3: Bug Investigation with State

<task>
Investigate why user sessions are expiring prematurely
</task>
<problem>
Users report being logged out after 5-10 minutes of activity,
but session timeout is configured for 24 hours.
</problem>
<context>
- Next.js 14 App Router with next-auth
- PostgreSQL session store
- Load balanced across 3 servers
- Issue started after deploying v2.3.0 last week
</context>
<state>
Git diff between v2.2.0 (working) and v2.3.0 (broken) shows changes to:
- middleware.ts (session refresh logic)
- auth.config.ts (session strategy)
- database.ts (connection pooling)
</state>
<constraints>
- Don't suggest reverting the deploy
- Production issue, needs quick resolution
- Must maintain session security
</constraints>
<output>
Provide:
1. Root cause hypothesis
2. Files to investigate (in priority order)
3. Debugging commands to run
4. Potential fixes with trade-offs
</output>

Nested Tags for Complex Hierarchy:

<task>
  Refactor authentication system
  <subtask priority="high">
    Update user model
    <constraints>
      - Preserve existing user IDs
      - Add migration for email verification
    </constraints>
  </subtask>
  <subtask priority="medium">
    Implement OAuth providers
    <requirements>
      - Google and GitHub OAuth
      - Reuse existing session logic
    </requirements>
  </subtask>
</task>

Multiple Examples with Labels:

<code_example label="current_implementation">
// Old approach with callback hell
getUser(id, (user) => {
  getOrders(user.id, (orders) => {
    res.json({ user, orders });
  });
});
</code_example>
<code_example label="desired_pattern">
// New async/await pattern
const user = await getUser(id);
const orders = await getOrders(user.id);
res.json({ user, orders });
</code_example>

Conditional Instructions:

<instruction>
Optimize database query performance
</instruction>
<context>
Query currently takes 2.5 seconds for 10,000 records
</context>
<constraints>
  <if condition="PostgreSQL">
    - Use EXPLAIN ANALYZE
    - Consider materialized views
  </if>
  <if condition="MySQL">
    - Use EXPLAIN with query plan analysis
    - Consider query cache
  </if>
</constraints>
| Scenario | Recommended? | Why |
| --- | --- | --- |
| Simple one-liner requests | ❌ No | Overhead outweighs benefit |
| Multi-step feature implementation | ✅ Yes | Separates goals, constraints, examples |
| Bug investigation with context | ✅ Yes | Distinguishes symptoms from environment |
| Code review with specific criteria | ✅ Yes | Clear separation of code, context, requirements |
| Architecture planning | ✅ Yes | Organizes goals, constraints, trade-offs |
| Quick typo fix | ❌ No | Unnecessary complexity |

Do’s:

  • ✅ Use descriptive tag names that clarify purpose
  • ✅ Keep tags consistent across similar requests
  • ✅ Combine with CLAUDE.md for project-specific tag conventions
  • ✅ Nest tags logically when representing hierarchy
  • ✅ Use tags to separate “what” from “why” from “how”

Don’ts:

  • ❌ Over-structure simple requests (adds noise)
  • ❌ Mix tag purposes (e.g., constraints inside code examples)
  • ❌ Use generic tags (<tag>, <content>) without clear meaning
  • ❌ Nest too deeply (>3 levels becomes hard to read)

You can standardize XML tag usage in your project’s CLAUDE.md:

# XML Prompt Conventions
When making complex requests, use this structure:
<instruction>Main task</instruction>
<context>
Project context and state
</context>
<code_example>
Reference implementations
</code_example>
<constraints>
Technical and business requirements
</constraints>
<output>
Expected deliverables
</output>
## Project-Specific Tags
- `<api_design>` - API endpoint design specifications
- `<accessibility>` - WCAG requirements and ARIA considerations
- `<performance>` - Performance budgets and optimization goals

XML + Plan Mode:

<instruction>Plan the migration from REST to GraphQL</instruction>
<context>
Currently 47 REST endpoints serving mobile and web clients
</context>
<constraints>
- Must maintain REST endpoints during transition (6-month overlap)
- Mobile app can't be force-updated immediately
</constraints>
<output>
Multi-phase migration plan with rollback strategy
</output>

Then use /plan to explore read-only before implementation.

XML + Cost Awareness:

For large requests, structure with XML to help Claude understand scope and estimate token usage:

<instruction>Analyze all TypeScript files for unused imports</instruction>
<scope>
src/ directory (~200 files)
</scope>
<output_format>
Summary report only (don't list every file)
</output_format>

This helps Claude optimize the analysis approach and reduce token consumption.

Create reusable templates in claudedocs/templates/:

claudedocs/templates/code-review.xml:

<instruction>
Review the following code for quality and best practices
</instruction>
<context>
[Describe the component's purpose and architecture context]
</context>
<code_example>
[Paste code here]
</code_example>
<focus_areas>
- Security vulnerabilities
- Performance bottlenecks
- Maintainability issues
- Test coverage gaps
</focus_areas>
<output>
1. Issues found (categorized by severity)
2. Specific recommendations with code examples
3. Priority order for fixes
</output>

Usage:

cat claudedocs/templates/code-review.xml | \
sed 's/\[Paste code here\]/'"$(cat src/auth.ts)"'/' | \
claude -p "Process this review request"
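One caveat with the `sed` substitution: it breaks whenever the pasted code contains `/`, `&`, or newlines. A newline-safe variant is sketched below; the `fill_template` name is ours, and the final `claude` invocation is shown commented so the function stands alone:

```shell
# Replace the [Paste code here] placeholder line with a file's full contents.
# awk reads the code file directly, so no shell escaping of the code is needed.
fill_template() {   # usage: fill_template <template> <code-file>
  awk -v codefile="$2" '
    /\[Paste code here\]/ {                  # swap placeholder for file body
      while ((getline line < codefile) > 0) print line
      close(codefile); next
    }
    { print }
  ' "$1"
}
# fill_template claudedocs/templates/code-review.xml src/auth.ts \
#   | claude -p "Process this review request"
```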

Token overhead: XML tags consume tokens. For simple requests, natural language is more efficient.

Not required: Claude understands natural language perfectly well. Use XML when structure genuinely helps.

Consistency matters: If you use XML tags, be consistent. Mixing styles within a session can confuse context.

Learning curve: Team members need to understand the tag system. Document your conventions in CLAUDE.md.

💡 Pro tip: Start with natural language prompts. Introduce XML structure when:

  • Requests have 3+ distinct aspects (instruction + context + constraints)
  • Ambiguity causes Claude to misunderstand your intent
  • Creating reusable prompt templates
  • Working with junior developers who need structured communication patterns

Source: DeepTo Claude Code Guide - XML-Structured Prompts

The Claude Code team internally treats prompts as challenges to a peer, not instructions to an assistant. This subtle shift produces higher-quality outputs because it forces Claude to prove its reasoning rather than simply comply.

Three challenge patterns from the team:

1. The Gatekeeper — Force Claude to defend its work before shipping:

"Grill me on these changes and don't make a PR until I pass your test"

Claude reviews your diff, asks pointed questions about edge cases, and only proceeds when satisfied. This catches issues that passive review misses.

2. The Proof Demand — Require evidence, not assertions:

"Prove to me this works — show me the diff in behavior between main and this branch"

Claude runs both branches, compares outputs, and presents concrete evidence. Eliminates the “trust me, it works” failure mode.

3. The Reset — After a mediocre first attempt, invoke full-context rewrite:

"Knowing everything you know now, scrap this and implement the elegant solution"

This forces a substantive second attempt with accumulated context rather than incremental patches on a weak foundation. The key insight: Claude’s second attempt with full context consistently outperforms iterative fixes.

Why this works: Provocation triggers deeper reasoning paths than polite requests. When Claude must convince rather than comply, it activates more thorough analysis and catches its own shortcuts.

Source: 10 Tips from Inside the Claude Code Team (Boris Cherny thread, Feb 2026)

LLMs are statistical pattern matchers trained on massive text corpora. Using precise technical vocabulary helps Claude activate the right patterns in its training data, leading to higher-quality outputs.

When you say “clean code”, Claude might generate any of dozens of interpretations. But when you say “SOLID principles with dependency injection following Clean Architecture layers”, you anchor Claude to a specific, well-documented pattern from its training.

Key insight: Technical terms act as GPS coordinates into Claude’s knowledge. The more precise, the better the navigation.

| Vague Term | Semantic Anchor | Why It Helps |
| --- | --- | --- |
| "error handling" | "Railway Oriented Programming with Either/Result monad" | Activates functional error patterns |
| "clean code" | "SOLID principles, especially SRP and DIP" | Targets specific design principles |
| "good tests" | "TDD London School with outside-in approach" | Specifies test methodology |
| "good architecture" | "Hexagonal Architecture (Ports & Adapters)" | Names a concrete pattern |
| "readable code" | "Screaming Architecture with intention-revealing names" | Triggers specific naming conventions |
| "scalable design" | "CQRS with Event Sourcing" | Activates distributed patterns |
| "documentation" | "arc42 template structure" | Specifies documentation framework |
| "requirements" | "EARS syntax (Easy Approach to Requirements Syntax)" | Targets requirement format |
| "API design" | "REST Level 3 with HATEOAS" | Specifies maturity level |
| "security" | "OWASP Top 10 mitigations" | Activates security knowledge |

Add semantic anchors to your project instructions:

# Architecture Principles
Follow these patterns:
- **Architecture**: Hexagonal Architecture (Ports & Adapters) with clear domain boundaries
- **Error handling**: Railway Oriented Programming - never throw, return Result<T, E>
- **Testing**: TDD London School - mock collaborators, test behaviors not implementations
- **Documentation**: ADR (Architecture Decision Records) for significant choices

Semantic anchors work powerfully with XML-structured prompts (Section 2.8):

<instruction>
Refactor the user service following Domain-Driven Design (Evans)
</instruction>
<constraints>
- Apply Hexagonal Architecture (Ports & Adapters)
- Use Repository pattern for persistence
- Implement Railway Oriented Programming for error handling
- Follow CQRS for read/write separation
</constraints>
<quality_criteria>
- Screaming Architecture: package structure reveals intent
- Single Responsibility Principle per class
- Dependency Inversion: depend on abstractions
</quality_criteria>

Testing:

  • TDD London School (mockist) vs Chicago School (classicist)
  • Property-Based Testing (QuickCheck-style)
  • Mutation Testing (PIT, Stryker)
  • BDD Gherkin syntax (Given/When/Then)

Architecture:

  • Hexagonal Architecture (Ports & Adapters)
  • Clean Architecture (Onion layers)
  • CQRS + Event Sourcing
  • C4 Model (Context, Container, Component, Code)

Design Patterns:

  • Gang of Four patterns (specify: Strategy, Factory, Observer…)
  • Domain-Driven Design tactical patterns (Aggregate, Repository, Domain Event)
  • Functional patterns (Monad, Functor, Railway)

Requirements:

  • EARS (Easy Approach to Requirements Syntax)
  • User Story Mapping (Jeff Patton)
  • Jobs-to-be-Done framework
  • BDD scenarios

💡 Pro tip: When Claude produces generic code, try adding more specific anchors. “Use clean code” → “Apply Martin Fowler’s Refactoring catalog, specifically Extract Method and Replace Conditional with Polymorphism.”

Full catalog: See examples/semantic-anchors/anchor-catalog.md for a comprehensive reference organized by domain.

Source: Concept by Alexandre Soyer. Original catalog: github.com/LLM-Coding/Semantic-Anchors (Apache-2.0)

Important: Everything you share with Claude Code is sent to Anthropic servers. Understanding this data flow is critical for protecting sensitive information.

When you use Claude Code, the following data leaves your machine:

| Data Type | Example | Risk Level |
| --- | --- | --- |
| Your prompts | "Fix the login bug" | Low |
| Files Claude reads | .env, src/app.ts | High if contains secrets |
| MCP query results | SQL query results with user data | High if production data |
| Command outputs | env \| grep API output | Medium |
| Error messages | Stack traces with file paths | Low |

Retention depends on your configuration:

| Configuration | Retention | How to Enable |
| --- | --- | --- |
| Default | 5 years | (default state - training enabled) |
| Opt-out | 30 days | claude.ai/settings |
| Enterprise (ZDR) | 0 days | Enterprise contract |

Immediate action: Disable training data usage to reduce retention from 5 years to 30 days.

1. Block access to sensitive files in .claude/settings.json:

{
  "permissions": {
    "deny": [
      "Read(./.env*)",
      "Edit(./.env*)",
      "Write(./.env*)",
      "Bash(cat .env*)",
      "Bash(head .env*)",
      "Read(./secrets/**)",
      "Read(./**/*.pem)",
      "Read(./**/*.key)",
      "Read(./**/credentials*)"
    ]
  }
}

Warning: permissions.deny has known limitations. See Security Hardening Guide for details.

2. Never connect production databases to MCP servers. Use dev/staging with anonymized data.

3. Use security hooks to block reading of sensitive files (see Section 7.4).

Full guide: For complete privacy documentation including known risks, community incidents, and enterprise considerations, see Data Privacy & Retention Guide.

Reading time: 5 minutes

Goal: Understand the core architecture that powers Claude Code

This section provides a summary of Claude Code’s internal mechanisms. For the complete technical deep-dive with diagrams and source citations, see the Architecture & Internals Guide.

At its core, Claude Code is a simple while loop:

┌─────────────────────────────────────────────────────────────┐
│ MASTER LOOP (simplified) │
├─────────────────────────────────────────────────────────────┤
│ │
│ Your Prompt │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Claude Reasons (no classifier, no router) │ │
│ └───────────────────────┬────────────────────────────┘ │
│ │ │
│ Tool needed? │ │
│ ┌─────┴─────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ Execute Tool Text Response (done) │
│ │ │
│ └──────── Feed result back to Claude │
│ │ │
│ (loop continues) │
│ │
└─────────────────────────────────────────────────────────────┘

Source: Anthropic Engineering Blog

There is no:

  • Intent classifier or task router
  • RAG/embedding pipeline
  • DAG orchestrator
  • Planner/executor split

The model itself decides when to call tools, which tools to call, and when it’s done.

Claude Code has 8 core tools:

| Tool | Purpose |
| --- | --- |
| Bash | Execute shell commands (universal adapter) |
| Read | Read file contents (max 2000 lines) |
| Edit | Modify existing files (diff-based) |
| Write | Create/overwrite files |
| Grep | Search file contents (ripgrep-based) |
| Glob | Find files by pattern |
| Task | Spawn sub-agents (isolated context) |
| TodoWrite | Track progress (legacy, see below) |

Version: Claude Code v2.1.16+ introduced a new task management system

Claude Code provides two task management approaches:

| Feature | TodoWrite (Legacy) | Tasks API (v2.1.16+) |
| --- | --- | --- |
| Persistence | Session memory only | Disk storage (~/.claude/tasks/) |
| Multi-session | ❌ Lost on session end | ✅ Survives across sessions |
| Dependencies | ❌ Manual ordering | ✅ Task blocking (A blocks B) |
| Coordination | Single agent | ✅ Multi-agent broadcast |
| Status tracking | pending/in_progress/completed | pending/in_progress/completed/failed |
| Description visibility | ✅ Always visible | ⚠️ TaskGet only (not in TaskList) |
| Metadata visibility | N/A | ❌ Never visible in outputs |
| Multi-call overhead | None | ⚠️ 1 + N calls for N full tasks |
| Enabled by | Always available | Default since v2.1.19 |

Available tools:

  • TaskCreate - Initialize new tasks with hierarchy and dependencies
  • TaskUpdate - Modify task status, metadata, and dependencies
  • TaskGet - Retrieve individual task details
  • TaskList - List all tasks in current task list

Core capabilities:

  • Persistent storage: Tasks saved to ~/.claude/tasks/<task-list-id>/
  • Multi-session coordination: Share state across multiple Claude sessions
  • Dependency tracking: Tasks can block other tasks (task A blocks task B)
  • Status lifecycle: pending → in_progress → completed/failed
  • Metadata: Attach custom data (priority, estimates, related files, etc.)

Configuration:

Terminal window
# Enable multi-session task persistence
export CLAUDE_CODE_TASK_LIST_ID="project-name"
claude
# Example: Project-specific task list
export CLAUDE_CODE_TASK_LIST_ID="api-v2-auth-refactor"
claude

⚠️ Important: Use repository-specific task list IDs to avoid cross-project contamination. Tasks with the same ID are shared across all sessions using that ID.

Task schema example:

{
  "id": "task-auth-login",
  "title": "Implement login endpoint",
  "description": "POST /auth/login with JWT token generation",
  "status": "in_progress",
  "dependencies": [],
  "metadata": {
    "priority": "high",
    "estimated_duration": "2h",
    "related_files": ["src/auth/login.ts", "src/middleware/auth.ts"]
  }
}
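The pending → in_progress → completed/failed lifecycle described above can be expressed as a small transition guard. This is an illustrative sketch, not part of the actual Tasks API:

```typescript
// Allowed transitions for the documented status lifecycle.
const transitions: Record<string, string[]> = {
  pending: ["in_progress"],
  in_progress: ["completed", "failed"],
  completed: [], // terminal
  failed: [],    // terminal
};

const canTransition = (from: string, to: string): boolean =>
  (transitions[from] ?? []).includes(to);

// canTransition("pending", "in_progress") → true
// canTransition("pending", "completed")  → false (must pass through in_progress)
```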

When to use Tasks API:

  • Projects spanning multiple coding sessions
  • Complex task hierarchies with dependencies
  • Multi-agent coordination scenarios
  • Need to resume work after context compaction

⚠️ Tasks API Limitations (Critical)

Field visibility constraint:

| Tool | Visible Fields | Hidden Fields |
| --- | --- | --- |
| TaskList | id, subject, status, owner, blockedBy | description, activeForm, metadata |
| TaskGet | All fields | - |

Impact:

  • Multi-call overhead: Reviewing 10 task descriptions = 1 TaskList + 10 TaskGet calls (11x overhead)
  • No metadata scanning: Cannot filter/sort by custom fields (priority, estimates, tags) without fetching all tasks individually
  • Session resumption friction: Cannot glance at all task notes to decide where to resume

Cost example:

Terminal window
# Inefficient (if you need descriptions)
TaskList # Returns 10 tasks (no descriptions)
TaskGet(task-1), TaskGet(task-2), ..., TaskGet(task-10) # 10 additional calls
# Total: 11 API calls to review 10 tasks

Workaround patterns:

  1. Hybrid approach (Recommended):

    • Use Tasks API for status tracking and dependency coordination
    • Maintain markdown files in repo for detailed implementation plans
    • Example: docs/plans/auth-refactor.md + Tasks for status
  2. Subject-as-summary pattern:

    • Store critical info in subject field (always visible in TaskList)
    • Keep description for deep context (fetch on-demand with TaskGet)
    • Example subjects: "[P0] Fix login bug (src/auth.ts:45)" vs "Fix bug"
  3. Selective fetching:

    • Use TaskList to identify tasks needing attention (status, blockedBy)
    • Only call TaskGet for tasks you’re actively working on

Source: Community practitioner feedback (Gang Rui, Jan 2026)

Tool: TodoWrite - Creates task lists stored in session memory

Capabilities:

  • Simple task tracking within a single session
  • Status tracking: pending/in_progress/completed
  • Lost when session ends or context is compacted

When to use TodoWrite:

  • Single-session, straightforward implementations
  • Quick fixes or exploratory coding
  • Claude Code < v2.1.16
  • Prefer simplicity over persistence

Migration flag (v2.1.19+):

Terminal window
# Temporarily revert to TodoWrite system
CLAUDE_CODE_ENABLE_TASKS=false claude
# Use new Tasks API (default)
claude

Task hierarchy design:

Project (parent)
└── Feature A (child)
    ├── Component A1 (leaf task)
    │   ├── Implementation
    │   └── Tests (depends on Implementation)
    └── Component A2

Dependency management:

  • Always define dependencies when creating tasks
  • Use task IDs (not titles) for dependency references
  • Verify dependencies with TaskGet before execution

Status transitions:

  • Mark in_progress when starting work (prevents parallel execution)
  • Update frequently for visibility
  • Only mark completed when fully accomplished (tests passing, validated)
  • Use failed status with error metadata for debugging

Metadata conventions:

{
  "priority": "high|medium|low",
  "estimated_duration": "2h",
  "related_files": ["path/to/file.ts"],
  "related_issue": "https://github.com/org/repo/issues/123",
  "type": "feature|bugfix|refactor|test"
}
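If you enforce these conventions in tooling, an illustrative TypeScript type for the shape above (field names follow this guide's conventions, not an official schema):

```typescript
// Mirrors the metadata conventions above; all names are this
// guide's conventions, not an Anthropic-defined schema.
type TaskMetadata = {
  priority: "high" | "medium" | "low";
  estimated_duration: string;        // e.g. "2h"
  related_files: string[];
  related_issue?: string;            // optional tracker link
  type: "feature" | "bugfix" | "refactor" | "test";
};

const meta: TaskMetadata = {
  priority: "high",
  estimated_duration: "2h",
  related_files: ["src/auth/login.ts"],
  type: "feature",
};
```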

The Diagnostic Principle: When Claude’s task list doesn’t match your intent, the problem isn’t Claude—it’s your instructions.

Task lists act as a mirror for instruction clarity. If you ask Claude to plan a feature and the resulting tasks surprise you, that divergence is diagnostic information:

Your instruction: "Refactor the auth system"
Claude's task list:
- [ ] Read all auth-related files
- [ ] Identify code duplication
- [ ] Extract shared utilities
- [ ] Update imports
- [ ] Run tests
Your reaction: "That's not what I meant—I wanted to switch from session to JWT"
Diagnosis: Your instruction was ambiguous. "Refactor" ≠ "replace".

Divergence patterns and what they reveal:

| Divergence Type | What It Means | Fix |
| --- | --- | --- |
| Tasks too broad | Instructions lack specificity | Add WHAT, WHERE, HOW, VERIFY |
| Tasks too narrow | Instructions too detailed, missing big picture | State the goal, not just the steps |
| Wrong priorities | Context missing about what matters | Add constraints and priorities |
| Missing tasks | Implicit knowledge not shared | Make assumptions explicit in prompt |
| Extra tasks | Claude inferred requirements you didn't intend | Add explicit scope boundaries |

Using task divergence as a workflow:

## Step 1: Seed with loose instruction
User: "Improve the checkout flow"
## Step 2: Review Claude's task list (don't execute yet)
Claude generates: [task list]
## Step 3: Compare against your mental model
- Missing: payment retry logic? → Add to instructions
- Unexpected: UI redesign? → Clarify scope (backend only)
- Wrong order: tests last? → Specify TDD approach
## Step 4: Refine and re-plan
User: "Actually, here's what I need: [refined instruction with specifics]"

Pro tip: Run TaskList after initial planning as a sanity check before execution. If more than 30% of tasks surprise you, your prompt needs work. Iterate on the prompt, not the tasks.

→ See: Task Management Workflow for:

  • Task planning phase (decomposition, hierarchy design)
  • Task execution patterns
  • Session management and resumption
  • Integration with TDD and Plan-Driven workflows
  • TodoWrite migration guide
  • Patterns, anti-patterns, and troubleshooting

Claude Code operates within a 200K token context window (a 1M-token beta is available via API; see the 200K vs 1M comparison):

| Component | Approximate Size |
| --- | --- |
| System prompt | 5-15K tokens |
| CLAUDE.md files | 1-10K tokens |
| Conversation history | Variable |
| Tool results | Variable |
| Reserved for response | 40-45K tokens |

When context fills up (~75% in VS Code, ~95% in CLI), older content is automatically summarized. However, research shows this degrades quality (a 50-70% performance drop on complex tasks). Use /compact proactively at logical breakpoints, or trigger a session handoff at 85% to preserve intent over compressed history. See Session Handoffs and the Auto-Compaction Research.
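The thresholds this guide recommends (compact around 70%, clear around 90%) can be written as a tiny decision helper. This is illustrative only; the percentages are this guide's recommendations, not enforced by the tool:

```typescript
// Decide the next context action from token usage, per the guide's zones.
function contextAction(
  usedTokens: number,
  windowTokens = 200_000,
): "continue" | "/compact" | "/clear" {
  const pct = usedTokens / windowTokens;
  if (pct >= 0.9) return "/clear";   // 90%+: fresh start required
  if (pct >= 0.7) return "/compact"; // 70%+: compact now
  return "continue";                 // below 70%: work freely
}
```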

The Task tool spawns sub-agents with:

  • Their own fresh context window
  • Access to the same tools (except Task itself)
  • Maximum depth of 1 (cannot spawn sub-sub-agents)
  • Only their summary text returns to the main context

This prevents context pollution during exploratory tasks.

Status: Partially feature-flagged, progressive rollout in progress.

TeammateTool enables multi-agent orchestration with persistent communication between agents. Unlike standard sub-agents that work in isolation, teammates can coordinate through structured messaging.

Core Capabilities:

| Operation | Purpose |
| --- | --- |
| spawnTeam | Create a named team of agents |
| discoverTeams | List available teams |
| requestJoin | Agent requests to join a team |
| approveJoin | Team leader approves join requests |
| Messaging | JSON-based inter-agent communication |

Execution Backends (auto-detected):

  • In-process: Async tasks in same Node.js process (fastest)
  • tmux: Persistent terminal sessions (survives disconnects)
  • iTerm2: Visual split panes (macOS only)

Patterns:

Parallel Specialists Pattern:
Leader spawns 3 teammates → Each reviews different aspect (security, perf, architecture)
→ Teammates work concurrently → Report back to leader → Leader synthesizes

Swarm Pattern:
Leader creates shared task queue → Teammates self-organize and claim tasks
→ Independent execution → Async updates to shared state

Limitations:

  • 5-minute heartbeat timeout before auto-removal
  • Cannot cleanup teams while teammates are active
  • Feature flags not officially documented (community-discovered)
  • No official Anthropic support for experimental features

When to Use:

  • Large codebases requiring parallel analysis (4+ aspects)
  • Long-running workflows with independent sub-tasks
  • Code reviews with multiple specialized concerns

When NOT to Use:

  • Simple tasks (overhead not justified)
  • Sequential dependencies (standard sub-agents sufficient)
  • Production-critical workflows (experimental = unstable)


⚠️ Note: This is an experimental feature. Capabilities may change or be removed in future releases. Always verify current behavior with official documentation.

Agent Anti-Patterns: Roles vs Context Control

“Subagents are not for anthropomorphizing roles, they are for controlling context” - Dex Horthy

Common Mistake: Creating agents as if building a human team with job titles.

Wrong (Anthropomorphizing):

- Frontend Agent (role: UI developer)
- Backend Agent (role: API engineer)
- QA Agent (role: tester)
- Security Agent (role: security expert)

Why this fails: Agents aren’t humans with expertise areas. They’re context isolation tools for computational efficiency.

Right (Context Control):

- Agent for isolated dependency analysis (scope: package.json + lock files only)
- Agent for parallel file processing (scope: batch edits without main context pollution)
- Agent for fresh security audit (scope: security-focused analysis without prior assumptions)
- Agent for independent module testing (scope: test execution without interfering with main workflow)

Key differences:

| Anthropomorphizing (Wrong) | Context Control (Right) |
| --- | --- |
| "Security expert agent" | "Security audit with isolated context" |
| "Frontend developer agent" | "UI component analysis (scope: src/components/ only)" |
| "Code reviewer agent" | "PR review without main context pollution" |
| Mimics human team structure | Optimizes computational resources |
| Based on job roles | Based on scope/context boundaries |

When to use agents (good reasons):

  • Isolate context: Prevent pollution of main conversation context
  • Parallel processing: Independent operations that can run concurrently
  • Scope limitation: Restrict analysis to specific files/directories
  • Fresh perspective: Analyze without baggage from previous reasoning
  • Resource optimization: Offload heavy operations to separate context window

When NOT to use agents (bad reasons):

  • ❌ Creating a fake team with job titles
  • ❌ Roleplaying different “expertise” personas
  • ❌ Mimicking human organizational structure
  • ❌ Splitting work by discipline (frontend/backend/QA) instead of by context boundaries

Beyond generic sub-agents, scope-focused orchestration assigns distinct context boundaries to different agents for multi-perspective analysis.

The Pattern: Instead of one agent reviewing everything, spawn scope-isolated agents that each analyze distinct aspects with fresh context:

User: Review the new payment service using scope-focused analysis:
Agent 1 (Security Scope): Analyze authentication, input validation,
injection vectors, secret handling, PCI DSS compliance.
Context: src/payment/, src/auth/, config/security.yml
Agent 2 (Performance Scope): Analyze database queries, N+1 problems,
caching opportunities, response time bottlenecks.
Context: src/payment/repository/, src/database/, slow query logs
Agent 3 (API Design Scope): Analyze error messages, response format
consistency, API discoverability, documentation completeness.
Context: src/payment/api/, docs/api/, tests/integration/
Synthesize all three scoped analyses into a unified review with
prioritized action items.

Implementation with Custom Agents:

.claude/agents/security-audit.md
---
name: security-audit
model: opus
tools: Read, Grep, Glob
---
Analyze code for security issues with isolated context:
- OWASP Top 10 vulnerabilities
- Authentication/authorization flaws
- Input validation gaps
- Secret exposure risks
Scope: Security-focused analysis only. Report findings with severity
ratings (Critical/High/Medium/Low) without considering performance
or UX trade-offs.
.claude/agents/perf-audit.md
---
name: perf-audit
model: sonnet
tools: Read, Grep, Glob, Bash
---
Analyze code for performance bottlenecks with isolated context:
- Database query efficiency (N+1, missing indexes)
- Memory leaks and resource management
- Caching opportunities
- Algorithmic complexity issues
Scope: Performance-focused analysis only. Report findings with estimated
impact (High/Medium/Low) without considering security or maintainability
trade-offs.

When to use scope-focused agents:

  • Analysis requiring 3+ distinct context boundaries (security scope, perf scope, API scope)
  • Competing concerns that benefit from isolated evaluation (performance vs. security vs. DX)
  • Large codebases where full context would pollute analysis of specific aspects

When NOT to use scope-focused agents:

  • Simple reviews (one agent with full context covers all aspects)
  • Time-constrained situations (overhead of synthesis outweighs benefit)
  • Tasks where scopes aren’t genuinely independent (overlapping context needed)

“Do more with less. Smart architecture choices, better training efficiency, and focused problem-solving can compete with raw scale.” — Daniela Amodei, Anthropic President

Claude Code trusts the model’s reasoning instead of building complex orchestration systems. This means:

  • Fewer components = fewer failure modes
  • Model-driven decisions = better generalization
  • Simple loop = easy debugging
| Topic | Where |
| --- | --- |
| Full architecture details | Architecture & Internals Guide |
| Permission system | Section 7 - Hooks |
| MCP integration | Section 8.6 - MCP Security |
| Context management tips | Section 2.2 |

Quick jump: Memory Files (CLAUDE.md) · .claude/ Folder Structure · Settings & Permissions · Precedence Rules