
2. Core Workflow

What you’ll learn: The mental model and critical workflows for Claude Code mastery.

  • Interaction Loop: Describe → Analyze → Review → Accept/Reject cycle
  • Context Management 🔴 CRITICAL: Watch Ctx(u): — /compact at 70%, /clear at 90%
  • Plan Mode: Read-only exploration before making changes
  • Rewind: Undo with Esc×2 or /rewind
  • Mental Model: Claude = expert pair programmer, not autocomplete

Always check context % before starting complex tasks. High context = degraded quality.

Read this section if: You want to avoid the #1 mistake (context overflow).
Skip if: You just need a quick command reference (go to Section 10).


Reading time: 20 minutes

Skill level: Day 1-3

Goal: Understand how Claude Code thinks

Every Claude Code interaction follows this pattern:

┌───────────────────────────────────────────────┐
│               INTERACTION LOOP                │
├───────────────────────────────────────────────┤
│ 1. DESCRIBE ──→ You explain what you need     │
│        ▼                                      │
│ 2. ANALYZE ──→ Claude explores the codebase   │
│        ▼                                      │
│ 3. PROPOSE ──→ Claude suggests changes (diff) │
│        ▼                                      │
│ 4. REVIEW ──→ You read and evaluate           │
│        ▼                                      │
│ 5. DECIDE ──→ Accept / Reject / Modify        │
│        ▼                                      │
│ 6. VERIFY ──→ Run tests, check behavior       │
│        ▼                                      │
│ 7. COMMIT ──→ Save changes (optional)         │
└───────────────────────────────────────────────┘

The loop is designed so that you remain in control. Claude proposes, you decide.

🔴 This is the most important concept in Claude Code.

The zones:

  • 🟢 0-50%: Work freely
  • 🟡 50-75%: Be selective
  • 🔴 75-90%: /compact now
  • ⚫ 90%+: /clear required

When context is high:

  1. /compact (saves context, frees space)
  2. /clear (fresh start, loses history)

Prevention: Load only needed files, compact regularly, commit frequently


Context is Claude’s “working memory” for your conversation. It includes:

  • All messages in the conversation
  • Files Claude has read
  • Command outputs
  • Tool results

Claude has a 200,000 token context window. Think of it like RAM - when it fills up, things slow down or fail.
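A rough way to gauge how much of that window a file will consume is the common ~4 characters per token heuristic (an approximation only; real tokenization varies, and src/auth.ts is just an example path):

```shell
# Estimate a file's context footprint (~4 chars/token is an assumption).
file="${FILE:-src/auth.ts}"            # any file you plan to load
if [ -f "$file" ]; then
  chars=$(wc -c < "$file")
  echo "~$((chars / 4)) tokens (~$((chars / 4 * 100 / 200000))% of a 200K window)"
fi
```

A 2,000-line file easily lands in the thousands of tokens, which is why the cost table later in this section flags large-file reads as expensive.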

The statusline shows your context usage:

Claude Code │ Ctx(u): 45% │ Cost: $0.23 │ Session: 1h 23m
| Metric | Meaning |
| --- | --- |
| Ctx(u): 45% | You’ve used 45% of context |
| Cost: $0.23 | API cost so far |
| Session: 1h 23m | Time elapsed |

The default statusline can be enhanced with more detailed information like git branch, model name, and file changes.

Option 1: ccstatusline (recommended)

Add to ~/.claude/settings.json:

{
  "statusLine": {
    "type": "command",
    "command": "npx -y ccstatusline@latest",
    "padding": 0
  }
}

This displays: Model: Sonnet 4.6 | Ctx: 0 | ⎇ main | (+0,-0) | Cost: $0.27 | Session: 0m | Ctx(u): 0.0%

Option 2: Custom script

Create your own script that:

  1. Reads JSON data from stdin (model, context, cost, git info)
  2. Outputs a single formatted line to stdout
  3. Supports ANSI colors for styling
{
  "statusLine": {
    "type": "command",
    "command": "/path/to/your/statusline-script.sh",
    "padding": 0
  }
}
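A minimal sketch of such a script (requires jq; the JSON field names .model.display_name and .workspace.current_dir are assumptions — verify them for your Claude Code version):

```shell
#!/bin/sh
# Minimal statusline sketch: parse the JSON payload and print one line.
# Field names (.model.display_name, .workspace.current_dir) are assumptions.
statusline() {
  input=$(cat)                                   # JSON from Claude Code on stdin
  model=$(printf '%s' "$input" | jq -r '.model.display_name // "Claude"')
  dir=$(printf '%s' "$input" | jq -r '.workspace.current_dir // "?"')
  printf '%s | %s\n' "$model" "${dir##*/}"
}
echo '{}' | statusline    # demo input; a real script ends with plain `statusline`
```

In production the last line would be just `statusline`, so the function reads the payload Claude Code pipes in.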

Use the /statusline command in Claude Code to auto-generate a starter script.

| Zone | Usage | Action |
| --- | --- | --- |
| 🟢 Green | 0-50% | Work freely |
| 🟡 Yellow | 50-75% | Start being selective |
| 🔴 Red | 75-90% | Use /compact or /clear |
| ⚫ Critical | 90%+ | Must clear or risk errors |

When context gets high:

Option 1: Compact (/compact)

  • Summarizes the conversation
  • Preserves key context
  • Reduces usage by ~50%

Option 2: Clear (/clear)

  • Starts fresh
  • Loses all context
  • Use when changing topics

Option 3: Summarize from here (v2.1.32+)

  • Use /rewind (or Esc + Esc) to open the checkpoint list
  • Select a checkpoint and choose “Summarize from here”
  • Claude summarizes everything from that point forward, keeping earlier context intact
  • Frees space while keeping critical context
  • More precise than full /compact

Option 4: Targeted Approach

  • Be specific in queries
  • Avoid “read the entire file”
  • Use symbol references: “read the calculateTotal function”

When approaching the red zone (75%+), /compact alone may not be enough. You need to actively decide what information to preserve before compacting.

Priority: Keep

| Keep | Why |
| --- | --- |
| CLAUDE.md content | Core instructions must persist |
| Files being actively edited | Current work context |
| Tests for the current component | Validation context |
| Critical decisions made | Architectural choices |
| Error messages being debugged | Problem context |

Priority: Evacuate

| Evacuate | Why |
| --- | --- |
| Files read but no longer relevant | One-time lookups |
| Debug output from resolved issues | Historical clutter |
| Long conversation history | Summarized by /compact |
| Files from completed tasks | No longer needed |
| Large config files | Can be re-read if needed |

Pre-Compact Checklist:

  1. Document critical decisions in CLAUDE.md or a session note
  2. Commit pending changes to git (creates restore point)
  3. Note the current task explicitly (“We’re implementing X”)
  4. Run /compact to summarize and free space
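The first three steps can be scripted; a sketch (the note file path and commit message are examples, not a Claude Code convention):

```shell
# Pre-compact checkpoint: note the current task, then create a git restore point.
mkdir -p claudedocs
echo "$(date +%F): implementing refresh-token flow (Strategy A for auth)" \
  >> claudedocs/session-notes.md
git add -A 2>/dev/null || true
git commit -m "checkpoint before /compact" --quiet 2>/dev/null || true
```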

Pro tip: If you know you’ll need specific information post-compact, tell Claude explicitly: “Before we compact, remember that we decided to use Strategy A for authentication because of X.” Claude will include this in the summary.

Claude Code has three distinct memory systems. Understanding the difference is crucial for effective long-term work:

| Aspect | Session Memory | Auto-Memory (native) | Persistent Memory (Serena) |
| --- | --- | --- | --- |
| Scope | Current conversation only | Across sessions, per-project | Across all sessions |
| Managed by | /compact, /clear | /memory command (automatic) | write_memory() via Serena MCP |
| Lost when | Session ends or /clear | Explicitly deleted via /memory | Explicitly deleted from Serena |
| Requires | Nothing | Nothing (v2.1.59+) | Serena MCP server |
| Use case | Immediate working context | Key decisions, context snippets | Architectural decisions, patterns |

Session Memory (short-term):

  • Everything in your current conversation
  • Files Claude has read, commands run, decisions made
  • Managed with /compact (compress) and /clear (reset)
  • Disappears when you close Claude Code

Auto-Memory (native, v2.1.59+):

  • Built into Claude Code — no MCP server or configuration required
  • Claude automatically saves useful context (decisions, patterns, preferences) to MEMORY.md files
  • Organized per-project: .claude/memory/MEMORY.md or ~/.claude/projects/<path>/memory/MEMORY.md
  • Managed with /memory: view, edit, or delete what’s been saved
  • Survives across sessions automatically

Persistent Memory (long-term, Serena MCP):

  • Requires Serena MCP server installed
  • Explicitly saved with write_memory("key", "value")
  • Survives across sessions
  • Ideal for: architectural decisions, API patterns, coding conventions

Pattern: End-of-Session Save

# Before ending a productive session:
"Save our authentication decision to memory:
- Chose JWT over sessions for scalability
- Token expiry: 15min access, 7d refresh
- Store refresh tokens in httpOnly cookies"
# Claude calls: write_memory("auth_decisions", "...")
# Next session:
"What did we decide about authentication?"
# Claude calls: read_memory("auth_decisions")

When to use which:

  • Session memory: Active problem-solving, debugging, exploration
  • Auto-memory: Decisions and context you want Claude to rediscover next session without manual effort (v2.1.59+)
  • Persistent memory (Serena): Structured key-value store for architectural decisions across many projects
  • CLAUDE.md: Team conventions, project structure (versioned with git)

Research shows LLM performance degrades significantly with accumulated context:

  • 20-30% performance gap between focused and polluted prompts (Chroma, 2025)
  • Degradation starts at ~16K tokens for Claude models
  • Failed attempts, error traces, and iteration history dilute attention

Instead of managing context within a session, you can restart with a fresh session per task while persisting state externally.

Terminal window
# Canonical "Ralph Loop" (Geoffrey Huntley)
while :; do cat TASK.md PROGRESS.md | claude -p ; done

State persists via:

  • TASK.md — Current task definition with acceptance criteria
  • PROGRESS.md — Learnings, completed tasks, blockers
  • Git commits — Each iteration commits atomically

Variant: tasks/lessons.md

A lightweight alternative for interactive sessions (no loop required): after each user correction, Claude updates tasks/lessons.md with the rule to avoid the same mistake. Reviewed at the start of each new session.

tasks/
├── todo.md # Current plan (checkable items)
└── lessons.md # Rules accumulated from corrections

The difference from PROGRESS.md: lessons.md captures behavioral rules (“always diff before marking done”, “never mock without asking”) rather than task state. It compounds over time — the mistake rate drops as the ruleset grows.
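Maintaining the file is deliberately low-tech: one appended rule per correction (the rule text below is an example):

```shell
# Record a behavioral rule after a user correction (tasks/ layout from above).
mkdir -p tasks
echo '- Always diff before marking a task done' >> tasks/lessons.md
tail -n 1 tasks/lessons.md
```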

| Traditional | Fresh Context |
| --- | --- |
| Accumulate in chat history | Reset per task |
| /compact to compress | State in files + git |
| Context bleeds across tasks | Each task gets full attention |

| Situation | Use |
| --- | --- |
| Context 70-90%, staying interactive | /compact |
| Context 90%+, need fresh start | /clear then continue |
| Long autonomous run, task-based | Fresh Context Pattern |
| Overnight/AFK execution | Fresh Context Pattern |

Good fit:

  • Autonomous sessions >1 hour
  • Migrations, large refactorings
  • Tasks with clear success criteria (tests pass, build succeeds)

Poor fit:

  • Interactive exploration
  • Design without clear spec
  • Tasks with slow/ambiguous feedback loops

Variant: Session-per-Concern Pipeline

Instead of looping the same task, dedicate a fresh session to each quality dimension:

  1. Plan session — Architecture, scope, acceptance criteria
  2. Test session — Write unit, integration, and E2E tests first (TDD)
  3. Implement session — Code until all linters and tests pass
  4. Review sessions — Separate sessions for security audit, performance, code review
  5. Repeat — Iterate with scope adjustments as needed

This combines Fresh Context (clean 200K per phase) with OpusPlan (Opus for review/strategy sessions, Sonnet for implementation). Each session generates progress artifacts that feed the next.

Option 1: Manual loop

Terminal window
# Simple fresh-context loop
for i in {1..10}; do
  echo "=== Iteration $i ==="
  claude -p "$(cat TASK.md PROGRESS.md)"
  git diff --stat              # Check progress
  read -p "Continue? (y/n) " -n 1 -r
  [[ ! $REPLY =~ ^[Yy]$ ]] && break
done

Option 2: Script (see examples/scripts/fresh-context-loop.sh)

Terminal window
./fresh-context-loop.sh 10 TASK.md PROGRESS.md

Option 3: External orchestrators

  • AFK CLI — Zero-config orchestration across task sources
TASK.md
## Current Focus
[Single atomic task with clear deliverable]
## Acceptance Criteria
- [ ] Tests pass
- [ ] Build succeeds
- [ ] [Specific verification]
## Context
- Related files: [paths]
- Constraints: [rules]
## Do NOT
- Start other tasks
- Refactor unrelated code

/compact preserves conversation flow. Fresh context maximizes per-task attention at the cost of continuity.

Sources: Chroma Research - Context Rot | Ralph Loop Origin | METR - Long Task Capability | Anthropic - Context Engineering

| Action | Context Cost |
| --- | --- |
| Reading a small file | Low (~500 tokens) |
| Reading a large file | High (~5K+ tokens) |
| Running commands | Medium (~1K tokens) |
| Multi-file search | High (~3K+ tokens) |
| Long conversations | Accumulates |

Learn to recognize when context is running out:

| Symptom | Severity | Action |
| --- | --- | --- |
| Shorter responses than usual | 🟡 Warning | Continue with caution |
| Forgetting CLAUDE.md instructions | 🟠 Serious | Document state, prepare checkpoint |
| Inconsistencies with earlier conversation | 🔴 Critical | New session needed |
| Errors on code already discussed | 🔴 Critical | New session needed |
| "I can't access that file" (when it was read) | 🔴 Critical | New session immediately |

Check your context usage in detail:

/context

Example output:

┌──────────────────────────────────────────────────────────┐
│ CONTEXT USAGE                                   62% used │
├──────────────────────────────────────────────────────────┤
│ System Prompt         ████████░░░░░░░░░░░░░░░░ 12,450 tk │
│ System Tools          ██░░░░░░░░░░░░░░░░░░░░░░  3,200 tk │
│ MCP Tools (5 servers) ████████████░░░░░░░░░░░░ 18,600 tk │
│ Conversation          ████████████████████░░░░ 89,200 tk │
├──────────────────────────────────────────────────────────┤
│ TOTAL                                         123,450 tk │
│ REMAINING                                      76,550 tk │
└──────────────────────────────────────────────────────────┘

💡 The Last 20% Rule: Reserve ~20% of context for:

  • Multi-file operations at end of session
  • Last-minute corrections
  • Generating summary/checkpoint

Claude Code isn’t free - you’re using API credits. Understanding costs helps optimize usage.

The default model depends on your subscription: Max/Team Premium subscribers get Opus 4.6 by default, while Pro/Team Standard subscribers get Sonnet 4.6. If Opus usage hits the plan threshold, Claude Code automatically falls back to Sonnet.

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
| --- | --- | --- | --- | --- |
| Sonnet 4.6 | $3.00 | $15.00 | 200K tokens | Default model (Feb 2026) |
| Sonnet 4.5 | $3.00 | $15.00 | 200K tokens | Legacy (same price) |
| Opus 4.6 (standard) | $5.00 | $25.00 | 200K tokens | Released Feb 2026 |
| Opus 4.6 (1M context beta) | $10.00 | $37.50 | 1M tokens | Requests >200K context |
| Opus 4.6 (fast mode) | $30.00 | $150.00 | 200K tokens | 2.5x faster, 6x price |
| Haiku 4.5 | $0.80 | $4.00 | 200K tokens | Budget option |

Reality check: A typical 1-hour session costs $0.10 - $0.50 depending on usage patterns.

Model deprecations (Feb 2026): claude-3-haiku-20240307 (Claude 3 Haiku) was deprecated on February 19, 2026 with retirement scheduled for April 20, 2026. If your CLAUDE.md, agent definitions, or scripts hardcode this model ID, migrate to claude-haiku-4-5-20251001 (Haiku 4.5) before April 2026. Source: platform.claude.com/docs/model-deprecations
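A quick way to audit a repository for the retired ID (a sketch; extend the --include list to whatever file types your configs use):

```shell
# Scan for the deprecated model ID named in the deprecation notice above.
if grep -rn "claude-3-haiku-20240307" . \
    --include='*.md' --include='*.json' --include='*.sh' --include='*.yml'; then
  echo "Found hardcoded references: migrate to claude-haiku-4-5-20251001"
else
  echo "No deprecated model IDs found"
fi
```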

200K vs 1M Context: Performance, Cost & Use Cases


The 1M context window (beta, API + usage tier 4 required) is a significant capability jump — but community feedback consistently frames it as a niche premium tool, not a default.

Retrieval accuracy at scale (MRCR v2 8-needle 1M variant)

| Model | 256K accuracy | 1M accuracy | Source |
| --- | --- | --- | --- |
| Opus 4.6 | 93% | 76% | Anthropic blog + independent analysis (Feb 2026) |
| Sonnet 4.5 | | 18.5% | Anthropic blog (Feb 2026) |
| Sonnet 4.6 | Not yet published | Not yet published | |
The benchmark is the “8-needle 1M variant” — finding 8 specific facts in a 1M-token document. Opus 4.6 drops from 93% to 76% when scaling from 256K to 1M; Sonnet 4.5 collapses to 18.5%. Community validation: a developer loaded ~733K tokens (4 Harry Potter books) and Opus 4.6 retrieved 49/50 documented spells in a single prompt (HN, Feb 2026). Sonnet 4.6 MRCR not yet published, but community reports suggest it “struggles with following specific instructions and retrieving precise information” at full 1M context.

Cost per session (approximate)

Above 200K input tokens, all tokens in the request are charged at premium rates — not just the excess. Applies to both Sonnet 4.6 and Opus 4.6.

| Session type | ~Tokens in | ~Tokens out | Sonnet 4.6 | Opus 4.6 |
| --- | --- | --- | --- | --- |
| Bug fix / PR review (≤200K) | 50K | 5K | ~$0.23 | ~$0.38 |
| Module refactoring (≤200K) | 150K | 20K | ~$0.75 | ~$1.25 |
| Full service analysis (>200K, 1M beta) | 500K | 50K | ~$4.13 | ~$6.88 |
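These figures follow directly from the premium per-MTok rates quoted in this section (the arithmetic is the point, not the exact prices):

```shell
# Above 200K input, every token in the request is billed at premium rates.
awk 'BEGIN {
  in_tok = 500000; out_tok = 50000               # "full service analysis" session
  sonnet = in_tok/1e6 * 6.00  + out_tok/1e6 * 22.50
  opus   = in_tok/1e6 * 10.00 + out_tok/1e6 * 37.50
  printf "Sonnet 4.6: $%.3f  Opus 4.6: $%.3f\n", sonnet, opus
}'
# → Sonnet 4.6: $4.125  Opus 4.6: $6.875
```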

For comparison: Gemini 1.5 Pro offers a 2M context window at $3.50/$10.50/MTok — significantly cheaper for pure long-context RAG. Community advice: use Gemini for large-document RAG, Claude for reasoning quality and agentic workflows.

When to use which

| Scenario | Recommendation |
| --- | --- |
| Bug fix, PR review, daily coding | Sonnet 4.6 @ 200K — fast and cheap |
| Full-repo audit, entire codebase load | Opus 4.6 @ 1M — worth the cost for precision |
| Cross-module refactoring | Sonnet 4.6 @ 1M — but weigh cost vs. chunking + RAG |
| Architecture analysis, Agent Teams | Opus 4.6 @ 1M — strongest retrieval at scale |
| Large-document RAG (PDFs, legal, books) | Consider Gemini 1.5 Pro — cheaper at this scale |

Key facts

  • Opus 4.6 max output: 128K tokens; Sonnet 4.6 max output: 64K tokens
  • 1M context ≈ 30,000 lines of code / 750,000 words
  • 1M context is beta — requires anthropic-beta: context-1m-2025-08-07 header, usage tier 4 or custom rate limits
  • Above 200K input tokens: Sonnet 4.6 doubles to $6/$22.50/MTok; Opus 4.6 doubles to $10/$37.50/MTok
  • If input stays ≤200K, standard pricing applies even with the beta flag enabled
  • Practical workaround: check context at ~70% and open a new session rather than hitting compaction (HN pattern)
  • Community consensus: 200K + RAG is the default; 1M Opus is reserved for cases where loading everything at once is genuinely necessary
| Action | Tokens Consumed | Estimated Cost |
| --- | --- | --- |
| Read a 100-line file | ~500 | $0.0015 |
| Read 10 files (1000 lines) | ~5,000 | $0.015 |
| Long conversation (20 messages) | ~30,000 | $0.090 |
| MCP tool call (Serena, Context7) | ~2,000 | $0.006 |
| Running tests (with output) | ~3,000-10,000 | $0.009-$0.030 |
| Code generation (100 lines) | ~2,000 output | $0.030 |

The expensive operations:

  1. Reading entire large files - 2000+ line files add up fast
  2. Multiple MCP server calls - Each server adds ~2K tokens overhead
  3. Long conversations without /compact - Context accumulates
  4. Repeated trial and error - Each iteration costs

Strategy 1: Be specific in queries

Terminal window
# ❌ Expensive - reads entire file
"Check auth.ts for issues"
# ~5K tokens if file is large
# ✅ Cheaper - targets specific location
"Check the login function in auth.ts:45-60"
# ~500 tokens

Strategy 2: Use /compact proactively

Terminal window
# Without /compact - conversation grows
Context: 10% → 30% → 50% → 70% → 90%
# Cost per message increases as context grows

# With /compact at 70%
Context: 10% → 30% → 50% → 70% → [/compact] → 30% → 50%
# Frees significant context space for subsequent messages

Strategy 3: Choose the right model

Terminal window
# Use Haiku for simple tasks (4x cheaper input, 3.75x cheaper output)
claude --model haiku "Fix this typo in README.md"
# Use Sonnet (default) for standard work
claude "Refactor this module"
# Use Opus only for critical/complex tasks
claude --model opus "Design the entire authentication system"

Strategy 4: Limit MCP servers

// ❌ Expensive - 5 MCP servers loaded
{
  "mcpServers": {
    "serena": {...},
    "context7": {...},
    "sequential": {...},
    "playwright": {...},
    "postgres": {...}
  }
}
// ~10K tokens overhead per session

// ✅ Cheaper - load only what you need
{
  "mcpServers": {
    "serena": {...} // Only for this project
  }
}
// ~2K tokens overhead

Strategy 5: Batch operations

Terminal window
# ❌ Expensive - 5 separate prompts
"Read file1.ts"
"Read file2.ts"
"Read file3.ts"
"Read file4.ts"
"Read file5.ts"
# ✅ Cheaper - single batched request
"Read file1.ts, file2.ts, file3.ts, file4.ts, file5.ts and analyze them together"
# Shared context, single response

Strategy 6: Use prompt caching for repeated context (API)

If you call the Anthropic API directly (e.g., for custom agents or pipelines), prompt caching cuts costs by up to 90% on repeated prefixes.

# Mark stable sections with cache_control
response = client.messages.create(
model="claude-sonnet-4-6-20250514",
max_tokens=1024,
system=[
{
"type": "text",
"text": "<your large system prompt / codebase context>",
"cache_control": {"type": "ephemeral"} # Cache this prefix
}
],
messages=[{"role": "user", "content": "Fix the bug in auth.ts"}]
)

Prompt caching economics:

| Operation | Cost multiplier | TTL |
| --- | --- | --- |
| Cache write | 1.25x base price | 5 minutes (default) |
| Cache write (extended) | 2x base price | 1 hour |
| Cache read (hit) | 0.1x base price | |
| Latency reduction | Up to 85% for long prompts | |

Break-even: 2 cache hits with 5-minute TTL. After that, pure savings.
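The break-even claim can be checked with those multipliers (base price = 1.0x):

```shell
# Relative cost of sending the same prefix n times, with vs. without caching.
awk 'BEGIN {
  n = 3                                # 1 cache write + 2 cache hits
  no_cache = n * 1.0                   # every send at base price
  cached   = 1.25 + (n - 1) * 0.1      # one write, then hits at 0.1x
  printf "no-cache: %.2fx  cached: %.2fx\n", no_cache, cached
}'
# → no-cache: 3.00x  cached: 1.45x
```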

Rules:

  • Max 4 cache breakpoints per request
  • Cache key = exact prefix match (single character change = cache miss)
  • Place breakpoints after large stable sections: system prompt, tool definitions, codebase context
  • For Claude Code itself: caching is handled automatically by the CLI — this applies to API-based workflows you build on top of Claude

Docs: prompt caching

Real-time tracking:

The status line shows current session cost:

Claude Code │ Ctx(u): 45% │ Cost: $0.23 │ Session: 1h 23m
                             ↑ Current session cost

Advanced tracking with ccusage:

The ccusage CLI tool provides detailed cost analytics beyond the /cost command:

Terminal window
ccusage # Overview all periods
ccusage --today # Today's costs
ccusage --month # Current month
ccusage --session # Active session breakdown
ccusage --model-breakdown # Cost by model (Sonnet/Opus/Haiku)

Example output:

┌──────────────────────────────────────────────────────┐
│ USAGE SUMMARY - January 2026                         │
├──────────────────────────────────────────────────────┤
│ Today          $2.34   (12 sessions)                 │
│ This week      $8.91   (47 sessions)                 │
│ This month    $23.45  (156 sessions)                 │
├──────────────────────────────────────────────────────┤
│ MODEL BREAKDOWN                                      │
│ Sonnet 3.5    85%    $19.93                          │
│ Opus 4.6      12%     $2.81                          │
│ Haiku 3.5      3%     $0.71                          │
└──────────────────────────────────────────────────────┘

Why use ccusage over /cost?

  • Historical trends: Track usage patterns over days/weeks/months
  • Model breakdown: See which model tier drives costs
  • Budget planning: Set monthly spending targets
  • Team analytics: Aggregate costs across developers

For a full inventory of community cost trackers, session viewers, config managers, and alternative UIs, see Third-Party Tools.

Monthly tracking:

Check your Anthropic Console for detailed usage.

Cost budgeting:

Terminal window
# Set a mental budget per session
- Quick task (5-10 min): $0.05-$0.10
- Feature work (1-2 hours): $0.20-$0.50
- Deep refactor (half day): $1.00-$2.00
# If you're consistently over budget:
1. Use /compact more often
2. Be more specific in queries
3. Consider using Haiku for simpler tasks
4. Reduce MCP servers

Perspective on costs: If Claude Code saves you meaningful time on a task, the API cost is usually negligible compared to your hourly rate. Don’t over-optimize for token costs at the expense of productivity.

When to optimize:

  • ✅ You’re on a tight budget (student, hobbyist)
  • ✅ High-volume usage (>4 hours/day)
  • ✅ Team usage (5+ developers)

When NOT to optimize:

  • ❌ Your time is more expensive than API costs
  • ❌ You’re spending more time optimizing than the savings
  • ❌ Optimization hurts productivity (being too restrictive)

For solo developers on a budget:

1. Start with Haiku for exploration/planning
2. Switch to Sonnet for implementation
3. Use /compact aggressively (every 50-60% context)
4. Limit to 1-2 MCP servers
5. Be specific in all queries
6. Batch operations when possible
Monthly cost estimate: $5-$15 for 20-30 hours

For professional developers:

1. Use Sonnet as default (optimal balance)
2. Use /compact when needed (70%+ context)
3. Use full MCP setup (productivity matters)
4. Don't micro-optimize queries
5. Use Opus for critical architectural decisions
Monthly cost estimate: $20-$50 for 40-80 hours

For teams:

1. Shared MCP infrastructure (Context7, Serena)
2. Standardized CLAUDE.md to avoid repeated explanations
3. Agent library to avoid rebuilding patterns
4. CI/CD integration for automation
5. Track costs per developer in Anthropic Console
Monthly cost estimate: $50-$200 for 5-10 developers
| Indicator | Cause | Fix |
| --- | --- | --- |
| Sessions consistently >$1 | Not using /compact | Set reminder at 70% context |
| Cost per message >$0.05 | Context bloat | Start fresh with /clear |
| >$5/day for hobby project | Over-using or inefficient queries | Review query specificity |
| Haiku failing simple tasks | Using wrong model tier | Use Sonnet for anything non-trivial |

Note: Anthropic’s plans evolve frequently. Always verify current pricing and limits at claude.com/pricing.

How Subscription Limits Work

Unlike API usage (pay-per-token), subscriptions use a hybrid model that’s deliberately opaque:

| Concept | Description |
| --- | --- |
| 5-hour rolling window | Primary limit; resets when you send the next message after 5 hours lapse |
| Weekly aggregate cap | Secondary limit; resets every 7 days. Both apply simultaneously |
| Hybrid counting | Advertised as “messages” but actual capacity is token-based, varying by code complexity, file size, and context |
| Model weighting | Opus consumes 8-10× more quota than Sonnet for equivalent work |

Approximate Token Budgets by Plan (Jan 2026, community-verified)

| Plan | 5-Hour Token Budget | Claude Code prompts/5h | Weekly Sonnet Hours | Weekly Opus Hours | Claude Code Access |
| --- | --- | --- | --- | --- | --- |
| Free | 0 | 0 | 0 | 0 | ❌ None |
| Pro ($20/mo) | ~44,000 tokens | ~10-40 prompts | 40-80 hours | N/A (Sonnet only) | ✅ Limited |
| Max 5x ($100/mo) | ~88,000-220,000 tokens | ~50-200 prompts | 140-280 hours | 15-35 hours | ✅ Full |
| Max 20x ($200/mo) | ~220,000+ tokens | ~200-800 prompts | 240-480 hours | 24-40 hours | ✅ Full |

Warning: These are community-measured estimates. Anthropic does not publish exact token limits, and limits have been reduced without announcement (notably Oct 2025). The 8-10× Opus/Sonnet ratio means Max 20x users get only ~24-40 Opus hours weekly despite paying $200/month. “Prompts/5h” is a rough practical translation of the token budget — actual capacity varies significantly with task complexity, context size, and sub-agent usage. Monthly cap: ~50 active 5-hour windows across all plans.

Why “Hours” Are Misleading

The term “hours of Sonnet 4” refers to elapsed wall-clock time during active processing, not calendar hours. This is not directly convertible to tokens without knowing:

  • Code complexity (larger files = higher per-token overhead)
  • Tool usage (Bash execution adds ~245 input tokens per call; text editor adds ~700)
  • Context re-reads and caching misses

Tier-Specific Strategies

| If you have… | Recommended approach |
| --- | --- |
| Pro plan | Sonnet only; batch sessions, avoid context bloat |
| Limited Opus quota | OpusPlan essential: Opus for planning, Sonnet for execution |
| Max 5x | Sonnet default, Opus only for architecture/complex debugging |
| Max 20x | More Opus freedom, but still monitor weekly usage (24-40h goes fast) |

The Pro User Pattern (validated by community):

1. Opus → Create detailed plan (high-quality thinking)
2. Sonnet/Haiku → Execute the plan (cost-effective implementation)
3. Result: Best reasoning where it matters, lower cost overall

This is exactly what OpusPlan mode does automatically (see Section 2.3).

Monitoring Your Usage

Terminal window
/status # Shows current session: cost, context %, model

Anthropic provides no in-app real-time usage metrics. Community tools like ccusage help track token consumption across sessions.

For subscription usage history: Check your Anthropic Console or Claude.ai settings.

Historical Note: In October 2025, users reported significant undocumented limit reductions coinciding with Sonnet 4.5’s release. Pro users who previously sustained 40-80 Sonnet hours weekly reported hitting limits after only 6-8 hours. Anthropic acknowledged the limits but did not explain the discrepancy.

Definition: When information from one task contaminates another.

Pattern 1: Style Bleeding

Task 1: "Create a blue button"
Claude: [Creates blue button]
Task 2: "Create a form"
Claude: [Creates form... with all buttons blue!]
↑ The "blue" bled into the new task
Solution: Use explicit boundaries
"---NEW TASK---
Create a form. Use default design system colors."

Pattern 2: Instruction Contamination

Instruction 1: "Always use arrow functions"
Instruction 2: "Follow project conventions" (which uses function)
Claude: [Paralyzed, alternating between styles]
Solution: Clarify priority
"In case of conflict, project conventions take precedence over my preferences."

Pattern 3: Temporal Confusion

Early session: "auth.ts contains login logic"
... 2h of work ...
You renamed auth.ts to authentication.ts
Claude: "I'll modify auth.ts..."
↑ Using outdated info
Solution: Explicit updates
"Note: auth.ts was renamed to authentication.ts"

Context Hygiene Checklist:

  • New tasks = explicit markdown boundaries
  • Structural changes = inform Claude explicitly
  • Contradictory instructions = clarify priority
  • Long session (>2h) = consider /clear or new session
  • Erratic behavior = check with /context

Verify that Claude has loaded your configuration correctly.

Simple Method:

  1. Add at the top of CLAUDE.md:
# My name is [Your Name]
# Project: [Project Name]
# Stack: [Your tech stack]
  2. Ask Claude: “What is my name? What project am I working on?”

  3. If correct → Configuration loaded properly

Advanced: Multiple Checkpoints

# === CHECKPOINT 1 === Project: MyApp ===
[... 500 lines of instructions ...]
# === CHECKPOINT 2 === Stack: Next.js ===
[... 500 lines of instructions ...]
# === CHECKPOINT 3 === Owner: [Name] ===

Ask “What is checkpoint 2?” to verify Claude read that far.

| Failure Symptom | Probable Cause | Solution |
| --- | --- | --- |
| Doesn’t know your name | CLAUDE.md not loaded | Check file location |
| Inconsistent answers | Typo in filename | Must be CLAUDE.md (not clause.md) |
| Partial knowledge | Context exhausted | /clear or new session |

When ending a session or switching contexts, create a handoff document to maintain continuity.

Purpose: Bridge the gap between sessions by documenting state, decisions, and next steps.

Template:

# Session Handoff - [Date] [Time]
## What Was Accomplished
- [Key task 1 completed]
- [Key task 2 completed]
- [Files modified: list]
## Current State
- [What's working]
- [What's partially done]
- [Known issues or blockers]
## Decisions Made
- [Architectural choice 1: why]
- [Technology selection: rationale]
- [Trade-offs accepted]
## Next Steps
1. [Immediate next task]
2. [Dependent task]
3. [Follow-up validation]
## Context for Next Session
- Branch: [branch-name]
- Key files: [list 3-5 most relevant]
- Dependencies: [external factors]

When to create handoff documents:

| Scenario | Why |
| --- | --- |
| End of work day | Resume seamlessly tomorrow |
| Before context limit | Preserve state before /clear |
| Switching focus areas | Different task requires fresh context |
| Interruption expected | Emergency or meeting disrupts work |
| Complex debugging | Document hypotheses and tests tried |

Storage location: claudedocs/handoffs/handoff-YYYY-MM-DD.md
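Creating the file at that path can be scripted (a sketch; trim the skeleton to the sections you actually use):

```shell
# Scaffold today's handoff file following the claudedocs/handoffs convention.
mkdir -p claudedocs/handoffs
f="claudedocs/handoffs/handoff-$(date +%F).md"
if [ ! -f "$f" ]; then
  printf '# Session Handoff - %s\n\n## What Was Accomplished\n\n## Next Steps\n' \
    "$(date +%F)" > "$f"
fi
echo "Handoff file: $f"
```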

Pro tip: Ask Claude to generate the handoff:

You: "Create a session handoff document for what we accomplished today"

Claude will analyze git status, conversation history, and generate a structured handoff.

Plan Mode is Claude Code’s “look but don’t touch” mode.

/plan

Or ask Claude directly:

You: Let's plan this feature before implementing
In Plan Mode, Claude can:

  • ✅ Read files
  • ✅ Search the codebase
  • ✅ Analyze architecture
  • ✅ Propose approaches
  • ✅ Write to a plan file

Claude cannot:

  • ❌ Edit files
  • ❌ Run commands that modify state
  • ❌ Create new files
  • ❌ Make commits
| Situation | Use Plan Mode? |
| --- | --- |
| Exploring unfamiliar codebase | ✅ Yes |
| Investigating a bug | ✅ Yes |
| Planning a new feature | ✅ Yes |
| Fixing a typo | ❌ No |
| Quick edit to known file | ❌ No |

Recommended frequency: Boris Cherny (Head of Claude Code at Anthropic) starts approximately 80% of tasks in Plan Mode — letting Claude plan before writing a single line of code. Once the plan is approved, execution is almost always correct on the first try. — Lenny’s Newsletter, February 19, 2026

Press Shift+Tab to toggle back to Normal Mode (Act Mode). You can also type a message and Claude will ask: “Ready to implement this plan?”

Note: Shift+Tab toggles between Plan Mode and Normal Mode during a session. Use Shift+Tab twice from Normal Mode to enter Plan Mode, once from Plan Mode to return.

Concept: Automatically trigger planning mode before any risky operation.

Configuration File (~/.claude/auto-plan-mode.txt):

Before executing ANY tool (Read, Write, Edit, Bash, Grep, Glob, WebSearch), you MUST:
1. FIRST: Use exit_plan_mode tool to present your plan
2. WAIT: For explicit user approval before proceeding
3. ONLY THEN: Execute the planned actions
Each new user request requires a fresh plan - previous approvals don't carry over.

Launch with Auto Plan Mode:

macOS/Linux:

Terminal window
# Direct
claude --append-system-prompt "Before executing ANY tool..."
# Via file (recommended)
claude --append-system-prompt "$(cat ~/.claude/auto-plan-mode.txt)"
# Alias in .zshrc/.bashrc
alias claude-safe='claude --append-system-prompt "$(cat ~/.claude/auto-plan-mode.txt)"'

Windows (PowerShell):

Terminal window
# Create the config file at %USERPROFILE%\.claude\auto-plan-mode.txt with the same content
# Direct
claude --append-system-prompt "Before executing ANY tool..."
# Via file (add to $PROFILE)
function claude-safe {
    $planPrompt = Get-Content "$env:USERPROFILE\.claude\auto-plan-mode.txt" -Raw
    claude --append-system-prompt $planPrompt $args
}
}

Resulting Workflow:

User: "Add an email field to the User model"
Claude (Auto Plan Mode active):
┌─────────────────────────────────────────────────────────────┐
│ 📋 PROPOSED PLAN │
│ │
│ 1. Read schema.prisma to understand current model │
│ 2. Add field email: String? @unique │
│ 3. Generate Prisma migration │
│ 4. Update TypeScript types │
│ 5. Add Zod validation in routers │
│ │
│ ⚠️ Impact: 3 files modified, 1 migration created │
│ │
│ Approve this plan? (y/n) │
└─────────────────────────────────────────────────────────────┘
User: "y"
Claude: [Executes the plan]

Result: 76% fewer tokens with better results because the plan is validated before execution.

Claude Code supports six model aliases via /model (each always resolves to the latest version):

| Alias | Resolves To | Use Case |
| --- | --- | --- |
| default | Latest model for your plan tier | Standard usage |
| sonnet | Claude Sonnet 4.6 | Fast, cost-efficient |
| opus | Claude Opus 4.6 | Deep reasoning |
| haiku | Claude Haiku 4.5 | Budget, high-volume |
| sonnet[1m] | Sonnet with 1M context | Large codebases |
| opusplan | Opus (plan) + Sonnet (act) | Hybrid intelligence |

Model can also be set via claude --model <alias>, ANTHROPIC_MODEL env var, or "model" in settings.json. Priority: /model > --model flag > ANTHROPIC_MODEL > settings.json.

Concept: Use Opus for planning (superior reasoning) and Sonnet for implementation (cost-efficient).

Why OpusPlan?

  • Cost optimization: Opus tokens cost more than Sonnet
  • Best of both worlds: Opus-quality planning + Sonnet-speed execution
  • Token savings: Planning is typically shorter than implementation

Activation:

/model opusplan

Or in ~/.claude/settings.json:

{
  "model": "opusplan"
}

How It Works:

  1. In Plan Mode (/plan or Shift+Tab twice) → Uses Opus
  2. In Act Mode (normal execution) → Uses Sonnet
  3. Automatic switching based on mode

Recommended Workflow:

1. /model opusplan → Enable OpusPlan
2. Shift+Tab × 2 → Enter Plan Mode (Opus)
3. Describe your task → Get Opus-quality planning
4. Shift+Tab → Exit to Act Mode (Sonnet)
5. Execute the plan → Sonnet implements efficiently

Alternative Approach with Subagents:

You can also control model usage per agent:

.claude/agents/planner.md:

---
name: planner
model: opus
tools: Read, Grep, Glob
---
# Strategic Planning Agent

.claude/agents/implementer.md:

---
name: implementer
model: haiku
tools: Write, Edit, Bash
---
# Fast Implementation Agent

Pro Users Note: OpusPlan is particularly valuable for Pro subscribers with limited Opus tokens. It lets you leverage Opus reasoning for critical planning while preserving tokens for more sessions.

Budget Variant: SonnetPlan (Community Hack)

opusplan is hardcoded to Opus+Sonnet — there’s no native sonnetplan alias. But you can remap what the opus and sonnet aliases resolve to via environment variables, effectively creating a Sonnet→Haiku hybrid:

# Add to ~/.zshrc
sonnetplan() {
  ANTHROPIC_DEFAULT_OPUS_MODEL=claude-sonnet-4-6 \
  ANTHROPIC_DEFAULT_SONNET_MODEL=claude-haiku-4-5-20251001 \
  claude "$@"
}

Inside a session launched via the sonnetplan wrapper, /model opusplan routes:

  • Plan Mode → Sonnet 4.6 (via remapped opus alias)
  • Act Mode → Haiku 4.5 (via remapped sonnet alias)

Caveat: The model’s self-report (what model are you?) is unreliable — models don’t always know their own identity. Trust the status bar (Model: Sonnet 4.6 in plan mode) or verify via billing dashboard. GitHub issue #9749 tracks native support.

Concept: Run multiple rounds of planning and deep thinking before executing. Like warming up an engine before driving.

Standard workflow: think → plan → execute. Rev the Engine: think → plan → think harder → refine plan → think hardest → finalize → execute.

When to use:

  • Critical architectural decisions (irreversible, high-impact)
  • Complex migrations affecting 10+ files
  • Unfamiliar domain where first instincts are often wrong

Pattern:

## Round 1: Initial analysis
User: /plan
User: Analyze the current auth system. What are the key components,
dependencies, and potential risks of migrating to OAuth2?
Claude: [Initial analysis]
## Round 2: Deep challenge
User: Now use extended thinking. Challenge your own analysis:
- What assumptions did you make?
- What failure modes did you miss?
- What would a senior security engineer flag?
Claude: [Deeper analysis with self-correction]
## Round 3: Final plan
User: Based on both rounds, write the definitive migration plan.
Include rollback strategy and risk mitigation for each step.
Claude: [Refined plan incorporating both rounds]
## Execute
User: /execute
User: Implement the plan from round 3.

Why it works: Each round forces Claude to reconsider assumptions. Round 2 typically catches 30-40% of issues that round 1 missed. Round 3 synthesizes into a more robust plan.

📊 Empirical backing — Anthropic AI Fluency Index (Feb 2026)

An Anthropic study analyzing 9,830 Claude conversations quantifies exactly why plan review works: users who iterate and question the AI’s reasoning are 5.6× more likely to catch missing context and errors compared to users who accept the first output. A second round of review makes you 4× more likely to identify what was left out.

The Rev the Engine pattern operationalizes this finding: each round of deep challenge triggers the questioning behavior that produces measurably better plans.

Source: Swanson et al., “The AI Fluency Index”, Anthropic (2026-02-23) — anthropic.com/research/AI-fluency-index

Concept: Layer multiple Claude Code mechanisms for maximum intelligence on critical decisions.

Layer 1: Plan Mode → Safe exploration, no side effects
Layer 2: Extended Thinking → Deep reasoning with thinking tokens
Layer 3: Rev the Engine → Multi-round refinement
Layer 4: Split-Role Agents → Multi-perspective analysis
Layer 5: Permutation → Systematic variation testing

You don’t need all layers for every task. Match the stack depth to the decision’s impact:

| Decision Impact | Stack Depth | Example |
| --- | --- | --- |
| Low (fix typo) | 0 layers | Just do it |
| Medium (add feature) | 1-2 layers | Plan Mode + Extended Thinking |
| High (architecture) | 3-4 layers | Rev the Engine + Split-Role |
| Critical (migration) | 4-5 layers | Full stack |

Anti-pattern: Stacking on trivial decisions. If the change is reversible and low-risk, just execute. Over-planning is as wasteful as under-planning.


Rewind is Claude Code’s undo mechanism.

Access via Esc + Esc (double-tap Escape) or the /rewind command. This opens a scrollable checkpoint list.

Rewind provides four distinct actions from the checkpoint list:

| Action | Effect |
| --- | --- |
| Restore code and conversation | Revert both file changes and conversation to selected point |
| Restore conversation | Keep current code, rewind conversation only |
| Restore code | Revert file changes, keep conversation |
| Summarize from here | Compress conversation from selected point forward (frees space without reverting) |

Key distinction: Restore = undo (reverts state). Summarize = compress (frees space without reverting). Checkpoints persist across sessions (30-day cleanup).

Limitations:

  • Only works on Claude’s changes (not manual edits)
  • Works within the current session
  • Git commits are NOT automatically reverted

Before a risky operation:

You: Let's commit what we have before trying this experimental approach

This creates a git checkpoint you can always return to.

When things go wrong, you have multiple recovery options. Use the lightest-weight approach that solves your problem:

┌─────────────────────────────────────────────────────────┐
│ RECOVERY LADDER │
├─────────────────────────────────────────────────────────┤
│ │
│ Level 3: Git Restore (nuclear option) │
│ ───────────────────────────────────── │
│ • git checkout -- <file> (discard uncommitted) │
│ • git stash (save for later) │
│ • git reset --hard HEAD~1 (undo last commit) │
│ • Works for: Manual edits, multiple sessions │
│ │
│ Level 2: /rewind (session undo) │
│ ───────────────────────────── │
│ • Reverts Claude's recent file changes │
│ • Works within current session only │
│ • Doesn't touch git commits │
│ • Works for: Bad code generation, wrong direction │
│ │
│ Level 1: Reject Change (inline) │
│ ──────────────────────────── │
│ • Press 'n' when reviewing diff │
│ • Change never applied │
│ • Works for: Catching issues before they happen │
│ │
└─────────────────────────────────────────────────────────┘

When to use each level:

| Scenario | Recovery Level | Command |
| --- | --- | --- |
| Claude proposed bad code | Level 1 | Press `n` |
| Claude made changes, want to undo | Level 2 | `/rewind` |
| Changes committed, need full rollback | Level 3 | `git reset` |
| Experimental branch went wrong | Level 3 | `git checkout main` |
| Context corrupted, strange behavior | Fresh start | `/clear` + restate goal |

Pro tip: The /rewind command shows a list of changes to undo. You can selectively revert specific files rather than all changes.

For systematic experimentation, use the checkpoint pattern to create safe restore points:

┌─────────────────────────────────────────────────────────┐
│ CHECKPOINT WORKFLOW │
├─────────────────────────────────────────────────────────┤
│ │
│ 1. Create checkpoint │
│ ────────────────── │
│ git stash push -u -m "checkpoint-before-refactor" │
│ (saves all changes including untracked files) │
│ │
│ 2. Experiment freely │
│ ────────────────── │
│ Try risky refactoring, architectural changes, etc. │
│ If it works → commit normally │
│ If it fails → restore checkpoint │
│ │
│ 3. Restore checkpoint │
│ ────────────────── │
│ git stash list # find your checkpoint │
│ git stash apply stash@{0} # restore without delete │
│ # or │
│ git stash pop stash@{0} # restore and delete │
│ │
└─────────────────────────────────────────────────────────┘

Automated checkpoint: Create a Stop hook to auto-checkpoint on session end:

.claude/hooks/auto-checkpoint.sh
# See: examples/hooks/bash/auto-checkpoint.sh
# Automatically creates git stash on session end
# Naming: claude-checkpoint-{branch}-{timestamp}
# Logs to: ~/.claude/logs/checkpoints.log
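A minimal sketch of what such a hook's body might look like, following the naming pattern and log path described in the stub above. Everything here is illustrative, not the actual example script; in particular, the `CLAUDE_LOG_DIR` override is our invention for easy testing:

```shell
# Hypothetical body for .claude/hooks/auto-checkpoint.sh; adapt freely.
# Stashes all changes (including untracked) under a named label, then
# immediately re-applies them so the working tree is left exactly as it was.
auto_checkpoint() {
  git rev-parse --git-dir >/dev/null 2>&1 || return 0   # not in a repo: no-op
  [ -n "$(git status --porcelain)" ] || return 0        # nothing to checkpoint
  branch=$(git rev-parse --abbrev-ref HEAD)
  label="claude-checkpoint-${branch}-$(date +%Y%m%d-%H%M%S)"
  git stash push -u -m "$label" >/dev/null || return 1
  git stash apply --quiet >/dev/null 2>&1               # keep working tree unchanged
  logdir="${CLAUDE_LOG_DIR:-$HOME/.claude/logs}"        # env override is illustrative
  mkdir -p "$logdir"
  printf '%s %s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$label" >> "$logdir/checkpoints.log"
}
```

Register the script as a Stop hook in your settings (see §7). Restoring later is just `git stash apply` on the labeled entry from `git stash list`.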

Common workflows:

| Scenario | Workflow |
| --- | --- |
| Risky refactor | Checkpoint → Try → Commit or restore |
| A/B testing approaches | Checkpoint → Try A → Restore → Try B → Compare |
| Incremental migration | Checkpoint → Migrate piece → Test → Repeat |
| Prototype exploration | Checkpoint → Experiment → Discard cleanly |

Benefits over branching:

  • Faster than creating feature branches
  • Preserves uncommitted changes
  • Lightweight for quick experiments
  • Works across multiple files

Choosing the right model for each task is the fastest ROI improvement most Claude Code users can make. One decision per task — no overthinking.

Quick jump: Decision Table · Effort Levels · Model per Agent · When Thinking Helps

Cross-references: OpusPlan Mode · Rev the Engine · Cost Awareness


| Task | Model | Effort | Est. cost/task |
| --- | --- | --- | --- |
| Rename, format, boilerplate | Haiku | low | ~$0.02 |
| Generate unit tests | Haiku | low | ~$0.03 |
| CI/CD PR review (volume) | Haiku | low | ~$0.02 |
| Feature dev, standard debug | Sonnet | medium | ~$0.23 |
| Module refactoring | Sonnet | high | ~$0.75 |
| System architecture | Opus | high | ~$1.25 |
| Critical security audit | Opus | max | ~$2+ |
| Multi-agent orchestration | Sonnet + Haiku | mixed | variable |

Note on costs: Estimates based on API pricing (Haiku $0.80/$4.00 per MTok, Sonnet $3/$15, Opus $5/$25). Pro/Max subscribers pay a flat rate, so prioritize quality over cost. See Section 2.2 for full pricing breakdown.

Budget modifier (Teams Standard/Pro): downgrade one tier per phase — use Sonnet where the table says Opus, Haiku where it says Sonnet for mechanical implementation tasks. Community pattern: Sonnet for Plan → Haiku for Implementation on a $25/mo Teams Standard plan.


The effort parameter (Opus 4.6 API) controls the model’s overall computational budget — not just thinking tokens, but tool calls, verbosity, and analysis depth. Low effort = fewer tool calls, no preamble. High effort = more explanations, detailed analysis.

Calibrated gradient — one real prompt per level:

  • low — Mechanical, no design decisions needed

    "Rename getUserById to findUserById across src/" — Find-replace scope, zero reasoning required.

  • medium — Clear pattern, defined scope, one concern

    "Convert fetchUser() in api/users.ts from callbacks to async/await" — Pattern is known, scope bounded.

  • high — Design decisions, edge cases, multiple concerns

    "Redesign error handling in the payment module: add retry logic, partial failure recovery, and idempotency guarantees" — Architectural choices, not just pattern application.

  • max (Opus 4.6 only — returns error on other models) — Cross-system reasoning, irreversible decisions

    "Analyze the microservices event pipeline for race conditions across order-service, inventory-service, and notification-service" — Multi-service hypothesis testing, adversarial thinking.


Assign models to agents based on role, not importance:

Planner (examples/agents/planner.md) — Strategy, read-only exploration

---
name: planner
description: Strategic planning agent — read-only. Use before implementation.
model: opus
tools: Read, Grep, Glob
---

Implementer (examples/agents/implementer.md) — Mechanical execution, bounded scope

---
name: implementer
description: Mechanical execution agent. Scope must be defined explicitly in the task.
model: haiku
tools: Write, Edit, Bash, Read, Grep, Glob
---

Note: Haiku is for mechanical tasks only. If the implementation requires design decisions or complex business logic, use Sonnet — state this in the task prompt.

Architecture Reviewer (examples/agents/architecture-reviewer.md) — Critical design review

---
name: architecture-reviewer
description: Architecture and design review — read-only. Never modifies code.
model: opus
tools: Read, Grep, Glob
---

Pro tip: Add a model reminder to your CLAUDE.md:

# Model reminder
Default: Sonnet. Haiku for mechanical tasks. Opus for architecture and security audits.

| Scenario | Thinking | Reason |
| --- | --- | --- |
| Rename 50 files | OFF | Zero reasoning — pure mechanics |
| Bug spanning 3+ services | ON (high) | Multi-layer hypothesis testing |
| Boilerplate / test generation | OFF | Repetitive pattern, no decisions |
| Architecture migration | ON (max) | Irreversible decisions |
| Direct factual questions | OFF (low) | Immediate answer sufficient |
| Security code review | ON (high) | Adversarial reasoning needed |

Toggle: Alt+T (current session) · /config (permanent)


Understanding how Claude Code “thinks” makes you more effective.

┌─────────────────────────────────────────────────────────┐
│ YOUR PROJECT │
├─────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌───────────┐ │
│ │ Files │ │ Git │ │ Config │ │
│ │ (.ts,.py) │ │ History │ │ Files │ │
│ └─────────────┘ └─────────────┘ └───────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Claude's Understanding │ │
│ │ - File structure & relationships │ │
│ │ - Code patterns & conventions │ │
│ │ - Recent changes (from git) │ │
│ │ - Project rules (from CLAUDE.md) │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
What Claude can see:

  1. File Structure: Claude can navigate and search your files
  2. Code Content: Claude can read and understand code
  3. Git State: Claude sees branches, commits, changes
  4. Project Rules: Claude reads CLAUDE.md for conventions

What Claude can’t see:

  1. Runtime State: Claude can’t see running processes
  2. External Services: Claude can’t access your databases directly
  3. Your Intent: Claude needs clear instructions
  4. Hidden Files: Claude respects .gitignore by default

⚠️ Pattern Amplification: Claude mirrors the patterns it finds. In well-structured codebases, it produces consistent, idiomatic code. In messy codebases without clear abstractions, it perpetuates the mess. If your code lacks good patterns, provide them explicitly in CLAUDE.md or use semantic anchors (Section 2.9).

Think of yourself as a CPU scheduler. Claude Code instances are worker threads. You don’t write the code—you orchestrate the work.

┌─────────────────────────────────────────┐
│ YOU (Main Thread) │
│ ┌────────────────────────────────────┐ │
│ │ Responsibilities: │ │
│ │ • Define tasks and priorities │ │
│ │ • Allocate context budgets │ │
│ │ • Review outputs │ │
│ │ • Make architectural decisions │ │
│ │ • Handle exceptions/escalations │ │
│ └────────────────────────────────────┘ │
│ │ │ │ │
│ ┌────▼───┐ ┌────▼───┐ ┌────▼───┐ │
│ │Worker 1│ │Worker 2│ │Worker 3│ │
│ │(Claude)│ │(Claude)│ │(Claude)│ │
│ │Feature │ │Tests │ │Review │ │
│ └────────┘ └────────┘ └────────┘ │
└─────────────────────────────────────────┘

Implications:

  • Don’t write code when Claude can. Your time is for decisions, not keystrokes.
  • Don’t micromanage. Give clear instructions, then review results.
  • Context-switch deliberately. Like a scheduler, batch similar tasks.
  • Escalate to yourself. When Claude is stuck, step in—then hand back.

This mental model scales: one developer can orchestrate 2-5 Claude instances on independent tasks (see §9.17 Scaling Patterns).

The most common mistake is treating Claude Code like a chatbot — typing ad-hoc requests and hoping for good output. What separates casual usage from production workflows is a shift in thinking:

Chatbot mode: You write good prompts. Context system: You build structured context that makes every prompt better.

“Stop treating it like a chatbot. Give it structured context. CLAUDE.md, hooks, skills, project memory. Changes everything.”Robin Lorenz, AI Engineer (comment)

Claude Code has four layers of persistent context that compound over time:

| Layer | What It Does | Section | When to Set Up |
| --- | --- | --- | --- |
| CLAUDE.md | Persistent rules, conventions, project knowledge | §3.1 | Week 1 |
| Skills | Reusable knowledge modules for consistent workflows | §5 | Week 2 |
| Hooks | Automated guardrails (lint, security, formatting) | §7 | Week 2-3 |
| Project memory | Cross-session decisions and architectural context | §3.1 | Ongoing |

These are not independent features. They are layers of the same system:

  • CLAUDE.md teaches Claude what your project needs (conventions, stack, patterns)
  • Skills teach Claude how to perform specific workflows (review, deploy, test)
  • Hooks enforce guardrails automatically (block secrets, auto-format, run linting)
  • Memory preserves decisions across sessions (architectural choices, resolved tradeoffs)

Before (chatbot mode):

“Use pnpm, not npm. And remember our naming convention is…” (Every session. Every time. Copy-pasting context.)

After (context system):

CLAUDE.md loads conventions automatically. Skills ensure consistent workflows. Hooks enforce quality with zero manual effort. Memory carries decisions forward.

The shift is not about prompting better. It is about building a system where Claude starts every session already knowing what you need.

See also: §9.10 Continuous Improvement Mindset for evolving this system over time. Ready to choose the right mechanism? §2.7 Configuration Decision Guide maps all seven mechanisms with a decision tree.

Good prompt:

The login function in src/auth/login.ts isn't validating email addresses properly.
Plus signs should be allowed but they're being rejected.

Weak prompt:

Login is broken

The more context you provide, the better Claude can help.

Seven configuration mechanisms power Claude Code — knowing which one to reach for saves hours of trial-and-error. This guide gives you the mental shortcuts.

Detailed coverage: §3 Memory & Settings · §4 Agents · §5 Skills · §6 Commands · §7 Hooks · §8 MCP Servers

| Role | Mechanism | One-liner |
| --- | --- | --- |
| What Claude always knows | CLAUDE.md + rules/*.md | Permanent context, loaded every session |
| How Claude executes workflows | Commands (.claude/commands/) | Step-by-step SOPs invoked on demand |
| What Claude can’t bypass | Hooks (.claude/hooks/) | Automatic guardrails, zero token cost |
| What Claude delegates | Agents (.claude/agents/) | Isolated parallel workers with scoped context |
| Shared domain knowledge | Skills (.claude/skills/) | Reusable modules inherited by agents |
| External system access | MCP Servers | APIs, databases, tools via protocol |
| Mechanism | When Loaded | Best For | Token Cost | Reliability |
| --- | --- | --- | --- | --- |
| CLAUDE.md | Every session | Core conventions, identity | Always paid | 100% |
| rules/*.md | Every session | Supplementary standing rules | Always paid | 100% |
| Commands | On invocation | Repeatable multi-step workflows | Low (template) | 100% when invoked |
| Hooks | On events | Guardrails, automation, enforcement | Zero | 100% (shell scripts) |
| Agents | On spawn | Isolated / parallel analysis | High (full context) | 100% when spawned |
| Skills | On invocation | Domain knowledge for agents | Medium | ~56% auto-invocation |
| MCP Servers | Session start | External APIs and tools | Connection overhead | 100% when connected |
Is this needed every session, for every task?
├─ Yes → CLAUDE.md (core) or rules/*.md (supplementary)
└─ No → Should it trigger automatically without user action?
   ├─ Yes → HOOK (event-driven, shell script)
   └─ No → Does it need external system access (API, DB, tool)?
      ├─ Yes → MCP SERVER
      └─ No → Is it a repeatable workflow with defined steps?
         ├─ Yes → COMMAND (.claude/commands/)
         └─ No → Does it need isolated context or parallel work?
            ├─ Yes → AGENT (.claude/agents/)
            └─ No → Is it shared knowledge for multiple agents?
               ├─ Yes → SKILL (.claude/skills/)
               └─ No → Add to CLAUDE.md
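If you like executable summaries, the decision tree above can be encoded as a tiny helper function. This is a toy illustration (not part of Claude Code): answer each question in order with `y` or `n`, stopping at the first `y`.

```shell
# Toy encoding of the decision tree. Args, in order:
#   every-session? auto-trigger? external-access? repeatable? isolated? shared?
pick_mechanism() {
  case "$1$2$3$4$5$6" in
    y*)      echo "CLAUDE.md (core) or rules/*.md" ;;  # needed every session
    ny*)     echo "Hook" ;;                            # fires automatically
    nny*)    echo "MCP server" ;;                      # external API/DB/tool
    nnny*)   echo "Command" ;;                         # repeatable workflow
    nnnny*)  echo "Agent" ;;                           # isolated/parallel work
    nnnnny*) echo "Skill" ;;                           # shared agent knowledge
    *)       echo "Add to CLAUDE.md" ;;                # fallback
  esac
}
```

For example, `pick_mechanism n n y` answers "no, no, yes" and prints `MCP server`.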

Skills are invoked on demand — and agents don’t always invoke them. One evaluation found agents triggered skills in only 56% of cases (Gao, 2026).

Practical implications:

  • Never put critical instructions only in skills — they may be silently skipped
  • Safe pattern: CLAUDE.md states what (always loaded), skill provides how in detail (on demand)
  • For agent workflows, prefer explicit skill invocation in agent frontmatter’s skills: field

See also: §3.4 Precedence Rules for load order and §5.1 Understanding Skills for the full skills decision tree.

| Mistake | Why It Fails | Fix |
| --- | --- | --- |
| Critical rules only in skills | 44% chance of being skipped | Move to CLAUDE.md or rules/*.md |
| Everything in CLAUDE.md | Context window bloat every session | Split: permanent → CLAUDE.md, workflows → commands |
| Hooks for complex logic | Hooks are shell scripts, not Claude | Use hooks for enforcement, commands for multi-step workflows |
| MCP for simple file ops | Unnecessary overhead | Use built-in file tools; MCP for external systems |

XML-structured prompts provide semantic organization for complex requests, helping Claude distinguish between different aspects of your task for clearer understanding and better results.

XML tags act as labeled containers that explicitly separate instruction types, context, examples, constraints, and expected output format.

Basic syntax:

<instruction>
Your main task description here
</instruction>
<context>
Background information, project details, or relevant state
</context>
<code_example>
Reference code or examples to follow
</code_example>
<constraints>
- Limitation 1
- Limitation 2
- Requirement 3
</constraints>
<output>
Expected format or structure of the response
</output>
| Benefit | Description |
| --- | --- |
| Separation of concerns | Different aspects of the task are clearly delineated |
| Reduced ambiguity | Claude knows which information serves what purpose |
| Better context handling | Helps Claude prioritize main instructions over background info |
| Consistent formatting | Easier to template complex requests |
| Multi-faceted requests | Complex tasks with multiple requirements stay organized |

Core Instruction Tags:

<instruction>Main task</instruction> <!-- Primary directive -->
<task>Specific subtask</task> <!-- Individual action item -->
<question>What should I do about X?</question> <!-- Explicit inquiry -->
<goal>Achieve state Y</goal> <!-- Desired outcome -->

Context and Information Tags:

<context>Project uses Next.js 14</context> <!-- Background info -->
<problem>Users report slow page loads</problem> <!-- Issue description -->
<background>Migration from Pages Router</background> <!-- Historical context -->
<state>Currently on feature-branch</state> <!-- Current situation -->

Code and Example Tags:

<code_example>
// Existing pattern to follow
const user = await getUser(id);
</code_example>
<current_code>
// Code that needs modification
</current_code>
<expected_output>
// What the result should look like
</expected_output>

Constraint and Rule Tags:

<constraints>
- Must maintain backward compatibility
- No breaking changes to public API
- Maximum 100ms response time
</constraints>
<requirements>
- TypeScript strict mode
- 100% test coverage
- Accessible (WCAG 2.1 AA)
</requirements>
<avoid>
- Don't use any for types
- Don't modify the database schema
</avoid>

Example 1: Code Review with Context

<instruction>
Review this authentication middleware for security vulnerabilities
</instruction>
<context>
This middleware is used in a financial application handling sensitive user data.
We follow OWASP Top 10 guidelines and need PCI DSS compliance.
</context>
<code_example>
async function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'No token' });
  const decoded = jwt.verify(token, process.env.JWT_SECRET);
  req.user = decoded;
  next();
}
</code_example>
<constraints>
- Point out any security risks
- Suggest PCI DSS compliant alternatives
- Consider timing attacks and token leakage
</constraints>
<output>
Provide:
1. List of security issues found
2. Severity rating for each (Critical/High/Medium/Low)
3. Specific code fixes with examples
4. Additional security hardening recommendations
</output>

Example 2: Feature Implementation with Examples

<instruction>
Add a rate limiting system to our API endpoints
</instruction>
<context>
Current stack: Express.js + Redis
No rate limiting currently exists
Experiencing API abuse from specific IPs
</context>
<requirements>
- 100 requests per minute per IP for authenticated users
- 20 requests per minute per IP for unauthenticated
- Custom limits for premium users (stored in database)
- Return 429 status with Retry-After header
</requirements>
<code_example>
// Existing middleware pattern we use
app.use(authenticate);
app.use(authorize(['admin', 'user']));
</code_example>
<constraints>
- Must not impact existing API performance
- Redis connection should be reused
- Handle Redis connection failures gracefully
</constraints>
<output>
Provide:
1. Rate limiter middleware implementation
2. Redis configuration
3. Unit tests
4. Documentation for the team
</output>

Example 3: Bug Investigation with State

<task>
Investigate why user sessions are expiring prematurely
</task>
<problem>
Users report being logged out after 5-10 minutes of activity,
but session timeout is configured for 24 hours.
</problem>
<context>
- Next.js 14 App Router with next-auth
- PostgreSQL session store
- Load balanced across 3 servers
- Issue started after deploying v2.3.0 last week
</context>
<state>
Git diff between v2.2.0 (working) and v2.3.0 (broken) shows changes to:
- middleware.ts (session refresh logic)
- auth.config.ts (session strategy)
- database.ts (connection pooling)
</state>
<constraints>
- Don't suggest reverting the deploy
- Production issue, needs quick resolution
- Must maintain session security
</constraints>
<output>
Provide:
1. Root cause hypothesis
2. Files to investigate (in priority order)
3. Debugging commands to run
4. Potential fixes with trade-offs
</output>

Nested Tags for Complex Hierarchy:

<task>
  Refactor authentication system
  <subtask priority="high">
    Update user model
    <constraints>
      - Preserve existing user IDs
      - Add migration for email verification
    </constraints>
  </subtask>
  <subtask priority="medium">
    Implement OAuth providers
    <requirements>
      - Google and GitHub OAuth
      - Reuse existing session logic
    </requirements>
  </subtask>
</task>

Multiple Examples with Labels:

<code_example label="current_implementation">
// Old approach with callback hell
getUser(id, (user) => {
  getOrders(user.id, (orders) => {
    res.json({ user, orders });
  });
});
</code_example>
<code_example label="desired_pattern">
// New async/await pattern
const user = await getUser(id);
const orders = await getOrders(user.id);
res.json({ user, orders });
</code_example>

Conditional Instructions:

<instruction>
Optimize database query performance
</instruction>
<context>
Query currently takes 2.5 seconds for 10,000 records
</context>
<constraints>
  <if condition="PostgreSQL">
    - Use EXPLAIN ANALYZE
    - Consider materialized views
  </if>
  <if condition="MySQL">
    - Use EXPLAIN with query plan analysis
    - Consider query cache
  </if>
</constraints>
| Scenario | Recommended? | Why |
| --- | --- | --- |
| Simple one-liner requests | ❌ No | Overhead outweighs benefit |
| Multi-step feature implementation | ✅ Yes | Separates goals, constraints, examples |
| Bug investigation with context | ✅ Yes | Distinguishes symptoms from environment |
| Code review with specific criteria | ✅ Yes | Clear separation of code, context, requirements |
| Architecture planning | ✅ Yes | Organizes goals, constraints, trade-offs |
| Quick typo fix | ❌ No | Unnecessary complexity |

Do’s:

  • ✅ Use descriptive tag names that clarify purpose
  • ✅ Keep tags consistent across similar requests
  • ✅ Combine with CLAUDE.md for project-specific tag conventions
  • ✅ Nest tags logically when representing hierarchy
  • ✅ Use tags to separate “what” from “why” from “how”

Don’ts:

  • ❌ Over-structure simple requests (adds noise)
  • ❌ Mix tag purposes (e.g., constraints inside code examples)
  • ❌ Use generic tags (<tag>, <content>) without clear meaning
  • ❌ Nest too deeply (>3 levels becomes hard to read)

You can standardize XML tag usage in your project’s CLAUDE.md:

# XML Prompt Conventions
When making complex requests, use this structure:
<instruction>Main task</instruction>
<context>
Project context and state
</context>
<code_example>
Reference implementations
</code_example>
<constraints>
Technical and business requirements
</constraints>
<output>
Expected deliverables
</output>
## Project-Specific Tags
- `<api_design>` - API endpoint design specifications
- `<accessibility>` - WCAG requirements and ARIA considerations
- `<performance>` - Performance budgets and optimization goals

XML + Plan Mode:

<instruction>Plan the migration from REST to GraphQL</instruction>
<context>
Currently 47 REST endpoints serving mobile and web clients
</context>
<constraints>
- Must maintain REST endpoints during transition (6-month overlap)
- Mobile app can't be force-updated immediately
</constraints>
<output>
Multi-phase migration plan with rollback strategy
</output>

Then use /plan to explore read-only before implementation.

XML + Cost Awareness:

For large requests, structure with XML to help Claude understand scope and estimate token usage:

<instruction>Analyze all TypeScript files for unused imports</instruction>
<scope>
src/ directory (~200 files)
</scope>
<output_format>
Summary report only (don't list every file)
</output_format>

This helps Claude optimize the analysis approach and reduce token consumption.

Create reusable templates in claudedocs/templates/:

claudedocs/templates/code-review.xml:

<instruction>
Review the following code for quality and best practices
</instruction>
<context>
[Describe the component's purpose and architecture context]
</context>
<code_example>
[Paste code here]
</code_example>
<focus_areas>
- Security vulnerabilities
- Performance bottlenecks
- Maintainability issues
- Test coverage gaps
</focus_areas>
<output>
1. Issues found (categorized by severity)
2. Specific recommendations with code examples
3. Priority order for fixes
</output>

Usage:

cat claudedocs/templates/code-review.xml | \
sed 's/\[Paste code here\]/'"$(cat src/auth.ts)"'/' | \
claude -p "Process this review request"
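One caveat with the `sed` substitution: it breaks whenever the pasted code contains `/`, `&`, or newlines. A newline-safe variant is sketched below; the `fill_template` name is ours, and the final `claude` invocation is shown commented so the function stands alone:

```shell
# Replace the [Paste code here] placeholder line with a file's full contents.
# awk reads the code file directly, so no shell escaping of the code is needed.
fill_template() {   # usage: fill_template <template> <code-file>
  awk -v codefile="$2" '
    /\[Paste code here\]/ {                  # swap placeholder for file body
      while ((getline line < codefile) > 0) print line
      close(codefile); next
    }
    { print }
  ' "$1"
}
# fill_template claudedocs/templates/code-review.xml src/auth.ts \
#   | claude -p "Process this review request"
```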

Token overhead: XML tags consume tokens. For simple requests, natural language is more efficient.

Not required: Claude understands natural language perfectly well. Use XML when structure genuinely helps.

Consistency matters: If you use XML tags, be consistent. Mixing styles within a session can confuse context.

Learning curve: Team members need to understand the tag system. Document your conventions in CLAUDE.md.

💡 Pro tip: Start with natural language prompts. Introduce XML structure when:

  • Requests have 3+ distinct aspects (instruction + context + constraints)
  • Ambiguity causes Claude to misunderstand your intent
  • Creating reusable prompt templates
  • Working with junior developers who need structured communication patterns

Source: DeepTo Claude Code Guide - XML-Structured Prompts

The Claude Code team internally treats prompts as challenges to a peer, not instructions to an assistant. This subtle shift produces higher-quality outputs because it forces Claude to prove its reasoning rather than simply comply.

Three challenge patterns from the team:

1. The Gatekeeper — Force Claude to defend its work before shipping:

"Grill me on these changes and don't make a PR until I pass your test"

Claude reviews your diff, asks pointed questions about edge cases, and only proceeds when satisfied. This catches issues that passive review misses.

2. The Proof Demand — Require evidence, not assertions:

"Prove to me this works — show me the diff in behavior between main and this branch"

Claude runs both branches, compares outputs, and presents concrete evidence. Eliminates the “trust me, it works” failure mode.

3. The Reset — After a mediocre first attempt, invoke full-context rewrite:

"Knowing everything you know now, scrap this and implement the elegant solution"

This forces a substantive second attempt with accumulated context rather than incremental patches on a weak foundation. The key insight: Claude’s second attempt with full context consistently outperforms iterative fixes.

Why this works: Provocation triggers deeper reasoning paths than polite requests. When Claude must convince rather than comply, it activates more thorough analysis and catches its own shortcuts.

Source: 10 Tips from Inside the Claude Code Team (Boris Cherny thread, Feb 2026)

LLMs are statistical pattern matchers trained on massive text corpora. Using precise technical vocabulary helps Claude activate the right patterns in its training data, leading to higher-quality outputs.

When you say “clean code”, Claude might generate any of dozens of interpretations. But when you say “SOLID principles with dependency injection following Clean Architecture layers”, you anchor Claude to a specific, well-documented pattern from its training.

Key insight: Technical terms act as GPS coordinates into Claude’s knowledge. The more precise, the better the navigation.

| Vague Term | Semantic Anchor | Why It Helps |
| --- | --- | --- |
| "error handling" | "Railway Oriented Programming with Either/Result monad" | Activates functional error patterns |
| "clean code" | "SOLID principles, especially SRP and DIP" | Targets specific design principles |
| "good tests" | "TDD London School with outside-in approach" | Specifies test methodology |
| "good architecture" | "Hexagonal Architecture (Ports & Adapters)" | Names a concrete pattern |
| "readable code" | "Screaming Architecture with intention-revealing names" | Triggers specific naming conventions |
| "scalable design" | "CQRS with Event Sourcing" | Activates distributed patterns |
| "documentation" | "arc42 template structure" | Specifies documentation framework |
| "requirements" | "EARS syntax (Easy Approach to Requirements Syntax)" | Targets requirement format |
| "API design" | "REST Level 3 with HATEOAS" | Specifies maturity level |
| "security" | "OWASP Top 10 mitigations" | Activates security knowledge |

Add semantic anchors to your project instructions:

# Architecture Principles
Follow these patterns:
- **Architecture**: Hexagonal Architecture (Ports & Adapters) with clear domain boundaries
- **Error handling**: Railway Oriented Programming - never throw, return Result<T, E>
- **Testing**: TDD London School - mock collaborators, test behaviors not implementations
- **Documentation**: ADR (Architecture Decision Records) for significant choices

Semantic anchors work powerfully with XML-structured prompts (Section 2.8):

<instruction>
Refactor the user service following Domain-Driven Design (Evans)
</instruction>
<constraints>
- Apply Hexagonal Architecture (Ports & Adapters)
- Use Repository pattern for persistence
- Implement Railway Oriented Programming for error handling
- Follow CQRS for read/write separation
</constraints>
<quality_criteria>
- Screaming Architecture: package structure reveals intent
- Single Responsibility Principle per class
- Dependency Inversion: depend on abstractions
</quality_criteria>

Testing:

  • TDD London School (mockist) vs Chicago School (classicist)
  • Property-Based Testing (QuickCheck-style)
  • Mutation Testing (PIT, Stryker)
  • BDD Gherkin syntax (Given/When/Then)

Architecture:

  • Hexagonal Architecture (Ports & Adapters)
  • Clean Architecture (Onion layers)
  • CQRS + Event Sourcing
  • C4 Model (Context, Container, Component, Code)

Design Patterns:

  • Gang of Four patterns (specify: Strategy, Factory, Observer…)
  • Domain-Driven Design tactical patterns (Aggregate, Repository, Domain Event)
  • Functional patterns (Monad, Functor, Railway)

Requirements:

  • EARS (Easy Approach to Requirements Syntax)
  • User Story Mapping (Jeff Patton)
  • Jobs-to-be-Done framework
  • BDD scenarios

💡 Pro tip: When Claude produces generic code, try adding more specific anchors. “Use clean code” → “Apply Martin Fowler’s Refactoring catalog, specifically Extract Method and Replace Conditional with Polymorphism.”

Full catalog: See examples/semantic-anchors/anchor-catalog.md for a comprehensive reference organized by domain.

Source: Concept by Alexandre Soyer. Original catalog: github.com/LLM-Coding/Semantic-Anchors (Apache-2.0)

Important: Everything you share with Claude Code is sent to Anthropic servers. Understanding this data flow is critical for protecting sensitive information.

When you use Claude Code, the following data leaves your machine:

| Data Type | Example | Risk Level |
| --- | --- | --- |
| Your prompts | "Fix the login bug" | Low |
| Files Claude reads | .env, src/app.ts | High if contains secrets |
| MCP query results | SQL query results with user data | High if production data |
| Command outputs | env \| grep API output | Medium |
| Error messages | Stack traces with file paths | Low |

Retention depends on your configuration:

| Configuration | Retention | How to Enable |
| --- | --- | --- |
| Default | 5 years | (default state - training enabled) |
| Opt-out | 30 days | claude.ai/settings |
| Enterprise (ZDR) | 0 days | Enterprise contract |

Immediate action: Disable training data usage to reduce retention from 5 years to 30 days.

1. Block access to sensitive files in .claude/settings.json:

{
  "permissions": {
    "deny": [
      "Read(./.env*)",
      "Edit(./.env*)",
      "Write(./.env*)",
      "Bash(cat .env*)",
      "Bash(head .env*)",
      "Read(./secrets/**)",
      "Read(./**/*.pem)",
      "Read(./**/*.key)",
      "Read(./**/credentials*)"
    ]
  }
}

Warning: permissions.deny has known limitations. See Security Hardening Guide for details.

2. Never connect production databases to MCP servers. Use dev/staging with anonymized data.

3. Use security hooks to block reading of sensitive files (see Section 7.4).

Full guide: For complete privacy documentation including known risks, community incidents, and enterprise considerations, see Data Privacy & Retention Guide.

Reading time: 5 minutes

Goal: Understand the core architecture that powers Claude Code

This section provides a summary of Claude Code’s internal mechanisms. For the complete technical deep-dive with diagrams and source citations, see the Architecture & Internals Guide.

At its core, Claude Code is a simple while loop:

┌─────────────────────────────────────────────────────────────┐
│ MASTER LOOP (simplified) │
├─────────────────────────────────────────────────────────────┤
│ │
│ Your Prompt │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Claude Reasons (no classifier, no router) │ │
│ └───────────────────────┬────────────────────────────┘ │
│ │ │
│ Tool needed? │ │
│ ┌─────┴─────┐ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ Execute Tool Text Response (done) │
│ │ │
│ └──────── Feed result back to Claude │
│ │ │
│ (loop continues) │
│ │
└─────────────────────────────────────────────────────────────┘

Source: Anthropic Engineering Blog

There is no:

  • Intent classifier or task router
  • RAG/embedding pipeline
  • DAG orchestrator
  • Planner/executor split

The model itself decides when to call tools, which tools to call, and when it’s done.

Claude Code has 8 core tools:

| Tool | Purpose |
| --- | --- |
| Bash | Execute shell commands (universal adapter) |
| Read | Read file contents (max 2000 lines) |
| Edit | Modify existing files (diff-based) |
| Write | Create/overwrite files |
| Grep | Search file contents (ripgrep-based) |
| Glob | Find files by pattern |
| Task | Spawn sub-agents (isolated context) |
| TodoWrite | Track progress (legacy, see below) |

Version: Claude Code v2.1.16+ introduced a new task management system

Claude Code provides two task management approaches:

| Feature | TodoWrite (Legacy) | Tasks API (v2.1.16+) |
| --- | --- | --- |
| Persistence | Session memory only | Disk storage (~/.claude/tasks/) |
| Multi-session | ❌ Lost on session end | ✅ Survives across sessions |
| Dependencies | ❌ Manual ordering | ✅ Task blocking (A blocks B) |
| Coordination | Single agent | ✅ Multi-agent broadcast |
| Status tracking | pending/in_progress/completed | pending/in_progress/completed/failed |
| Description visibility | ✅ Always visible | ⚠️ TaskGet only (not in TaskList) |
| Metadata visibility | N/A | ❌ Never visible in outputs |
| Multi-call overhead | None | ⚠️ 1 + N calls for N full tasks |
| Enabled by | Always available | Default since v2.1.19 |

Available tools:

  • TaskCreate - Initialize new tasks with hierarchy and dependencies
  • TaskUpdate - Modify task status, metadata, and dependencies
  • TaskGet - Retrieve individual task details
  • TaskList - List all tasks in current task list

Core capabilities:

  • Persistent storage: Tasks saved to ~/.claude/tasks/<task-list-id>/
  • Multi-session coordination: Share state across multiple Claude sessions
  • Dependency tracking: Tasks can block other tasks (task A blocks task B)
  • Status lifecycle: pending → in_progress → completed/failed
  • Metadata: Attach custom data (priority, estimates, related files, etc.)

Configuration:

Terminal window
# Enable multi-session task persistence
export CLAUDE_CODE_TASK_LIST_ID="project-name"
claude
# Example: Project-specific task list
export CLAUDE_CODE_TASK_LIST_ID="api-v2-auth-refactor"
claude

⚠️ Important: Use repository-specific task list IDs to avoid cross-project contamination. Tasks with the same ID are shared across all sessions using that ID.

Task schema example:

{
  "id": "task-auth-login",
  "title": "Implement login endpoint",
  "description": "POST /auth/login with JWT token generation",
  "status": "in_progress",
  "dependencies": [],
  "metadata": {
    "priority": "high",
    "estimated_duration": "2h",
    "related_files": ["src/auth/login.ts", "src/middleware/auth.ts"]
  }
}
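The pending → in_progress → completed/failed lifecycle described above can be expressed as a small transition guard. This is an illustrative sketch, not part of the actual Tasks API:

```typescript
// Allowed transitions for the documented status lifecycle.
const transitions: Record<string, string[]> = {
  pending: ["in_progress"],
  in_progress: ["completed", "failed"],
  completed: [], // terminal
  failed: [],    // terminal
};

const canTransition = (from: string, to: string): boolean =>
  (transitions[from] ?? []).includes(to);

// canTransition("pending", "in_progress") → true
// canTransition("pending", "completed")  → false (must pass through in_progress)
```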

When to use Tasks API:

  • Projects spanning multiple coding sessions
  • Complex task hierarchies with dependencies
  • Multi-agent coordination scenarios
  • Need to resume work after context compaction

⚠️ Tasks API Limitations (Critical)

Field visibility constraint:

| Tool | Visible Fields | Hidden Fields |
| --- | --- | --- |
| TaskList | id, subject, status, owner, blockedBy | description, activeForm, metadata |
| TaskGet | All fields | - |

Impact:

  • Multi-call overhead: Reviewing 10 task descriptions = 1 TaskList + 10 TaskGet calls (11x overhead)
  • No metadata scanning: Cannot filter/sort by custom fields (priority, estimates, tags) without fetching all tasks individually
  • Session resumption friction: Cannot glance at all task notes to decide where to resume

Cost example:

Terminal window
# Inefficient (if you need descriptions)
TaskList # Returns 10 tasks (no descriptions)
TaskGet(task-1), TaskGet(task-2), ..., TaskGet(task-10) # 10 additional calls
# Total: 11 API calls to review 10 tasks

Workaround patterns:

  1. Hybrid approach (Recommended):

    • Use Tasks API for status tracking and dependency coordination
    • Maintain markdown files in repo for detailed implementation plans
    • Example: docs/plans/auth-refactor.md + Tasks for status
  2. Subject-as-summary pattern:

    • Store critical info in subject field (always visible in TaskList)
    • Keep description for deep context (fetch on-demand with TaskGet)
    • Example subjects: "[P0] Fix login bug (src/auth.ts:45)" vs "Fix bug"
  3. Selective fetching:

    • Use TaskList to identify tasks needing attention (status, blockedBy)
    • Only call TaskGet for tasks you’re actively working on

Source: Community practitioner feedback (Gang Rui, Jan 2026)

Tool: TodoWrite - Creates task lists stored in session memory

Capabilities:

  • Simple task tracking within a single session
  • Status tracking: pending/in_progress/completed
  • Lost when session ends or context is compacted

When to use TodoWrite:

  • Single-session, straightforward implementations
  • Quick fixes or exploratory coding
  • Claude Code < v2.1.16
  • Prefer simplicity over persistence

Migration flag (v2.1.19+):

Terminal window
# Temporarily revert to TodoWrite system
CLAUDE_CODE_ENABLE_TASKS=false claude
# Use new Tasks API (default)
claude

Task hierarchy design:

Project (parent)
└── Feature A (child)
    ├── Component A1 (leaf task)
    │   ├── Implementation
    │   └── Tests (depends on Implementation)
    └── Component A2

Dependency management:

  • Always define dependencies when creating tasks
  • Use task IDs (not titles) for dependency references
  • Verify dependencies with TaskGet before execution

Status transitions:

  • Mark in_progress when starting work (prevents parallel execution)
  • Update frequently for visibility
  • Only mark completed when fully accomplished (tests passing, validated)
  • Use failed status with error metadata for debugging

Metadata conventions:

{
  "priority": "high|medium|low",
  "estimated_duration": "2h",
  "related_files": ["path/to/file.ts"],
  "related_issue": "https://github.com/org/repo/issues/123",
  "type": "feature|bugfix|refactor|test"
}
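If you enforce these conventions in tooling, an illustrative TypeScript type for the shape above (field names follow this guide's conventions, not an official schema):

```typescript
// Mirrors the metadata conventions above; all names are this
// guide's conventions, not an Anthropic-defined schema.
type TaskMetadata = {
  priority: "high" | "medium" | "low";
  estimated_duration: string;        // e.g. "2h"
  related_files: string[];
  related_issue?: string;            // optional tracker link
  type: "feature" | "bugfix" | "refactor" | "test";
};

const meta: TaskMetadata = {
  priority: "high",
  estimated_duration: "2h",
  related_files: ["src/auth/login.ts"],
  type: "feature",
};
```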

The Diagnostic Principle: When Claude’s task list doesn’t match your intent, the problem isn’t Claude—it’s your instructions.

Task lists act as a mirror for instruction clarity. If you ask Claude to plan a feature and the resulting tasks surprise you, that divergence is diagnostic information:

Your instruction: "Refactor the auth system"
Claude's task list:
- [ ] Read all auth-related files
- [ ] Identify code duplication
- [ ] Extract shared utilities
- [ ] Update imports
- [ ] Run tests
Your reaction: "That's not what I meant—I wanted to switch from session to JWT"
Diagnosis: Your instruction was ambiguous. "Refactor" ≠ "replace".

Divergence patterns and what they reveal:

| Divergence Type | What It Means | Fix |
| --- | --- | --- |
| Tasks too broad | Instructions lack specificity | Add WHAT, WHERE, HOW, VERIFY |
| Tasks too narrow | Instructions too detailed, missing big picture | State the goal, not just the steps |
| Wrong priorities | Context missing about what matters | Add constraints and priorities |
| Missing tasks | Implicit knowledge not shared | Make assumptions explicit in prompt |
| Extra tasks | Claude inferred requirements you didn't intend | Add explicit scope boundaries |

Using task divergence as a workflow:

## Step 1: Seed with loose instruction
User: "Improve the checkout flow"
## Step 2: Review Claude's task list (don't execute yet)
Claude generates: [task list]
## Step 3: Compare against your mental model
- Missing: payment retry logic? → Add to instructions
- Unexpected: UI redesign? → Clarify scope (backend only)
- Wrong order: tests last? → Specify TDD approach
## Step 4: Refine and re-plan
User: "Actually, here's what I need: [refined instruction with specifics]"

Pro tip: Run TaskList after initial planning as a sanity check before execution. If more than 30% of tasks surprise you, your prompt needs work. Iterate on the prompt, not the tasks.

→ See: Task Management Workflow for:

  • Task planning phase (decomposition, hierarchy design)
  • Task execution patterns
  • Session management and resumption
  • Integration with TDD and Plan-Driven workflows
  • TodoWrite migration guide
  • Patterns, anti-patterns, and troubleshooting

Claude Code operates within a 200K token context window (a 1M-token beta is available via API; see the 200K vs 1M comparison):

| Component | Approximate Size |
| --- | --- |
| System prompt | 5-15K tokens |
| CLAUDE.md files | 1-10K tokens |
| Conversation history | Variable |
| Tool results | Variable |
| Reserved for response | 40-45K tokens |

When context fills up (~75% in VS Code, ~95% in CLI), older content is automatically summarized. However, research shows this degrades quality (a 50-70% performance drop on complex tasks). Use /compact proactively at logical breakpoints, or trigger a session handoff at 85% to preserve intent over compressed history. See Session Handoffs and the Auto-Compaction Research.
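The thresholds this guide recommends (compact around 70%, clear around 90%) can be written as a tiny decision helper. This is illustrative only; the percentages are this guide's recommendations, not enforced by the tool:

```typescript
// Decide the next context action from token usage, per the guide's zones.
function contextAction(
  usedTokens: number,
  windowTokens = 200_000,
): "continue" | "/compact" | "/clear" {
  const pct = usedTokens / windowTokens;
  if (pct >= 0.9) return "/clear";   // 90%+: fresh start required
  if (pct >= 0.7) return "/compact"; // 70%+: compact now
  return "continue";                 // below 70%: work freely
}
```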

The Task tool spawns sub-agents with:

  • Their own fresh context window
  • Access to the same tools (except Task itself)
  • Maximum depth of 1 (cannot spawn sub-sub-agents)
  • Only their summary text returns to the main context

This prevents context pollution during exploratory tasks.

Status: Partially feature-flagged, progressive rollout in progress.

TeammateTool enables multi-agent orchestration with persistent communication between agents. Unlike standard sub-agents that work in isolation, teammates can coordinate through structured messaging.

Core Capabilities:

| Operation | Purpose |
| --- | --- |
| spawnTeam | Create a named team of agents |
| discoverTeams | List available teams |
| requestJoin | Agent requests to join a team |
| approveJoin | Team leader approves join requests |
| Messaging | JSON-based inter-agent communication |

Execution Backends (auto-detected):

  • In-process: Async tasks in same Node.js process (fastest)
  • tmux: Persistent terminal sessions (survives disconnects)
  • iTerm2: Visual split panes (macOS only)

Patterns:

Parallel Specialists Pattern:
Leader spawns 3 teammates → Each reviews different aspect (security, perf, architecture)
→ Teammates work concurrently → Report back to leader → Leader synthesizes

Swarm Pattern:
Leader creates shared task queue → Teammates self-organize and claim tasks
→ Independent execution → Async updates to shared state

Limitations:

  • 5-minute heartbeat timeout before auto-removal
  • Cannot cleanup teams while teammates are active
  • Feature flags not officially documented (community-discovered)
  • No official Anthropic support for experimental features

When to Use:

  • Large codebases requiring parallel analysis (4+ aspects)
  • Long-running workflows with independent sub-tasks
  • Code reviews with multiple specialized concerns

When NOT to Use:

  • Simple tasks (overhead not justified)
  • Sequential dependencies (standard sub-agents sufficient)
  • Production-critical workflows (experimental = unstable)


⚠️ Note: This is an experimental feature. Capabilities may change or be removed in future releases. Always verify current behavior with official documentation.

Agent Anti-Patterns: Roles vs Context Control

“Subagents are not for anthropomorphizing roles, they are for controlling context” - Dex Horthy

Common Mistake: Creating agents as if building a human team with job titles.

Wrong (Anthropomorphizing):

- Frontend Agent (role: UI developer)
- Backend Agent (role: API engineer)
- QA Agent (role: tester)
- Security Agent (role: security expert)

Why this fails: Agents aren’t humans with expertise areas. They’re context isolation tools for computational efficiency.

Right (Context Control):

- Agent for isolated dependency analysis (scope: package.json + lock files only)
- Agent for parallel file processing (scope: batch edits without main context pollution)
- Agent for fresh security audit (scope: security-focused analysis without prior assumptions)
- Agent for independent module testing (scope: test execution without interfering with main workflow)

Key differences:

| Anthropomorphizing (Wrong) | Context Control (Right) |
| --- | --- |
| "Security expert agent" | "Security audit with isolated context" |
| "Frontend developer agent" | "UI component analysis (scope: src/components/ only)" |
| "Code reviewer agent" | "PR review without main context pollution" |
| Mimics human team structure | Optimizes computational resources |
| Based on job roles | Based on scope/context boundaries |

When to use agents (good reasons):

  • Isolate context: Prevent pollution of main conversation context
  • Parallel processing: Independent operations that can run concurrently
  • Scope limitation: Restrict analysis to specific files/directories
  • Fresh perspective: Analyze without baggage from previous reasoning
  • Resource optimization: Offload heavy operations to separate context window

When NOT to use agents (bad reasons):

  • ❌ Creating a fake team with job titles
  • ❌ Roleplaying different “expertise” personas
  • ❌ Mimicking human organizational structure
  • ❌ Splitting work by discipline (frontend/backend/QA) instead of by context boundaries

Beyond generic sub-agents, scope-focused orchestration assigns distinct context boundaries to different agents for multi-perspective analysis.

The Pattern: Instead of one agent reviewing everything, spawn scope-isolated agents that each analyze distinct aspects with fresh context:

User: Review the new payment service using scope-focused analysis:
Agent 1 (Security Scope): Analyze authentication, input validation,
injection vectors, secret handling, PCI DSS compliance.
Context: src/payment/, src/auth/, config/security.yml
Agent 2 (Performance Scope): Analyze database queries, N+1 problems,
caching opportunities, response time bottlenecks.
Context: src/payment/repository/, src/database/, slow query logs
Agent 3 (API Design Scope): Analyze error messages, response format
consistency, API discoverability, documentation completeness.
Context: src/payment/api/, docs/api/, tests/integration/
Synthesize all three scoped analyses into a unified review with
prioritized action items.

Implementation with Custom Agents:

.claude/agents/security-audit.md
---
name: security-audit
model: opus
tools: Read, Grep, Glob
---
Analyze code for security issues with isolated context:
- OWASP Top 10 vulnerabilities
- Authentication/authorization flaws
- Input validation gaps
- Secret exposure risks
Scope: Security-focused analysis only. Report findings with severity
ratings (Critical/High/Medium/Low) without considering performance
or UX trade-offs.
.claude/agents/perf-audit.md
---
name: perf-audit
model: sonnet
tools: Read, Grep, Glob, Bash
---
Analyze code for performance bottlenecks with isolated context:
- Database query efficiency (N+1, missing indexes)
- Memory leaks and resource management
- Caching opportunities
- Algorithmic complexity issues
Scope: Performance-focused analysis only. Report findings with estimated
impact (High/Medium/Low) without considering security or maintainability
trade-offs.

When to use scope-focused agents:

  • Analysis requiring 3+ distinct context boundaries (security scope, perf scope, API scope)
  • Competing concerns that benefit from isolated evaluation (performance vs. security vs. DX)
  • Large codebases where full context would pollute analysis of specific aspects

When NOT to use scope-focused agents:

  • Simple reviews (one agent with full context covers all aspects)
  • Time-constrained situations (overhead of synthesis outweighs benefit)
  • Tasks where scopes aren’t genuinely independent (overlapping context needed)

“Do more with less. Smart architecture choices, better training efficiency, and focused problem-solving can compete with raw scale.” — Daniela Amodei, Anthropic President

Claude Code trusts the model’s reasoning instead of building complex orchestration systems. This means:

  • Fewer components = fewer failure modes
  • Model-driven decisions = better generalization
  • Simple loop = easy debugging
| Topic | Where |
| --- | --- |
| Full architecture details | Architecture & Internals Guide |
| Permission system | Section 7 - Hooks |
| MCP integration | Section 8.6 - MCP Security |
| Context management tips | Section 2.2 |

Quick jump: Memory Files (CLAUDE.md) · .claude/ Folder Structure · Settings & Permissions · Precedence Rules