
How Claude Code Works: Architecture & Internals


A technical deep-dive into Claude Code’s internal mechanisms, based on official Anthropic documentation and verified community analysis.

Author: Florian BRUNIAUX | Contributions from Claude (Anthropic)

Reading time: ~25 minutes (full) | ~5 minutes (TL;DR only)

Last verified: February 2026 (Claude Code v2.1.34)


This document combines three tiers of sources:

| Tier | Description | Confidence | Example |
| --- | --- | --- | --- |
| Tier 1 | Official Anthropic documentation | 100% | anthropic.com/engineering/* |
| Tier 2 | Verified reverse-engineering | 70-90% | PromptLayer analysis, code.claude.com behavior |
| Tier 3 | Community inference | 40-70% | Observed but not officially confirmed |

Each claim is marked with its confidence level. Always prefer official documentation when available.


  1. Simple Loop: Claude Code runs a while(tool_call) loop — no DAGs, no classifiers, no RAG. The model decides everything.

  2. Eight Core Tools: Bash (universal adapter), Read, Edit, Write, Grep, Glob, Task (sub-agents), TodoWrite. That’s the entire arsenal.

    Search Strategy Evolution: Early Claude Code versions experimented with RAG using Voyage embeddings for semantic code search. Anthropic switched to grep-based (ripgrep) agentic search after internal benchmarks showed superior performance with lower operational complexity — no index sync required, no security liabilities from external embedding providers. This “Search, Don’t Index” philosophy trades latency/tokens for simplicity/security. Community plugins (ast-grep for AST patterns) and MCP servers (Serena for symbols, grepai for RAG) are available for specialized needs.

    Source: Latent Space podcast (May 2025), ast-grep documentation

  3. 200K Token Budget: Context window shared between system prompt, history, tool results, and response buffer. Auto-compacts at ~75-92% capacity.

  4. Sub-agents = Isolation: The Task tool spawns sub-agents with their own context. They cannot spawn more sub-agents (depth=1). Only their summary returns.

  5. Philosophy: “Less scaffolding, more model” — trust Claude’s reasoning instead of building complex orchestration systems around it.


Before diving into the technical details, this diagram by Mohamed Ali Ben Salem captures the essential architecture:

Claude Code Architecture Overview

Source: Mohamed Ali Ben Salem on LinkedIn — Used with attribution

Key insight: Claude Code is NOT a new AI model — it’s an orchestration layer that connects Claude (Opus/Sonnet/Haiku) to your development environment through file editing, command execution, and repository navigation.


  1. The Master Loop
  2. The Tool Arsenal
  3. Context Management Internals
  4. Sub-Agent Architecture
  5. Permission & Security Model
  6. MCP Integration
  7. The Edit Tool: How It Actually Works
  8. Session Persistence
  9. Philosophy: Less Scaffolding, More Model
  10. Claude Code vs Alternatives
  11. Sources & References
  12. Appendix: What We Don’t Know

Confidence: 100% (Tier 1 - Official) Source: Anthropic Engineering Blog

At its core, Claude Code is remarkably simple:

┌─────────────────────────────────────────────────────────────┐
│ CLAUDE CODE MASTER LOOP │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ │
│ │ Your Prompt │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ CLAUDE REASONS │ │
│ │ (No classifier, no routing layer) │ │
│ │ │ │
│ └────────────────────────┬─────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ Tool Call? │ │
│ └───────┬────────┘ │
│ │ │
│ YES │ NO │
│ ┌─────────────────┴─────────────────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌────────────┐ ┌────────────┐ │
│ │ Execute │ │ Text │ │
│ │ Tool │ │ Response │ │
│ │ │ │ (DONE) │ │
│ └─────┬──────┘ └────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Feed Result │ │
│ │ to Claude │──────────────────┐ │
│ └─────────────┘ │ │
│ │ │
│ ▼ │
│ ┌────────────────┐ │
│ │ LOOP BACK │ │
│ │ (Next turn) │ │
│ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘

The entire architecture is a simple while loop:

while claude_response.has_tool_call:
    result = execute_tool(claude_response.tool_call)
    claude_response = send_to_claude(result)
return claude_response.text

There is no:

  • Intent classifier
  • Task router
  • RAG/embedding pipeline
  • DAG orchestrator
  • Planner/executor split

The model itself decides when to call tools, which tools to call, and when it’s done. This is the “agentic loop” pattern described in Anthropic’s engineering blog.

  1. Simplicity: Fewer components = fewer failure modes
  2. Model-driven: Claude’s reasoning is better than hand-coded heuristics
  3. Flexibility: No rigid pipeline constraining what Claude can do
  4. Debuggability: Easy to understand what happened and why
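The loop above can be made concrete as a runnable sketch with a stubbed model. `ModelTurn`, `execute_tool`, and the scripted responses are illustrative stand-ins, not Claude Code’s actual types; the real client wires `send_to_claude` to the Anthropic API.

```python
# Minimal agentic-loop sketch with a stubbed model (illustrative only).
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelTurn:
    text: str
    tool_call: Optional[dict] = None  # e.g. {"name": "Grep", "input": {...}}

    @property
    def has_tool_call(self) -> bool:
        return self.tool_call is not None

def execute_tool(tool_call: dict) -> str:
    # Stub: a real runner would dispatch to Bash/Read/Edit/... implementations.
    return f"ran {tool_call['name']}"

def run_loop(send_to_claude, prompt: str) -> str:
    turn = send_to_claude(prompt)
    while turn.has_tool_call:          # the entire orchestrator
        result = execute_tool(turn.tool_call)
        turn = send_to_claude(result)  # feed result back, next turn
    return turn.text                   # no tool call => final answer

# Scripted model: one tool call, then a final text response.
script = iter([
    ModelTurn(text="", tool_call={"name": "Grep", "input": {"pattern": "TODO"}}),
    ModelTurn(text="Found 3 TODOs."),
])
print(run_loop(lambda _msg: next(script), "find TODOs"))  # → Found 3 TODOs.
```

Everything else in the architecture (permissions, hooks, sub-agents) hangs off this single loop.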

Use this checklist to verify you understand Claude Code’s full surface area. Each capability is documented in detail elsewhere in this guide.

The 11 Native Capabilities:

  • Event Hooks — Bash/PowerShell scripts triggered on tool execution

    • PreToolUse, PostToolUse, UserPromptSubmit, Notification
    • See: Section 5 Hooks
  • Skill-Scoped Hooks — Event hooks specific to skill execution context

  • Background Agents — Async task execution (test suites, long operations)

  • Explore Subagent — /explore for codebase analysis

  • Plan Subagent — /plan for read-only planning mode

  • Task Tool — Hierarchical task delegation to specialized agents

  • Agent Teams — Multi-agent parallel coordination (experimental v2.1.32+)

  • Per-Task Model Selection — Dynamic model switching mid-session

  • MCP Protocol Integration — Model Context Protocol for tool extensions

  • Permission Modes — Fine-grained control over tool execution

  • Session Memory — Persistent context across sessions

Onboarding Tip: If you haven’t explored all 11 capabilities, you’re likely missing productivity opportunities. Focus on the unchecked items above.

Source: Synthesized from Gur Sannikov analysis


Confidence: 100% (Tier 1 - Official) Source: code.claude.com/docs

Claude Code has exactly 8 core tools:

| Tool | Purpose | Key Behavior | Token Cost |
| --- | --- | --- | --- |
| Bash | Execute shell commands | Universal adapter, most powerful | Low (command) + variable (output) |
| Read | Read file contents | Max 2000 lines, handles truncation | High for large files |
| Edit | Modify existing files | Diff-based, requires exact match | Medium |
| Write | Create/overwrite files | Must read first if file exists | Medium |
| Grep | Search file contents | Ripgrep-based (regex); replaced the RAG/embedding approach. For structural (AST-based) search, see the ast-grep plugin. Trade-off: Grep (fast, simple) vs ast-grep (precise, setup required) vs Serena MCP (semantic, symbol-aware) | Low |
| Glob | Find files by pattern | Path matching, sorted by mtime | Low |
| Task | Spawn sub-agents | Isolated context, depth=1 limit | High (new context) |
| TodoWrite | Track progress | Structured task management | Low |

Key insight: Bash is Claude’s swiss-army knife. It can:

  • Run any CLI tool (git, npm, docker, curl…)
  • Execute scripts
  • Chain commands with pipes
  • Access system state

The model has been trained on massive amounts of shell data, making it highly effective at using Bash as a universal adapter when specialized tools aren’t enough.

Claude decides which tool to use based on the task. There’s no hardcoded routing:

┌─────────────────────────────────────────────────────┐
│ TOOL SELECTION (Model-Driven) │
├─────────────────────────────────────────────────────┤
│ │
│ "Read auth.ts" → Read tool │
│ "Find all test files" → Glob tool │
│ "Search for TODO" → Grep tool │
│ "Run npm test" → Bash tool │
│ "Explore the codebase" → Task tool (sub-agent) │
│ "Track my progress" → TodoWrite tool │
│ │
│ The model learns these patterns during training, │
│ not from explicit rules. │
│ │
└─────────────────────────────────────────────────────┘

Beyond the 8 core tools, Claude Code can leverage:

MCP Servers (Model Context Protocol):

  • Serena: Symbol-aware code navigation + session memory
  • grepai: Semantic search + call graph analysis (Ollama-based)
  • Context7: Official library documentation lookup
  • Sequential: Structured multi-step reasoning
  • Playwright: Browser automation and E2E testing
  • claude-code-ultimate-guide: 12 tools — guide search, release tracking, compare_versions, security threat lookup (get_threat, list_threats with 28 CVEs + 655 malicious skills), template search (search_examples) — npx -y claude-code-ultimate-guide-mcp

Community Plugins:

  • ast-grep: AST-based structural code search (explicit invocation)

Claude Code offers multiple ways to search code, each with specific strengths:

| Search Need | Native Tool | MCP/Plugin Alternative | When to Escalate |
| --- | --- | --- | --- |
| Exact text | Grep (ripgrep) | - | Never (fastest) |
| Function name | Grep | Serena find_symbol | Multi-file refactoring |
| By meaning | - | grepai search | Don’t know exact text |
| Call graph | - | grepai trace_callers | Dependency analysis |
| Structural pattern | - | ast-grep | Large migrations (>50k lines) |
| File structure | - | Serena get_symbols_overview | Need symbol context |

Performance Comparison:

| Tool | Speed | Setup | Use Case |
| --- | --- | --- | --- |
| Grep (ripgrep) | ⚡ ~20ms | ✅ None | 90% of searches |
| Serena | ⚡ ~100ms | ⚠️ MCP | Refactoring, symbols |
| grepai | 🐢 ~500ms | ⚠️ Ollama + MCP | Semantic, call graph |
| ast-grep | 🕐 ~200ms | ⚠️ Plugin | AST patterns, migrations |

Decision principle: Start with Grep (fastest), escalate to specialized tools only when needed.

📖 Deep Dive: See Search Tools Mastery for comprehensive workflows combining all search tools.
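The escalation logic in the tables above reduces to a lookup with a safe default. The function and its keys below are illustrative names for this guide, not a real API:

```python
# Decision-table sketch for search-tool escalation (illustrative names).
def pick_search_tool(need: str) -> str:
    table = {
        "exact text": "Grep",
        "function name": "Grep",          # escalate to Serena find_symbol for multi-file refactors
        "by meaning": "grepai search",
        "call graph": "grepai trace_callers",
        "structural pattern": "ast-grep",
        "file structure": "Serena get_symbols_overview",
    }
    return table.get(need.lower(), "Grep")  # default: start with Grep, escalate later

print(pick_search_tool("exact text"))   # → Grep
print(pick_search_tool("by meaning"))   # → grepai search
```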


Confidence: 80% (Tier 2 - Partially Official) Sources:

Claude Code operates within a fixed context window (~200K tokens, varies by model).

┌─────────────────────────────────────────────────────────────┐
│ CONTEXT BUDGET (~200K tokens) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ System Prompt (~5-15K) │ │
│ │ • Tool definitions │ │
│ │ • Safety instructions │ │
│ │ • Behavioral guidelines │ │
│ │ • See detailed breakdown below ↓ │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ CLAUDE.md Files (~1-10K) │ │
│ │ • Global ~/.claude/CLAUDE.md │ │
│ │ • Project /CLAUDE.md │ │
│ │ • Local /.claude/CLAUDE.md │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ Conversation History (variable) │ │
│ │ • Your prompts │ │
│ │ • Claude's responses │ │
│ │ • Tool call records │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ Tool Results (variable) │ │
│ │ • File contents from Read │ │
│ │ • Command outputs from Bash │ │
│ │ • Search results from Grep │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ Reserved for Response (~40-45K) │ │
│ │ • Claude's thinking │ │
│ │ • Generated code/text │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ USABLE = Total - System - Reserved ≈ 140-150K tokens │
│ │
└─────────────────────────────────────────────────────────────┘
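A quick arithmetic check of the budget formula in the diagram, using the upper-end figures (all values are approximations from the diagram above):

```python
# USABLE = Total - System - Reserved, with the upper-end estimates above.
TOTAL = 200_000
SYSTEM = 15_000     # upper end of the ~5-15K system prompt
RESERVED = 45_000   # upper end of the ~40-45K response buffer

usable = TOTAL - SYSTEM - RESERVED
print(usable)  # → 140000, the low end of the 140-150K range
```

CLAUDE.md files and conversation history then eat into that usable budget as the session grows.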

Confidence: 100% (Tier 1 - Official Anthropic Documentation) Sources:

Claude system prompts (~5-15K tokens) are published publicly by Anthropic as part of its transparency commitment. These prompts define:

Core Components:

  • Tool definitions: Bash, Read, Edit, Write, Grep, Glob, Task, TodoWrite
  • Safety instructions: Content policies, refusal patterns (see Security Hardening)
  • Behavioral guidelines: Task-first approach, MVP-first, no over-engineering
  • Context instructions: How to gather and use project context

Important Distinctions:

  • Claude.ai/Mobile: Published prompts available publicly
  • Anthropic API: Different default instructions, configurable by developers
  • Claude Code CLI: Agentic coding assistant with context-gathering behavior

Community Analysis (for deeper understanding):

Cross-reference: For security implications, see Section 5: Permission & Security Model

Note: Claude Code system prompts may differ from Claude.ai/mobile versions. The above sources cover the Claude family; Code-specific prompts are integrated into the CLI tool’s behavior.


Confidence: 75% (Tier 2 - Community-verified with research backing)

When context usage exceeds a threshold, Claude Code automatically summarizes older conversation turns:

| Source | Reported Threshold | Notes |
| --- | --- | --- |
| VS Code extension | ~75% usage (25% remaining) | GitHub #11819 (Nov 2025) |
| CLI version | 1-5% remaining | More conservative than VS Code |
| PromptLayer analysis | 92% | Historical observation |
| Steve Kinney | 95% | Session Management Guide (Jul 2025) |
| User-triggered /compact | Anytime | Manual control |

What happens during compaction:

  1. Older conversation turns are summarized
  2. Tool results are condensed
  3. Recent context is preserved in full
  4. The model receives a “context was compacted” signal

Performance Impact (Research-backed):

Recent research and practitioner observations confirm quality degradation with auto-compaction:

  • LLM performance drops 50-70% on complex tasks as context grows from 1K to 32K tokens (Context Rot Research, Jul 2025)
  • 11 out of 12 models fall below 50% of their short-context performance at 32K tokens (NoLiMa benchmark)
  • Auto-compact loses nuance and breaks references through repeated compression cycles (Claude Saves Tokens, Forgets Everything, Jan 2026)
  • Attention mechanism struggles with retrieval burden in high-context scenarios

Community Consensus: Manual /compact at logical breakpoints > waiting for auto-compact to trigger.

Recommended Strategy (Lorenz, 2026):

| Context % | Action | Rationale |
| --- | --- | --- |
| 70% | Warning: plan cleanup | Early awareness |
| 85% | Manual handoff recommended | Prevent auto-compact degradation |
| 95% | Force handoff | Severe quality degradation |

User control: Use /compact manually to trigger summarization at logical breakpoints, or use session handoffs (see Session Handoffs) to preserve intent over compressed history.
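The recommended thresholds can be expressed as a simple lookup; the function name and return strings below are illustrative, not Claude Code behavior:

```python
# The Lorenz (2026) thresholds as a lookup; input is context usage in percent.
def compaction_action(context_pct: float) -> str:
    if context_pct >= 95:
        return "force handoff"                # severe quality degradation
    if context_pct >= 85:
        return "manual handoff recommended"   # beat auto-compact to it
    if context_pct >= 70:
        return "warning: plan cleanup"        # early awareness
    return "continue"

print(compaction_action(72))  # → warning: plan cleanup
print(compaction_action(90))  # → manual handoff recommended
```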

| Strategy | When to Use | How |
| --- | --- | --- |
| Sub-agents | Exploratory tasks | Task tool for isolated search |
| Manual compact | Proactive cleanup | /compact command |
| Clear session | Fresh start needed | /clear command |
| Specific reads | Know what you need | Read exact files, not directories |
| CLAUDE.md | Persistent context | Store conventions in memory files |

Confidence: 70% (Tier 2 - Practitioner studies, arXiv research)

Claude Code’s effectiveness degrades predictably under certain conditions:

| Condition | Observed Threshold | Symptom |
| --- | --- | --- |
| Conversation turns | 15-25 turns | Loses track of earlier constraints |
| Token accumulation | 80-100K tokens | Ignores requirements stated early in session |
| Problem scope | >5 files simultaneously | Inconsistent changes, missed files |

Success rates by scope (from practitioner studies):

| Scope | Success Rate | Example |
| --- | --- | --- |
| 1-3 files | ~85% | Fix bug in single module |
| 4-7 files | ~60% | Refactor feature across components |
| 8+ files | ~40% | Codebase-wide changes |

Mitigation strategies:

  1. Checkpoint prompts: “Before continuing, recap the current requirements and constraints.”
  2. Session resets: Start fresh for new tasks (/clear)
  3. Scope tightly: Break large tasks into focused sub-tasks
  4. Use sub-agents: Delegate exploration to Task tool to preserve main context

Confidence: 100% (Tier 1 - Documented behavior) Source: code.claude.com/docs + System prompt (visible in tool definitions)

The Task tool spawns sub-agents for parallel or isolated work.

┌─────────────────────────────────────────────────────────────┐
│ MAIN AGENT │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ Context: Full conversation + all file reads │ │
│ │ │ │
│ │ Task("Explore authentication patterns") │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────┐ │ │
│ │ │ SUB-AGENT (Spawned) │ │ │
│ │ │ │ │ │
│ │ │ • Own fresh context window │ │ │
│ │ │ • Receives: task description only │ │ │
│ │ │ • Has access to: same tools (except Task) │ │ │
│ │ │ • CANNOT spawn sub-sub-agents (depth = 1) │ │ │
│ │ │ • Returns: summary text only │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Result: "Found 3 auth patterns: JWT in..." │ │
│ │ (Only this text enters main context) │ │
│ │ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘

Limiting sub-agents to one level prevents:

  1. Recursive explosion: Agent-ception would consume infinite resources
  2. Context pollution: Each level would accumulate context
  3. Debugging nightmares: Tracking multi-level agent chains is hard
  4. Unpredictable costs: Nested agents = unpredictable token usage

Claude Code offers specialized sub-agent types via the subagent_type parameter:

| Type | Purpose | Tools Available |
| --- | --- | --- |
| Explore | Codebase exploration | All read-only tools |
| Plan | Architecture planning | All except Edit/Write |
| Bash | Command execution | Bash only |
| general-purpose | Complex multi-step | All tools |

| Use Case | Why Sub-Agent Helps |
| --- | --- |
| Searching large codebases | Keeps main context clean |
| Parallel exploration | Multiple searches simultaneously |
| Risky exploration | Errors don’t pollute main context |
| Specialized analysis | Different “mindset” for different tasks |
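A sketch of the two rules the diagram emphasizes: a sub-agent returns only a summary, and it cannot spawn further sub-agents (depth=1). The input fields (`description`, `subagent_type`) mirror the text above but are not an exact schema:

```python
# Illustrative sketch of Task-tool semantics, not Claude Code's implementation.
def spawn_subagent(task: dict, depth: int = 0) -> str:
    if depth >= 1:
        # Sub-agents get the same tools except Task: no sub-sub-agents.
        raise RuntimeError("Task tool unavailable at depth 1")
    # ... fresh context window, run the agentic loop with the task prompt ...
    # Only a summary string returns to the parent's context:
    return f"summary of: {task['description']}"

print(spawn_subagent({"description": "Explore authentication patterns",
                      "subagent_type": "Explore"}))
# → summary of: Explore authentication patterns
```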

Confidence: 100% (Tier 1 - Official) Sources:

Claude Code has a layered security model:

┌─────────────────────────────────────────────────────────────┐
│ PERMISSION LAYERS │
├─────────────────────────────────────────────────────────────┤
│ │
│ Layer 1: INTERACTIVE PROMPTS │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Claude wants to run: rm -rf node_modules │ │
│ │ [Allow once] [Allow always] [Deny] [Edit command] │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Layer 2: ALLOW/DENY RULES (settings.json) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ { │ │
│ │ "permissions": { │ │
│ │ "allow": ["Bash(npm *)", "Read"], │ │
│ │ "deny": ["Bash(rm -rf *)"] │ │
│ │ } │ │
│ │ } │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Layer 3: HOOKS (Pre/Post execution) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ PreToolUse: Validate before execution │ │
│ │ PostToolUse: Audit after execution │ │
│ │ PermissionRequest: Override permission prompts │ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Layer 4: SANDBOX MODE (Optional isolation) │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Filesystem isolation + Network restrictions │ │
│ └────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
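A hypothetical evaluator for the `Tool(pattern)` rule style shown in Layer 2. The real matching semantics are internal to Claude Code; this only illustrates deny-before-allow evaluation with glob patterns:

```python
# Hypothetical allow/deny evaluation sketch; not Claude Code's actual matcher.
from fnmatch import fnmatch

def rule_matches(rule: str, tool: str, arg: str = "") -> bool:
    if "(" in rule:
        rtool, pattern = rule[:-1].split("(", 1)   # "Bash(npm *)" -> "Bash", "npm *"
        return rtool == tool and fnmatch(arg, pattern)
    return rule == tool  # bare tool name matches any invocation

def decide(tool: str, arg: str, allow: list, deny: list) -> str:
    if any(rule_matches(r, tool, arg) for r in deny):
        return "deny"
    if any(rule_matches(r, tool, arg) for r in allow):
        return "allow"
    return "ask"  # fall back to an interactive prompt (Layer 1)

allow = ["Bash(npm *)", "Read"]
deny = ["Bash(rm -rf *)"]
print(decide("Bash", "npm test", allow, deny))    # → allow
print(decide("Bash", "rm -rf /tmp", allow, deny)) # → deny
print(decide("Bash", "git status", allow, deny))  # → ask
```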

Confidence: 80% (Tier 2 - Observed but not exhaustive)

Claude Code appears to flag certain patterns for extra scrutiny:

| Pattern | Risk | Behavior |
| --- | --- | --- |
| rm -rf | Destructive deletion | Always prompts |
| sudo | Privilege escalation | Always prompts |
| curl \| sh | Remote code execution | Always prompts |
| chmod 777 | Insecure permissions | Always prompts |
| git push --force | History destruction | Always prompts |
| DROP TABLE | Data destruction | Always prompts |

This is not a complete blocklist — patterns are likely detected through model training rather than explicit rules.

Confidence: 100% (Tier 1 - Official) Source: code.claude.com/docs/en/sandboxing

Claude Code includes built-in native sandboxing using OS-level primitives for process-level isolation:

┌──────────────────────────────────────────────────────┐
│ Native Sandbox Architecture │
├──────────────────────────────────────────────────────┤
│ │
│ Bash Command Request │
│ │ │
│ ▼ │
│ Sandbox Wrapper (Seatbelt/bubblewrap) │
│ │ │
│ ├─ Filesystem: read all, write CWD only │
│ ├─ Network: SOCKS5 proxy + domain filtering │
│ ├─ Process: isolated environment │
│ │ │
│ ▼ │
│ OS Kernel Enforcement │
│ │ │
│ ├─ Allowed: operations within boundaries │
│ ├─ Blocked: violations at system call level │
│ └─ Notify: user receives alert on violation │
│ │
└──────────────────────────────────────────────────────┘

OS Primitives:

| Platform | Mechanism | Notes |
| --- | --- | --- |
| macOS | Seatbelt (TrustedBSD MAC) | Built-in, kernel-level system call filtering |
| Linux/WSL2 | bubblewrap (namespaces + seccomp) | Requires: sudo apt-get install bubblewrap socat |
| WSL1 | ❌ Not supported | bubblewrap needs kernel features unavailable in WSL1 |
| Windows | ⏳ Planned | Not yet available |

Isolation Model:

  1. Filesystem:

    • Read: Entire computer (except denied paths)
    • Write: Current working directory only (configurable)
    • Blocked: Modifications outside CWD, credentials directories (~/.ssh, ~/.aws)
  2. Network:

    • Proxy: All connections routed through SOCKS5 proxy
    • Domain filtering: Allowlist/denylist mode
    • Default blocked: Private CIDRs, localhost ranges
  3. Process:

    • Shared kernel: Vulnerable to kernel exploits (unlike Docker microVM)
    • Child processes: Inherit same sandbox restrictions
    • Escape hatch: dangerouslyDisableSandbox parameter for incompatible tools

Sandbox Modes:

  • Auto-allow mode: Bash commands auto-approved if sandboxed (recommended for daily dev)
  • Regular permissions mode: All commands require explicit approval (high-security)

Security Trade-offs:

| Aspect | Native Sandbox | Docker Sandboxes (microVM) |
| --- | --- | --- |
| Kernel isolation | ❌ Shared kernel | ✅ Separate kernel per VM |
| Setup | 0 deps (macOS), 2 pkgs (Linux) | Docker Desktop 4.58+ |
| Overhead | Minimal (~1-3% CPU) | Moderate (~5-10% CPU) |
| Use case | Daily dev, trusted code | Untrusted code, max security |

Security Limitations:

  • ⚠️ Domain fronting: CDNs (Cloudflare, Akamai) can bypass domain filtering
  • ⚠️ Unix sockets: A misconfigured allowUnixSockets grants privilege escalation
  • ⚠️ Filesystem: Overly broad write permissions enable attacks on $PATH directories

When to use:

  • Native Sandbox: Daily development, trusted team, lightweight setup
  • Docker Sandboxes: Untrusted code, kernel exploit protection, Docker-in-Docker needed

Deep dive: See Native Sandboxing Guide for complete technical reference, configuration examples, and troubleshooting.

Hooks allow programmatic control over Claude’s actions:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{
          "type": "command",
          "command": "/path/to/validate-command.sh"
        }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "*",
        "hooks": [{
          "type": "command",
          "command": "/path/to/audit-log.sh"
        }]
      }
    ]
  }
}

Hook capabilities:

| Capability | Supported | How |
| --- | --- | --- |
| Block execution | Yes | Exit code 2 |
| Modify parameters | Yes | Return modified JSON |
| Log actions | Yes | Write to file in hook |
| Async processing | Yes | Set async: true in hook config (v2.1.0+) |

Hook JSON payload (passed via stdin):

{
  "session_id": "abc123",
  "transcript_path": "/home/user/.claude/projects/.../transcript.jsonl",
  "cwd": "/path/to/project",
  "permission_mode": "default",
  "hook_event_name": "PreToolUse",
  "tool_name": "Bash",
  "tool_input": {
    "command": "npm install lodash"
  }
}

Common fields sent to all events: session_id, transcript_path, cwd, permission_mode, hook_event_name. Event-specific fields (e.g., tool_name/tool_input for PreToolUse) are added on top.
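Since a hook is just an executable, a PreToolUse validator can be sketched in Python: it reads the JSON payload above from stdin and blocks the call by exiting with code 2 (per the capability table). The blocked-pattern list here is illustrative, not Claude Code’s own rules:

```python
# Sketch of a PreToolUse hook command (illustrative pattern list).
import json, re, sys

BLOCKED = [r"rm\s+-rf", r"\bsudo\b", r"curl\s+[^|]*\|\s*sh"]

def check(payload: dict) -> int:
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED:
        if re.search(pattern, command):
            print(f"blocked by hook: {pattern}", file=sys.stderr)
            return 2   # exit 2 => block execution
    return 0           # exit 0 => allow

# As a hook entry point: sys.exit(check(json.load(sys.stdin)))
print(check({"tool_input": {"command": "sudo rm -rf /"}}))  # → 2
```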

Cross-reference: See Section 7 - Hooks in the main guide for complete examples.


Confidence: 100% (Tier 1 - Official) Source: code.claude.com/docs/en/mcp

MCP (Model Context Protocol) servers extend Claude Code with additional tools.

💡 Visual Guide: The following diagram illustrates how MCP creates a secure control layer between LLMs and real systems. The LLM layer has no direct data access - the MCP Server enforces security policies before tools can interact with databases, APIs, or files.

MCP Architecture - 7-Layer Security Model

Figure 1: MCP Architecture showing separation between thinking (LLM), control (MCP Server), and execution (Tools). Design inspired by Dinesh Kumar’s LinkedIn visualization, recreated under Apache-2.0 license.

Key security boundaries:

  • Yellow layer (LLM): Reasoning only - No Data Access
  • Orange layer (MCP Server): Security control point (policies, validation, logs)
  • Grey layer (Real Systems): Protected data - Hidden From AI
┌─────────────────────────────────────────────────────────────┐
│ MCP INTEGRATION │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ CLAUDE CODE │ │
│ │ │ │
│ │ Native Tools MCP Tools │ │
│ │ ┌─────────┐ ┌─────────────────────────┐ │ │
│ │ │ Bash │ │ mcp__serena__* │ │ │
│ │ │ Read │ │ mcp__context7__* │ │ │
│ │ │ Edit │ │ mcp__playwright__* │ │ │
│ │ │ ... │ │ mcp__custom__* │ │ │
│ │ └─────────┘ └───────────┬─────────────┘ │ │
│ │ │ │ │
│ └──────────────────────────────────┼──────────────────┘ │
│ │ │
│ JSON-RPC 2.0 │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ MCP SERVER │ │
│ │ │ │
│ │ stdio/HTTP transport │ │
│ │ Tool definitions (JSON Schema) │ │
│ │ Tool implementations │ │
│ │ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
| Aspect | Behavior |
| --- | --- |
| Protocol | JSON-RPC 2.0 over stdio or HTTP |
| Tool naming | mcp__&lt;server&gt;__&lt;tool&gt; convention |
| Context sharing | Only via tool parameters and return values |
| Lifecycle | Server starts on first use, stays alive during session |
| Permissions | Same system as native tools |

What MCP servers cannot do:

| Limitation | Explanation |
| --- | --- |
| Access conversation history | Only sees tool params, not full context |
| Maintain state across calls | Each call is independent (unless server implements caching) |
| Modify Claude’s system prompt | Tools only, no prompt injection |
| Bypass permissions | Same security layer as native tools |
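The mcp__&lt;server&gt;__&lt;tool&gt; naming convention can be parsed mechanically. The parser below is a sketch for this guide, not part of any SDK:

```python
# Sketch: split an MCP-namespaced tool name into (server, tool).
def parse_mcp_tool(name: str):
    if not name.startswith("mcp__"):
        return None                    # native tool (Bash, Read, ...)
    _, server, tool = name.split("__", 2)
    return server, tool

print(parse_mcp_tool("mcp__serena__find_symbol"))  # → ('serena', 'find_symbol')
print(parse_mcp_tool("Bash"))                      # → None
```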

Cross-reference: See Section 8.6 - MCP Security for security considerations.

Status: Stable (January 26, 2026) | Spec: SEP-1865 on GitHub | Co-authored by: OpenAI, Anthropic, MCP-UI creators

MCP Apps is the first official extension to the Model Context Protocol, enabling MCP servers to deliver interactive user interfaces alongside traditional tool responses.

The problem solved: Traditional text-based responses create friction for workflows requiring exploration. Each interaction (sort, filter, drill-down) demands a new prompt cycle. MCP Apps eliminates this “context gap” by rendering interactive UIs directly in the conversation.

Two core primitives:

  1. Tools with UI metadata:

    {
      "name": "query_database",
      "description": "Query customer database",
      "_meta": {
        "ui": {
          "resourceUri": "ui://dashboard/customers"
        }
      }
    }
  2. UI Resources (ui:// scheme):

    • Server-side HTML/JavaScript bundles
    • Rendered in sandboxed iframes by host
    • Bidirectional JSON-RPC communication via postMessage

Communication flow:

┌─────────────────────────────────────────────────────────┐
│ MCP APPS ARCHITECTURE │
├─────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ MCP Client │◄───────►│ MCP Server │ │
│ │ (Claude/IDE) │ JSON-RPC│ (Your App) │ │
│ └──────┬───────┘ └──────────────┘ │
│ │ │
│ │ Fetches ui:// resource │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Sandboxed Iframe (UI Render) │ │
│ │ ┌───────────────────────────────────┐ │ │
│ │ │ HTML/JS Bundle from Server │ │ │
│ │ │ - Interactive dashboard │ │ │
│ │ │ - Forms with validation │ │ │
│ │ │ - Real-time data visualization │ │ │
│ │ └───────────────────────────────────┘ │ │
│ │ │ │
│ │ postMessage ◄─────► JSON-RPC │ │
│ └─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘

Multi-layered protection:

| Layer | Protection |
| --- | --- |
| Iframe sandbox | Restricted permissions (no direct system access) |
| Pre-declared templates | Hosts review HTML/JS before rendering |
| Auditable messaging | All UI-to-host communication via JSON-RPC logs |
| User consent | Optional requirement for UI-initiated tool calls |
| Content blocking | Hosts can reject suspicious resources pre-render |

Cross-reference: See Section 8.6 - MCP Security for broader MCP security considerations.

Installation:

npm install @modelcontextprotocol/ext-apps

Core API (framework-agnostic):

import { App } from '@modelcontextprotocol/ext-apps';

const app = new App();

// 1. Establish communication with host
await app.connect();

// 2. Receive tool results from host
app.ontoolresult = (result) => {
  // Update UI with tool execution results
  updateDashboard(result.data);
};

// 3. Call server tools from UI
await app.callServerTool('fetch_analytics', {
  timeRange: '7d',
  metrics: ['users', 'revenue']
});

// 4. Update model context asynchronously
await app.updateModelContext({
  selectedFilters: ['region:EU', 'status:active']
});

// Additional capabilities:
app.logDebug('User action', { filter: 'applied' });
app.openBrowserLink('https://docs.example.com');
app.sendFollowUpMessage('Applied filters: EU, Active');

Standard communication: All features operate over postMessage (no framework lock-in).

| Platform | MCP Apps Support | Notes |
| --- | --- | --- |
| Claude Desktop | ✅ Available now | claude.ai/directory (Pro/Max/Team/Enterprise) |
| Claude Cowork | 🔄 Coming soon | Agentic workflow integration planned |
| VS Code | ✅ Insiders build | Official blog post |
| ChatGPT | 🔄 Rolling out | Week of Jan 26, 2026 |
| Goose | ✅ Available now | Open-source CLI with UI support |
| Claude Code CLI | ❌ N/A | Terminal text-only (no iframe rendering) |

Direct usage: None (CLI is text-only, cannot render iframes)

Indirect benefits:

  1. Ecosystem understanding: MCP Apps represents the future of agentic workflows
  2. MCP server development: If building custom MCP servers, Apps is now a design option
  3. Hybrid workflows:
    • Use Claude Desktop to explore data with Apps (dashboards, visualizations)
    • Switch to Claude Code CLI for implementation (scripting, automation)
  4. Context for configuration: MCP servers may advertise UI capabilities in metadata

Official example servers (in ext-apps repository):

  • threejs-server: 3D visualization and manipulation
  • map-server: Interactive geographic data exploration
  • pdf-server: Document viewing with inline highlights
  • system-monitor-server: Real-time metrics dashboards
  • sheet-music-server: Music notation rendering

Production adoption (January 2026):

| Tool | Provider | Capabilities |
| --- | --- | --- |
| Asana | Asana | Project timelines, task boards |
| Slack | Salesforce | Message drafting with formatting preview |
| Figma | Figma | Flowcharts, Gantt charts in FigJam |
| Amplitude | Amplitude | Analytics charts with interactive filtering |
| Box | Box | File search, document previews |
| Canva | Canva | Presentation design with real-time customization |
| Clay | Clay | Company research, contact discovery |
| Hex | Hex | Data analysis with interactive queries |
| monday.com | monday.com | Work management boards |

Coming soon: Salesforce (Agentforce 360)

MCP Apps standardizes patterns pioneered by:

  • MCP-UI: Early UI extension for MCP (community project)
  • OpenAI Apps SDK: Parallel effort for interactive tools

Both frameworks continue to be supported. MCP Apps provides a unified specification (SEP-1865) co-authored by maintainers from both ecosystems plus Anthropic and OpenAI.

Migration path: Straightforward for existing MCP-UI and Apps SDK implementations.

Decision tree for MCP server developers:

Building a custom MCP server?
├─ Users need to SELECT from 50+ options? → MCP Apps (dropdown, multi-select UI)
├─ Users need to VISUALIZE data patterns? → MCP Apps (charts, maps, graphs)
├─ Users need MULTI-STEP workflows with conditional logic? → MCP Apps (wizard forms)
├─ Users need REAL-TIME updates? → MCP Apps (live dashboards)
└─ Simple data retrieval or actions only? → Traditional MCP tools (sufficient)

Trade-off: UI complexity and implementation effort vs. user experience improvement.


Confidence: 100% (Tier 1 - Official) Source: anthropic.com/engineering/advanced-tool-use

Since v2.1.7 (January 2026), Claude Code uses lazy loading for MCP tool definitions instead of preloading all tools into context. This is powered by Anthropic’s Advanced Tool Use API feature.

The problem solved:

  • MCP tool definitions consume significant context (e.g., GitHub MCP alone: ~46K tokens for 93 tools)
  • Developer Scott Spence documented 66,000+ tokens consumed before typing a single prompt
  • This “context pollution” limited practical MCP adoption

How Tool Search works:

┌─────────────────────────────────────────────────────────────┐
│ MCP TOOL SEARCH FLOW                                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ WITHOUT Tool Search (eager loading):                        │
│ ┌────────────────────────────────────────────────────────┐  │
│ │ All 100+ tool definitions loaded upfront (~55K tokens) │  │
│ └────────────────────────────────────────────────────────┘  │
│                                                             │
│ WITH Tool Search (lazy loading):                            │
│ ┌────────────────────────────────────────────────────────┐  │
│ │ Step 1: Only search tool loaded (~500 tokens)          │  │
│ │ Step 2: Claude determines needed capability            │  │
│ │ Step 3: Tool Search finds matching tools (regex/BM25)  │  │
│ │ Step 4: Only matched tools loaded (~600 tokens each)   │  │
│ │ Step 5: Tool invoked normally                          │  │
│ └────────────────────────────────────────────────────────┘  │
│                                                             │
│ Result: 55K tokens → ~8.7K tokens (85% reduction)           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
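The lazy-loading flow can be sketched in a few lines. Everything here is hypothetical: the registry, the tool names, and the keyword matcher (a crude stand-in for the real regex/BM25 search) are invented for illustration.

```python
import re

# Hypothetical MCP tool registry; real definitions cost ~600 tokens each.
TOOL_REGISTRY = {
    "github_create_issue": "Create a GitHub issue in a repository",
    "github_list_prs": "List open pull requests for a repository",
    "slack_post_message": "Post a message to a Slack channel",
}

def keywords(text: str) -> set[str]:
    # Lowercased words longer than two characters, to skip "a", "to", etc.
    return {w for w in re.findall(r"\w+", text.lower()) if len(w) > 2}

def tool_search(query: str) -> list[str]:
    """Step 3: match tools by keyword overlap (stand-in for regex/BM25)."""
    q = keywords(query)
    return [name for name, desc in TOOL_REGISTRY.items()
            if q & keywords(name + " " + desc)]

# Steps 1-4: only the search tool sits in context; matching definitions
# are loaded on demand instead of all 100+ upfront.
matches = tool_search("open a new issue on GitHub")
loaded = {name: TOOL_REGISTRY[name] for name in matches}
print(sorted(loaded))   # the GitHub tools match; the Slack tool stays unloaded
```

The point of the sketch is the shape of the flow, not the matcher: swapping the keyword overlap for BM25 ranking changes relevance, not the token economics.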

Measured improvements (Anthropic benchmarks):

| Metric | Before | After | Improvement |
|---|---|---|---|
| Token overhead (5-server setup) | ~55K | ~8.7K | 85% reduction |
| Opus 4 tool selection accuracy | 49% | 74% | +25 points |
| Opus 4.5 tool selection accuracy | 79.5% | 88.1% | +8.6 points |
| Opus 4.6 adaptive thinking | N/A | Auto-calibrated | Dynamic depth |

Configuration (v2.1.9+):

```shell
# Environment variable
ENABLE_TOOL_SEARCH=auto      # Default (10% context threshold)
ENABLE_TOOL_SEARCH=auto:5    # Aggressive (5% threshold)
ENABLE_TOOL_SEARCH=auto:20   # Conservative (20% threshold)
ENABLE_TOOL_SEARCH=true     # Always enabled
ENABLE_TOOL_SEARCH=false    # Disabled (eager loading)
```

| Threshold | Recommended for |
|---|---|
| auto:20 | Lightweight setups (5-10 tools) |
| auto:10 | Balanced default (20-50 tools) |
| auto:5 | Power users (100+ tools) |

→ As Simon Willison noted: “Context pollution is why I rarely used MCP. Now that it’s solved, there’s no reason not to hook up dozens or even hundreds of MCPs to Claude Code.” — X/Twitter, January 14, 2026


Confidence: 90% (Tier 2 - Verified through behavior) Sources:

The Edit tool is more sophisticated than it appears.

┌─────────────────────────────────────────────────────────────┐
│ EDIT TOOL FLOW                                              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ Input: old_string, new_string, file_path                    │
│                                                             │
│        ┌──────────────────────────────────┐                 │
│        │ Step 1: EXACT MATCH              │                 │
│        │ Search for literal old_string    │                 │
│        └────────────────┬─────────────────┘                 │
│                         │                                   │
│          Found? ────────┴──────── Not found?                │
│            │                         │                      │
│            ▼                         ▼                      │
│      ┌──────────┐           ┌──────────────────┐            │
│      │ REPLACE  │           │ Step 2: FUZZY    │            │
│      │ (done)   │           │ MATCH            │            │
│      └──────────┘           └────────┬─────────┘            │
│                                      │                      │
│                 Found? ──────────────┴──────── Not found?   │
│                   │                               │         │
│                   ▼                               ▼         │
│             ┌──────────┐                  ┌──────────────┐  │
│             │ REPLACE  │                  │ ERROR        │  │
│             │ + WARN   │                  │ (mismatch)   │  │
│             └──────────┘                  └──────────────┘  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

When exact match fails, the Edit tool attempts:

  1. Whitespace normalization: Ignore trailing spaces, normalize indentation
  2. Line ending normalization: Handle CRLF vs LF differences
  3. Context expansion: Use surrounding lines to locate the right spot

If fuzzy matching also fails, the tool returns an error asking Claude to verify the old_string.

Before applying changes, the Edit tool runs these validation checks:

| Check | Purpose |
|---|---|
| File exists | Prevent creating files via Edit |
| old_string found | Ensure we’re editing the right place |
| Single match | old_string must be unique (or use replace_all) |
| New content differs | Prevent no-op edits |

| Error | Cause | Claude’s Response |
|---|---|---|
| “old_string not found” | Content changed since last read | Re-reads file, tries again |
| “Multiple matches” | old_string isn’t unique | Uses more context lines |
| “File not found” | Wrong path | Searches for correct path |

Confidence: 100% (Tier 1 - Official) Source: code.claude.com/docs

Sessions can be resumed across terminal sessions.

| Command | Behavior |
|---|---|
| claude --continue / claude -c | Resume most recent session |
| claude --resume \<id\> / claude -r \<id\> | Resume specific session by ID |

| Persisted | Not Persisted |
|---|---|
| Conversation history | Live tool state |
| Tool call results | Pending operations |
| Session ID | File locks |
| Working directory context | Environment variables |

Confidence: 50% (Tier 3 - Inferred)

Sessions appear to be stored as JSON/JSONL files in ~/.claude/ but:

  • Format is not publicly documented
  • Not intended as a stable API
  • May change between versions

Do not rely on session file format for external tooling.


9. Philosophy: Less Scaffolding, More Model


Confidence: 100% (Tier 1 - Official) Source: Daniela Amodei (Anthropic Co-founder & President) - Public statements

The core philosophy behind Claude Code:

“Do more with less. Smart architecture choices, better training efficiency, and focused problem-solving can compete with raw scale.”

| Traditional Approach | Claude Code Approach |
|---|---|
| Intent classifier → Router → Specialist | Single model decides everything |
| RAG with embeddings | Grep + Glob (regex search) |
| DAG task orchestration | Simple while loop |
| Tool-specific planners | Model-driven tool selection |
| Complex state machines | Conversation as state |
| Prompt engineering frameworks | Trust the model |

Why this works:

  1. Model capability: Claude 4+ is capable enough to handle routing decisions
  2. Reduced latency: Fewer components = faster response
  3. Simpler debugging: When something fails, there’s one place to look
  4. Better generalization: No hand-coded rules to break on edge cases
| Advantage | Disadvantage |
|---|---|
| Simplicity | Less fine-grained control |
| Flexibility | Harder to enforce strict behaviors |
| Fewer bugs | Model errors affect everything |
| Fast iteration | Requires good model quality |
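The entire orchestration layer reduces to one loop. The sketch below stubs out the model call (a real implementation would hit the Claude API with tool schemas); the message shapes and tool runner are invented for illustration.

```python
# Toy version of the single while(tool_call) loop: no router, no DAG —
# the model alone decides the next action on every iteration.
def model(messages):
    """Stub: a real call would send messages + tool schemas to Claude."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "bash", "input": "echo hello"}   # model picks a tool
    return {"text": "Done: the command printed hello."}  # or answers in text

def run_tool(call):
    # Stand-in executor; a real one would dispatch Bash/Read/Edit/etc.
    return f"(ran {call['tool']!r} with {call['input']!r})"

messages = [{"role": "user", "content": "Say hello via the shell"}]
while True:
    reply = model(messages)
    if "tool" not in reply:                  # plain text ends the loop
        messages.append({"role": "assistant", "content": reply["text"]})
        break
    result = run_tool(reply)                 # execute, feed result back in
    messages.append({"role": "tool", "content": result})

print(messages[-1]["content"])   # Done: the command printed hello.
```

Everything the table attributes to "traditional" architectures — routing, planning, state machines — lives implicitly in the model's choice of `reply` each turn.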

The “native capabilities first” approach is increasingly validated by external practitioners. Embedded engineering teams (including former Cursor power users) have converged on the Agent Skills standard rather than external orchestration frameworks, demonstrating the viability of trusting Claude’s native reasoning over adding scaffolding layers.

Example: Gur Sannikov (embedded engineering) adopted ADR-driven workflows using only native Claude Code capabilities (hooks, skills, Task Tool) without external frameworks — validating the architectural philosophy documented in this guide.

This convergence suggests that the “less scaffolding, more model” approach scales beyond initial expectations, even for complex engineering domains like embedded systems development.


Confidence: 70% (Tier 3 - Based on public information) Sources: Various 2024-2025 comparisons, official documentation

| Dimension | Claude Code | GitHub Copilot Workspace | Cursor | Amazon Q Developer |
|---|---|---|---|---|
| Architecture | while(tool) loop | Cloud-based planning | Event-driven + cloud | AWS-integrated agents |
| Execution | Local terminal | Cloud sandbox | Local + cloud | Cloud/local hybrid |
| Model | Claude (single) | GPT-4 variants | Multiple (adaptive) | Amazon Titan + others |
| Context | ~200K tokens | Limited | Limited | Varies |
| Transparency | High (visible reasoning) | Medium | Medium | Low |
| Customization | CLAUDE.md + hooks | Limited | Plugins | AWS integration |
| MCP Support | Native | No | Some servers | No |
| Pricing | Pro/Max tiers | GitHub subscription | Per-seat | AWS-integrated |

| Scenario | Claude Code | Alternative |
|---|---|---|
| Deep codebase exploration | Excellent | Good |
| Terminal-native workflow | Excellent | Limited |
| Custom automation (hooks) | Excellent | Limited |
| Team standardization | Good (CLAUDE.md) | Varies |
| IDE integration | Limited (VS Code ext) | Cursor/Copilot better |
| Enterprise compliance | Via Anthropic enterprise | Varies |

| Source | URL | Topics |
|---|---|---|
| Engineering Blog | anthropic.com/engineering/claude-code-best-practices | Master loop, philosophy |
| Setup Docs | code.claude.com/docs/en/setup | Tools, commands |
| Context Windows | platform.claude.com/docs/en/build-with-claude/context-windows | Token limits |
| Hooks Reference | code.claude.com/docs/en/hooks | Hook system |
| Hooks Guide | code.claude.com/docs/en/hooks-guide | Hook examples |
| MCP Docs | code.claude.com/docs/en/mcp | MCP integration |
| Sandboxing | code.claude.com/docs/en/sandboxing | Security model |
| llms.txt (index) | code.claude.com/docs/llms.txt | LLM-optimized doc index, ~65 pages |
| llms-full.txt | code.claude.com/docs/llms-full.txt | Full documentation (~98 KB text) |

| Source | URL | Topics |
|---|---|---|
| PromptLayer Analysis | blog.promptlayer.com/claude-code-behind-the-scenes-of-the-master-agent-loop/ | Internal architecture |
| Steve Kinney Course | stevekinney.com/courses/ai-development/claude-code-* | Permissions, sessions |

| Source | Topics |
|---|---|
| GitHub Issues (anthropics/claude-code) | Edge cases, bugs, feature discussions |
| Reddit r/ClaudeAI | User experiences, workarounds |
| YouTube tutorials | Visual walkthroughs |

Transparency about gaps in our understanding:

| Topic | What We Don’t Know | Confidence in Current Understanding |
|---|---|---|
| Exact compaction threshold | Is it 75%? 85%? 92%? Varies by model? | 40% |
| System prompt contents | Full text not public, varies by model version | 30% |
| Token counting method | Exact tokenizer, overhead for tool schemas | 50% |
| Model fallback | Does Claude Code fall back if a model fails? | 20% |
| Internal caching | Is there result caching between sessions? | 20% |
| Rate limiting logic | How rate limits are applied per-tool | 40% |

These are intentionally not documented by Anthropic:

  • Session file format (internal implementation detail)
  • System prompt variations between models
  • Internal component names/architecture
  • Token usage breakdown per component
  • Exact permission evaluation order

To track changes as they land:

  1. Official changelog: Watch anthropic.com/changelog
  2. GitHub releases: github.com/anthropics/claude-code/releases
  3. Community Discord: Various Claude-focused servers
  4. This guide: Updated periodically based on verified information

Found an error? Have verified new information? Contributions welcome:

  1. For official facts: Cite the Anthropic source
  2. For observations: Describe how you verified the behavior
  3. For corrections: Explain what’s wrong and why

Last updated: February 2026 · Claude Code version: v2.1.34 · Document version: 1.1.0