Context Window: 200K vs 1M — Claude Code Recap Card T19

The Two Windows

Parameter	200K standard	1M GA
Availability	All plans	GA for Max/Team/Enterprise CC plans (v2.1.75)
Header required	None	None on CC plans · `anthropic-beta: context-1m-2025-08-07` for direct API access
Price (Opus input)	$5/MTok	$10/MTok
Max output	128K tokens	128K tokens

Once a request exceeds 200K input tokens, all context tokens are billed at the premium rate, not just the tokens above 200K. The price is a step at the threshold, not a linear progression.
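To make the threshold concrete, here is a minimal Python sketch of the input-side billing rule, using the Opus input rates from the table above. The function name `opus_input_cost` and its parameters are illustrative and not part of any SDK. Output tokens are deliberately left out, so the result covers input cost only.

```python
def opus_input_cost(
    input_tokens: int,
    standard_rate: float = 5.0,   # $/MTok, from the table above
    premium_rate: float = 10.0,   # $/MTok, from the table above
    threshold: int = 200_000,     # premium pricing applies above this
) -> float:
    """Return the input cost in dollars for one request (input side only)."""
    # Step pricing: once the request is above the threshold,
    # the premium rate applies to every input token, not just the excess.
    rate = premium_rate if input_tokens > threshold else standard_rate
    return input_tokens * rate / 1_000_000


if __name__ == "__main__":
    for tokens in (200_000, 200_001, 500_000):
        print(f"{tokens:>7} tokens -> ${opus_input_cost(tokens):.2f}")
```

Under these assumptions, going one token over the threshold roughly doubles the input bill (about $1.00 at 200,000 tokens versus about $2.00 at 200,001), which is the step behavior described above.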

Precision at Scale (MRCR v2)

Model	At 256K	At 1M
Opus 4.6	93%	76%
Sonnet 4.5	n/a	18.5%

Opus 4.6 remains usable at 1M (76% precision), but the drop from 93% at 256K is measurable. Sonnet 4.5 collapses to 18.5% at 1M and is not recommended beyond 200K for precise tasks.

Cost per Session (Approximate)

Session type	Tokens in	Sonnet 4.6	Opus 4.6
PR review (≤200K)	50K	~$0.23	~$0.38
Refactoring (≤200K)	150K	~$0.75	~$1.25
Service analysis (>200K)	500K	~$4.13	~$6.88

When to Use 1M

The community rule: 200K + RAG by default, 1M Opus reserved for cases where loading everything at once is genuinely necessary.

Justified:

Full codebase audit in a single pass
Massive documentation analysis with no chunking possible
Agent Teams on a complex multi-service architecture

Not justified:

Day-to-day development (even on large projects)
Tasks with fast feedback loops (tests, debugging)
Cases where /compact + sequential sessions work fine

Activation (API)

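import anthropic

# Reads ANTHROPIC_API_KEY from the environment.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,  # required by the Messages API; this value is illustrative
    # Opt in to the 1M window on direct API calls.
    extra_headers={
        "anthropic-beta": "context-1m-2025-08-07"
    },
    messages=[...]
)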

This header is for direct API access only. Claude Code Max/Team/Enterprise plans have 1M enabled automatically, so no header is needed there. Without the header, direct API requests exceeding 200K tokens return an error, even on tier 4 accounts.
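One way to keep the opt-in explicit is to attach the header only when a request is expected to exceed the standard window. The sketch below is an assumption about how a caller might organize this, not a documented requirement: the helper name `extra_headers_for` is hypothetical, and the token count has to be obtained separately before the call.

```python
LONG_CONTEXT_BETA = "context-1m-2025-08-07"  # header value from the card
STANDARD_WINDOW = 200_000                    # tokens in the standard window


def extra_headers_for(input_tokens: int) -> dict[str, str]:
    """Build the extra headers for a direct API call (hypothetical helper).

    Returns the 1M beta header only when the request would not fit
    in the standard 200K window; otherwise returns no extra headers.
    """
    if input_tokens > STANDARD_WINDOW:
        return {"anthropic-beta": LONG_CONTEXT_BETA}
    return {}


if __name__ == "__main__":
    print(extra_headers_for(150_000))
    print(extra_headers_for(500_000))
```

The returned dictionary can be passed as `extra_headers=` in the call shown above. Keeping the decision in one place also makes it easy to log when a request crosses into premium pricing.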

Recommended Pattern

Work at 200K with proactive /compact (at 70% context usage) rather than enabling 1M by default. Open a new session around 70-75% usage: performance is better and cost stays predictable.
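The thresholds above can be read as a simple decision rule. The following Python sketch encodes one possible interpretation, which is an assumption rather than something the card states outright: keep working below 70% usage, run /compact from 70%, and switch to a new session from 75%. The function name `context_advice` and the returned labels are illustrative.

```python
def context_advice(used_tokens: int, window: int = 200_000) -> str:
    """Suggest an action from context usage (assumed reading of the 70-75% rule)."""
    # Integer arithmetic avoids floating-point edge cases at the exact thresholds.
    if used_tokens * 100 >= 75 * window:
        return "new session"
    if used_tokens * 100 >= 70 * window:
        return "compact"
    return "continue"


if __name__ == "__main__":
    for used in (100_000, 140_000, 150_000):
        print(f"{used:>7} tokens used -> {context_advice(used)}")
```

With the default 200K window, the rule switches to "compact" at 140,000 tokens and to "new session" at 150,000 tokens. Teams that prefer to compact earlier or later can adjust the two percentages.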

For RAG on large documents, Gemini 1.5 Pro offers 2M context at $3.50/$10.50 per MTok, roughly 2-3x cheaper for pure retrieval without needing Opus-level reasoning.