Code Guide
T19 Intermediate Technical

Context Window: 200K vs 1M

When to switch to the extended context window and at what cost

The Two Windows

| Parameter | 200K standard | 1M GA |
|---|---|---|
| Availability | All plans | GA for Max/Team/Enterprise CC plans (v2.1.75) |
| Header required | None (CC plans) · API header for direct API access | anthropic-beta: context-1m-2025-08-07 |
| Price (Opus input) | $5/MTok | $10/MTok |
| Max output | 128K tokens | 128K tokens |

Once input exceeds 200K tokens, every context token is billed at the premium rate, not just the tokens above the threshold. The price is a step at 200K, not a gradual increase.
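The threshold rule can be made concrete with a small sketch. It uses the Opus input prices from the table ($5/MTok standard, $10/MTok premium) and considers input tokens only. The function name and constants are illustrative and not part of any SDK.

```python
# Illustrative sketch: input-only cost under the threshold rule described above.
# Prices are the Opus input rates from the table; output tokens are ignored.
STANDARD_PER_MTOK = 5.00     # USD per million input tokens, up to 200K
PREMIUM_PER_MTOK = 10.00     # USD per million input tokens, above 200K
THRESHOLD_TOKENS = 200_000


def opus_input_cost(input_tokens: int) -> float:
    """Return the approximate input cost in USD for one request."""
    if input_tokens > THRESHOLD_TOKENS:
        rate = PREMIUM_PER_MTOK   # every token is billed at the premium rate
    else:
        rate = STANDARD_PER_MTOK
    return input_tokens * rate / 1_000_000


print(opus_input_cost(200_000))  # 1.0
print(opus_input_cost(200_001))  # about 2.0: one extra token doubles the bill
```

Under these assumptions, a request sitting just under the threshold costs about half as much as one just over it, which is why trimming context to stay below 200K can matter more than trimming it anywhere else.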

Precision at Scale (MRCR v2)

| Model | At 256K | At 1M |
|---|---|---|
| Opus 4.6 | 93% | 76% |
| Sonnet 4.5 | n/a | 18.5% |

Opus 4.6 remains usable at 1M (76% precision), but the degradation from 93% at 256K is measurable. Sonnet 4.5 collapses to 18.5% and is not recommended beyond 200K for precise tasks.

Cost per Session (Approximate)

| Session type | Tokens in | Sonnet 4.6 | Opus 4.6 |
|---|---|---|---|
| PR review (≤200K) | 50K | ~$0.23 | ~$0.38 |
| Refactoring (≤200K) | 150K | ~$0.75 | ~$1.25 |
| Service analysis (>200K) | 500K | ~$4.13 | ~$6.88 |

When to Use 1M

The community rule: 200K + RAG by default, 1M Opus reserved for cases where loading everything at once is genuinely necessary.

Justified:

  • Full codebase audit in a single pass
  • Massive documentation analysis with no chunking possible
  • Agent Teams on a complex multi-service architecture

Not justified:

  • Day-to-day development (even on large projects)
  • Tasks with fast feedback loops (tests, debugging)
  • Cases where /compact + sequential sessions work fine
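The rule of thumb above can be read as a small decision procedure. The sketch below is one possible encoding, not an official policy. The function name, its parameters, and the returned labels are hypothetical.

```python
STANDARD_WINDOW = 200_000  # tokens


def choose_context_strategy(input_tokens: int, can_chunk: bool) -> str:
    """Pick a strategy following the '200K + RAG by default' rule."""
    if input_tokens <= STANDARD_WINDOW:
        return "200K"            # fits the standard window as-is
    if can_chunk:
        return "200K + RAG"      # too large, but retrieval or chunking works
    return "1M Opus"             # everything must be loaded at once


print(choose_context_strategy(150_000, can_chunk=True))   # 200K
print(choose_context_strategy(500_000, can_chunk=True))   # 200K + RAG
print(choose_context_strategy(500_000, can_chunk=False))  # 1M Opus
```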

Activation (API)

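    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=4096,  # required by the Messages API; adjust as needed
        # Beta header that enables the 1M context window on direct API calls
        extra_headers={
            "anthropic-beta": "context-1m-2025-08-07"
        },
        messages=[...]
    )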

This header is for direct API access only. Claude Code Max/Team/Enterprise plans have 1M enabled automatically, so no header is needed. Without the header on direct API calls, requests exceeding 200K tokens return an error, even on tier 4 accounts.
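On direct API calls, the header only matters once a request crosses 200K input tokens. A minimal sketch of a helper that adds it conditionally follows. `headers_for_request` is a hypothetical name, and the token count is assumed to come from elsewhere (for example, a token count obtained before sending the request).

```python
BETA_HEADER = {"anthropic-beta": "context-1m-2025-08-07"}
STANDARD_WINDOW = 200_000  # tokens


def headers_for_request(input_tokens: int) -> dict[str, str]:
    """Return the extra headers a direct API call needs for this input size."""
    if input_tokens > STANDARD_WINDOW:
        return dict(BETA_HEADER)  # copy so callers cannot mutate the constant
    return {}


print(headers_for_request(50_000))   # {}
print(headers_for_request(500_000))  # {'anthropic-beta': 'context-1m-2025-08-07'}
```

The result can be passed as `extra_headers=headers_for_request(n)` in the call to `client.messages.create`, so requests that fit the standard window are sent without the beta header.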

Work at 200K with proactive /compact (at 70% context usage) rather than enabling 1M by default. Open a new session around 70-75% usage: performance is better and cost stays predictable.
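One way to encode these thresholds is a small check on context usage. The sketch assumes a 200K window and treats 70% as the point to run /compact and 75% as the point to start a new session. That split is an interpretation of the guidance above, and the function name is hypothetical.

```python
WINDOW_TOKENS = 200_000
COMPACT_AT = 0.70       # assumed: run /compact from 70% usage
NEW_SESSION_AT = 0.75   # assumed: start a fresh session from 75% usage


def next_action(used_tokens: int) -> str:
    """Suggest what to do at the current context usage."""
    usage = used_tokens / WINDOW_TOKENS
    if usage >= NEW_SESSION_AT:
        return "new session"
    if usage >= COMPACT_AT:
        return "/compact"
    return "continue"


print(next_action(100_000))  # continue
print(next_action(145_000))  # /compact
print(next_action(160_000))  # new session
```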

For RAG on large documents, Gemini 1.5 Pro offers 2M context at $3.50/$10.50 per MTok, roughly 2-3x cheaper for pure retrieval without needing Opus-level reasoning.
