Code Guide
C11 Intermediate Design

Subscription vs API: Cost Patterns

Understanding when each billing model is advantageous

Two fundamentally different models

Subscription: you pay a fixed monthly amount and consume within a capped token budget. The cost is predictable, but the limits are deliberately opaque.

API: per-token billing, no cap. The cost is proportional to usage. Ideal for irregular or very high usage, but without protection against overruns.

The practical distinction: a subscription suits solo developers with regular daily usage, while the API suits teams, CI/CD pipelines, and high-volume use cases.

How subscription limits work

Subscriptions use a hybrid model: a 5-hour rolling window and a weekly cap. Both apply simultaneously, so a request must fit within each. Limits are advertised in “messages” but are actually metered in tokens, which makes them difficult to anticipate.
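
To make the "both apply simultaneously" rule concrete, here is a minimal sketch of a limiter that admits a request only when it fits both limits. It is a toy model, not Anthropic's implementation: the class name DualLimit, the budget values, and the choice to treat the weekly cap as a rolling 7-day window are all assumptions made for illustration.

```python
from collections import deque

WINDOW_SECONDS = 5 * 3600       # 5-hour rolling window (from the text)
WEEK_SECONDS = 7 * 24 * 3600    # weekly cap, modelled here as a rolling 7 days (assumption)


class DualLimit:
    """Toy model: a request passes only if BOTH limits still have room."""

    def __init__(self, window_budget, weekly_budget):
        # Budgets are hypothetical token counts, not published figures.
        self.window_budget = window_budget
        self.weekly_budget = weekly_budget
        self.events = deque()  # (timestamp_seconds, tokens), oldest first

    def _used_since(self, now, span):
        # Tokens consumed by events that are strictly younger than `span`.
        return sum(tokens for ts, tokens in self.events if now - ts < span)

    def try_consume(self, now, tokens):
        # Events older than a week count toward neither limit, so drop them.
        while self.events and now - self.events[0][0] >= WEEK_SECONDS:
            self.events.popleft()
        fits_window = self._used_since(now, WINDOW_SECONDS) + tokens <= self.window_budget
        fits_week = self._used_since(now, WEEK_SECONDS) + tokens <= self.weekly_budget
        if fits_window and fits_week:
            self.events.append((now, tokens))
            return True
        return False
```

With a window budget of 100 and a weekly budget of 250, two full windows succeed, but a third is refused by the weekly cap even though its own 5-hour window is empty. This is the situation the text describes for heavy users.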

Critical point: the Opus/Sonnet ratio. Opus consumes 8 to 10 times more quota than Sonnet for equivalent work. A Max 20x user who uses Opus intensively can exhaust their weekly quota in 24 to 40 hours of real work, despite a premium subscription.

Strategies by plan

| Plan | Recommendation |
|---|---|
| Pro | Sonnet only, batch sessions, avoid context bloat |
| Max 5x | Sonnet by default, Opus for architecture and complex debugging |
| Max 20x | More Opus freedom, but monitor weekly consumption |
| API | Haiku for mechanical tasks, Sonnet for dev, Opus for audit |

The OpusPlan pattern for saving

On a limited budget, the most effective pattern is to use Opus for planning (where reasoning quality matters), then Sonnet or Haiku for mechanical execution.

Terminal window
# Phase 1: planning (Opus)
claude --model opus "Analyze the architecture and propose a migration plan"
# Phase 2: execution (Sonnet/Haiku)
claude --model sonnet "Implement the plan according to the defined specs"

This pattern is automated via /model opusplan.
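
As a rough estimate of what the split saves, the sketch below compares the quota used by an Opus-plan/Sonnet-execute session with the same session run entirely on Opus. It reuses the 8 to 10x Opus/Sonnet ratio quoted earlier in this card. The function name relative_quota and the 20% planning share are hypothetical, and the model assumes work can be measured in Sonnet-equivalent units.

```python
def relative_quota(planning_share, opus_ratio):
    """Quota used by a split session, as a fraction of an all-Opus session.

    planning_share: fraction of the work done in the Opus planning phase
                    (hypothetical input, between 0 and 1).
    opus_ratio:     how many times more quota Opus uses than Sonnet.
    """
    if not 0.0 <= planning_share <= 1.0:
        raise ValueError("planning_share must be between 0 and 1")
    if opus_ratio <= 0:
        raise ValueError("opus_ratio must be positive")
    # Cost in Sonnet units: planning at the Opus rate, the rest at the Sonnet rate.
    split = planning_share * opus_ratio + (1.0 - planning_share)
    all_opus = opus_ratio
    return split / all_opus


for ratio in (8, 10):
    share = relative_quota(0.2, ratio)
    print(f"ratio {ratio}x: {share:.0%} of an all-Opus session")
```

Under these assumptions, a session with 20% planning uses roughly 28 to 30% of the quota of an all-Opus session. The saving shrinks as the planning share grows.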

Consumption monitoring

Terminal window
# Slash commands, typed inside a Claude Code session
/usage # Cost, token counts, and per-model breakdown (v2.1.118)
/cost # Alias for /usage since v2.1.118
/stats # Alias for /usage since v2.1.118
/status # Model + context + summarized cost
# Community tool for cross-session history, run from the shell
ccusage # Overview of all periods
ccusage --today # Today's cost
ccusage --month # Monthly cost
ccusage --model-breakdown # By model

Anthropic does not provide real-time metrics in the interface. ccusage fills this gap.

Default effort = high on Pro and Max plans (v2.1.117): Claude now runs at high effort by default on paid subscriptions, which means more thorough reasoning per request and potentially higher token consumption than in earlier versions. In CI pipelines or budget-sensitive workflows, set CLAUDE_EFFORT=low explicitly.

Bedrock users (v2.1.122): set ANTHROPIC_BEDROCK_SERVICE_TIER in your environment or env block in settings.json to control service tier routing when using AWS Bedrock as the API backend.

When API wins

Subscription hits its limits in three cases: teams where multiple developers consume simultaneously, CI/CD pipelines with a high PR volume, and projects with long daily sessions that exhaust the budget before the end of the week. In these contexts, API billing (with Haiku for mechanical tasks) often ends up cheaper than expected, especially with a progressive escalation strategy.
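
One way to read "progressive escalation" is: try the cheapest model first and move up a tier only when the result is not good enough. The sketch below illustrates that reading. The escalate function and the run callable are hypothetical names, and how a result is judged acceptable is left entirely to the caller.

```python
from typing import Callable, Optional, Sequence, Tuple

# Cheapest first; the ordering mirrors the Haiku -> Sonnet -> Opus advice in this card.
TIERS = ("haiku", "sonnet", "opus")


def escalate(task: str,
             run: Callable[[str, str], Optional[str]],
             tiers: Sequence[str] = TIERS) -> Tuple[str, str]:
    """Try each model in order and return (model, result) for the first success.

    `run(model, task)` is a hypothetical callable supplied by the caller. It
    returns a result string, or None when the output fails the caller's checks.
    """
    for model in tiers:
        result = run(model, task)
        if result is not None:
            return model, result
    raise RuntimeError(f"no tier in {list(tiers)} could complete: {task!r}")
```

In a CI job, run could wrap a single API call plus a validation step (tests pass, output parses), so that most PRs are billed at the Haiku rate and only the hard ones reach Sonnet or Opus.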

Typical API costs (order of magnitude)

| Session type | Sonnet | Opus |
|---|---|---|
| Bug fix / PR review | ~$0.23 | ~$0.38 |
| Module refactoring | ~$0.75 | ~$1.25 |
| CI/CD review per PR (Haiku) | ~$0.02 | N/A |
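
These per-session figures can be turned into a rough break-even count: how many sessions per month it takes before the API bill reaches a flat subscription fee. The sketch below uses the per-session costs from the table above. The function name break_even_sessions and the $100 monthly fee are assumptions for illustration only; substitute the price of the plan you are comparing against.

```python
import math


def break_even_sessions(monthly_fee, cost_per_session):
    """Smallest number of sessions per month at which API spend
    reaches or exceeds the flat monthly fee."""
    if cost_per_session <= 0:
        raise ValueError("cost_per_session must be positive")
    return math.ceil(monthly_fee / cost_per_session)


HYPOTHETICAL_MONTHLY_FEE = 100.0  # assumption, not a quoted plan price

# Per-session costs taken from the table above (orders of magnitude).
for label, cost in [("bug fix on Sonnet", 0.23), ("bug fix on Opus", 0.38),
                    ("refactoring on Sonnet", 0.75), ("refactoring on Opus", 1.25)]:
    sessions = break_even_sessions(HYPOTHETICAL_MONTHLY_FEE, cost)
    print(f"{label}: about {sessions} sessions per month to break even")
```

Below the break-even count, per-token billing is the cheaper option under these assumptions. Above it, the flat fee wins, provided the subscription limits are not hit first.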
