MCP vs CLI — Decision Guide
Last updated: May 2026
Interactive version with guidance table and practitioner quotes: cc.bruniaux.com/ecosystem/mcp-vs-cli/
The debate emerged from a rapid succession of interface paradigms: browser-based AI (2022-23), then AI in the IDE with MCP connecting agents to external services (2024-25), then full CLI agents that execute commands and write files without an intermediary layer (2025-26). That progression explains why the question exists at all.
This page compares two integration patterns for giving Claude Code access to external tools and services: MCP servers and CLI tools. Neither is universally better. The right choice depends on your context — and most real workflows end up using both.
What each approach does
MCP servers inject tool schemas into Claude’s context at session start. Claude sees a structured list of available tools with parameters, types, and descriptions. It then calls those tools natively, receiving structured responses.
CLI tools are shell commands that Claude invokes via Bash. Claude drives them the same way a developer would: constructing command strings, parsing text output. No schema injection at startup. The shell is the interface.
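To make the contrast concrete, here is a minimal sketch of one operation expressed both ways. The tool name `list_issues` and its fields are hypothetical; only the `name` / `description` / `inputSchema` shape follows the MCP tool definition format. The CLI side uses `gh issue list` with its documented `--repo`, `--state`, `--limit`, and `--json` flags. Nothing is executed: the sketch only builds the two representations.

```python
import json
import shlex

# Hypothetical MCP tool definition (illustrative names only). This is the
# kind of structure the model sees, with parameters and types spelled out.
LIST_ISSUES_TOOL = {
    "name": "list_issues",
    "description": "List issues in a repository",
    "inputSchema": {
        "type": "object",
        "properties": {
            "repo": {"type": "string", "description": "owner/name"},
            "state": {"type": "string", "enum": ["open", "closed", "all"]},
            "limit": {"type": "integer", "minimum": 1},
        },
        "required": ["repo"],
    },
}


def mcp_call(repo: str, state: str = "open", limit: int = 5) -> dict:
    """Structured call a model would emit against the schema above."""
    return {
        "name": LIST_ISSUES_TOOL["name"],
        "arguments": {"repo": repo, "state": state, "limit": limit},
    }


def cli_call(repo: str, state: str = "open", limit: int = 5) -> str:
    """Equivalent command string a model would hand to the shell."""
    argv = [
        "gh", "issue", "list",
        "--repo", repo,
        "--state", state,
        "--limit", str(limit),
        "--json", "number,title",
    ]
    return shlex.join(argv)


if __name__ == "__main__":
    print(json.dumps(mcp_call("octo/demo"), indent=2))
    print(cli_call("octo/demo"))
```

The structured call can be validated against the schema before anything runs, while the command string is only checked when the shell and the CLI parse it.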
Tradeoffs
MCP strengths
| Advantage | Detail |
|---|---|
| Structured interface | Tool schemas guide Claude precisely — fewer hallucinated flags or arguments |
| Complex auth | OAuth, token refresh, secrets rotation handled by the server, not the prompt |
| Structured output | JSON responses are directly parseable by Claude and downstream agents |
| Observability | Remote MCP servers can log every call — essential for enterprise usage tracking and ROI attribution |
| Distribution at scale | Update the server once, all connected clients get the change. No per-machine package management. |
| Non-technical users | Users who never touch a terminal can access tools transparently via MCP connectors |
| Weaker models | A structured schema compensates when the model is less capable of parsing CLI help text |
CLI strengths
| Advantage | Detail |
|---|---|
| Zero context overhead | No schema injected at startup. Since v2.1.7 lazy loading closes most of the gap, but CLI is still the absolute minimum. |
| Deterministic actions | Explicit commands with predictable output are easier to audit and test |
| Human + AI use | The same CLI wrapper works for a developer running it manually and for Claude |
| Frontier models | Claude Opus/Sonnet 4.6 can drive complex CLIs (aws-cli, glab, gh) without a structured schema |
| Speed | No connection setup, no MCP handshake — direct subprocess execution |
| Simplicity | Easier to debug, log, and reason about than a remote server call chain |
| Skills encapsulation | A CLI wrapped in a skill is transparent to the user and keeps the tool logic version-controlled |
MCP weaknesses
| Weakness | Detail |
|---|---|
| Schema token cost | Since v2.1.7, lazy loading (MCP Tool Search) means unused tools inject only their name, not their full schema. Cost is still non-zero: tool names load at startup, full schemas load on first use. The pre-v2.1.7 worst case (~55K tokens for a 5-server setup) now averages ~8.7K — an 85% reduction, but not zero. |
| Connection overhead | Session startup takes longer with many MCP servers connected |
| Debugging difficulty | Failures inside an MCP server are harder to trace than a failed shell command |
| Maintenance complexity | Running, updating, and securing remote MCP servers adds infrastructure |
| Overkill for simple APIs | A GitLab MCP that surfaces 20% of glab’s functionality is worse than glab itself |
CLI weaknesses
| Weakness | Detail |
|---|---|
| No observability | Shell commands on a local machine are invisible to ops/management tooling |
| Distribution problem | Keeping CLIs updated across a team requires package management discipline (brew, scoop, etc.) |
| Weaker models struggle | A less capable model may hallucinate flags or misread help text — schemas help |
| No multi-agent structure | CLI output requires parsing; structured MCP responses are more reliable across agent-to-agent handoffs |
| Non-tech user barrier | A non-technical user cannot be expected to have a configured CLI environment |
The API wrapper pattern
Most production MCP servers for SaaS tools sit on top of existing REST or GraphQL APIs. The server translates tool calls into HTTP requests against those APIs, processes responses, and returns structured output to the agent. It does not add backend capabilities that the underlying API lacks.
Official documentation from four major providers confirms this directly:
- Notion: “converted MCP tool calls into HTTP API calls to Notion’s public API” (Notion engineering blog)
- Sentry: “middleware to the upstream Sentry API, optimized for coding assistants like Cursor and Claude Code” (sentry-mcp README)
- Slack: “a wrapper around an external API, like Slack” (Slack developer docs)
- GitHub: “integrates with GitHub, allowing LLMs to interact with repositories via the GitHub API” (github-mcp-server README)
The practical consequence: a CLI script calling the same REST or GraphQL API has the same capabilities. MCP and CLI reach the same backend, with the same credentials, triggering the same operations. The difference is the interface layer, not what the backend can do.
What MCP adds that a CLI cannot replicate:
- OAuth token management: server-held auth with browser redirect flows (Slack, Google Drive, Figma, hosted Notion). A CLI can hold an API key in an env var, but not a refresh token or a PKCE exchange, which require server-side state.
- LLM-tuned schemas: curated tool selection, not the full API surface, with parameter types and descriptions calibrated for agent reasoning.
- Centralized hosting: one deployment serves many clients; CLIs require per-machine installation and per-user credential configuration.
- Usage attribution: remote MCP servers associate each tool call with a user and session, feeding observability dashboards. Local CLI calls on developer machines are invisible.
This sharpens the decision criterion. If a service authenticates via API key or environment token, a CLI calling the same API is functionally equivalent to an MCP server. The question becomes whether you need OAuth, centralized observability, or cross-client standardization. If none of those apply, the CLI avoids the schema overhead and reaches the same backend.
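A minimal sketch of the wrapper pattern, assuming a hypothetical REST service at `https://api.example.com/v1` that accepts a bearer token. The function names are illustrative and do not come from any real MCP server. The point is that both entry points delegate to the same request builder, so the backend sees identical calls.

```python
import urllib.request

API_BASE = "https://api.example.com/v1"  # hypothetical service URL


def build_issue_request(issue_id: str, token: str) -> urllib.request.Request:
    """The one place that knows how to talk to the backend API."""
    return urllib.request.Request(
        f"{API_BASE}/issues/{issue_id}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def mcp_tool_get_issue(arguments: dict, token: str) -> urllib.request.Request:
    """Hypothetical MCP tool handler: structured arguments in, same request out."""
    return build_issue_request(arguments["issue_id"], token)


def cli_get_issue(argv: list[str], token: str) -> urllib.request.Request:
    """Hypothetical CLI entry point: positional argument in, same request out."""
    return build_issue_request(argv[0], token)


if __name__ == "__main__":
    via_mcp = mcp_tool_get_issue({"issue_id": "42"}, token="demo-token")
    via_cli = cli_get_issue(["42"], token="demo-token")
    # Both paths produce the same URL and the same credentials.
    print(via_mcp.full_url == via_cli.full_url)
```

The requests are built but never sent, so the sketch runs without network access. With API-key auth, the only difference between the two paths is how the arguments arrive.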
The four decision dimensions
Before asking “MCP or CLI?”, answer these four questions. They rank from most to least constraining.
1. Who is the end user?
This is the dominant variable. Everything else is secondary.
- Non-technical user (using a chat interface, no terminal) → MCP or skill-encapsulated CLI. You cannot expose a raw CLI to a non-dev user. Connectors must be MCP-based or wrapped invisibly in a skill that handles the CLI internally.
- Technical user / developer → continue to question 2.
2. Which model is driving the tool?
- Frontier model (Claude Opus/Sonnet 4.6) → strong enough to drive complex CLIs directly. A structured MCP schema adds overhead without proportional benefit.
- Smaller or local model (Qwen, Mistral, lighter deployments) → structured MCP schemas compensate for weaker CLI parsing ability. MCP is more reliable here.
3. Does your organization need observability?
- Yes (enterprise, C-level reporting, compliance, ROI attribution on AI spend) → MCP Remote server. Local CLI calls are invisible. A remote MCP server can log every tool invocation, associate it with a user, and feed dashboards. You cannot replicate this with CLIs on local machines.
- No (individual dev, local workflow) → observability is not a constraint. CLI is fine.
4. How often does the tool schema change?
- Stable API (mature tool, versioned interface) → MCP investment pays off over time.
- Rapidly changing or thin wrapper → CLI is cheaper to maintain. A hand-rolled glab wrapper that exposes only the 5 commands you actually use is more durable than a GitLab MCP that duplicates the full API surface.
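The four questions can be read as a short decision function. This is one possible encoding, not a rule set taken from any tool. The field names are hypothetical, and checking observability before model capability is a simplification, made so that a smaller model in an enterprise setting still lands on a remote server.

```python
from dataclasses import dataclass


@dataclass
class Context:
    technical_user: bool       # question 1: who is the end user?
    frontier_model: bool       # question 2: which model drives the tool?
    needs_observability: bool  # question 3: does the org need usage logs?
    stable_schema: bool        # question 4: how often does the schema change?


def recommend(ctx: Context) -> str:
    """Return a directional default for the given context."""
    if not ctx.technical_user:
        # A raw CLI cannot be exposed to a non-technical user.
        return "MCP or skill-encapsulated CLI"
    if ctx.needs_observability:
        # Local CLI calls are invisible to ops tooling.
        return "MCP Remote"
    if not ctx.frontier_model:
        # Schemas compensate for weaker CLI parsing ability.
        return "MCP"
    # Frontier model, developer, no observability requirement.
    return "CLI or MCP" if ctx.stable_schema else "CLI"


if __name__ == "__main__":
    solo_dev = Context(
        technical_user=True,
        frontier_model=True,
        needs_observability=False,
        stable_schema=False,
    )
    print(recommend(solo_dev))
```

The last branch returns "CLI or MCP" for a stable API because the guide treats that case as a matter of long-term investment, not a hard constraint.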
Guidance by situation
Quick reference: directional defaults, not rules.
| Situation | Lean toward | Rationale |
|---|---|---|
| Non-technical user, chat interface | MCP / Skill | CLI is inaccessible; connectors must be invisible |
| Frontier model (Claude 4.x), developer workflow | CLI | Model handles it natively; schemas are overhead |
| Smaller/local model | MCP | Schema guides the model reliably |
| Enterprise, observability required | MCP Remote | Only way to log, attribute, and report on usage |
| Team distribution (10+ devs) | MCP | Central update vs per-machine CLI maintenance |
| Individual dev, local machine | CLI or skill | Simpler, faster, no infrastructure |
| Deterministic actions (git, CI, deploy) | CLI | Explicit commands, predictable output, auditable |
| Complex auth (OAuth, token refresh) | MCP | Server handles auth; CLI would require credential plumbing |
| Tight context budget / many tools loaded | CLI | Still the minimum-overhead option. Lazy loading (v2.1.7+) reduces MCP cost significantly, but CLI has zero schema cost by design. |
| Agent-to-agent structured output | MCP | JSON responses are more reliable than parsed CLI text |
| Debugging / prototyping a new integration | CLI | Easier to inspect, faster to iterate |
| Browser automation (non-frontier model) | MCP | Playwright MCP structures interaction reliably |
| Browser automation (frontier model, Claude Code) | CLI + skill | playwright-cli + skill reported faster and more efficient in practice |
| GitLab / GitHub access | CLI (glab, gh) | Official CLIs are richer than most MCP wrappers |
| Documentation lookup (Context7) | MCP | No CLI equivalent; structured doc retrieval has no shell analog |
Per-server recommendation
The table below applies the four decision dimensions to the 18 most commonly discussed MCP servers. “Verdict” is the default for a developer using a frontier model on a local machine. Your context (non-technical users, enterprise observability, or a smaller model) may shift any row toward MCP.
| MCP Server | Verdict | CLI Alternative | Reason |
|---|---|---|---|
| GitHub MCP | Use CLI | gh | gh covers the full API surface; model knows it from training; the original reference GitHub MCP server was archived |
| GitLab MCP | Use CLI | glab | Official CLI is richer than the MCP wrapper; practitioner consensus confirms |
| Git MCP (Anthropic) | Use CLI | git | Git is the CLI the model knows best; MCP schema adds cost without structural benefit on frontier models |
| Filesystem MCP | Use CLI | cat, ls, find | Shell commands are universal; no benefit from schema overhead |
| Docker MCP | Use CLI | docker | Docker CLI is universally known; no widely adopted MCP adds comparable value |
| AWS MCP | Use CLI | aws-cli | aws-cli v2 covers the full surface; model drives it natively from training knowledge |
| Terraform MCP | Use CLI | terraform | Deterministic plan/apply workflow; CLI output is structured and auditable |
| Semgrep MCP | Use CLI | semgrep | Mature CLI, well-documented; MCP adds value mainly in CI/CD observability contexts |
| Playwright MCP | Depends | playwright-cli + skill | Frontier model: CLI + skill is faster. Smaller model: MCP structures browser interaction reliably |
| Kubernetes MCP | Depends | kubectl | Auth complexity and multi-cluster setups favor MCP; simple operations favor kubectl |
| Vercel MCP | Depends | vercel CLI | CLI for deploy, env, and domains; MCP for dashboard integration and team workflow comments |
| Sentry MCP | Use MCP | sentry-cli (CI/CD scoped) | sentry-cli handles releases, source map uploads, and CI/CD ops, but has no equivalent for interactive issue querying. MCP provides structured error triage for coding agents. |
| Slack MCP | Use MCP | none | OAuth required; no practical CLI for workspace access from an agent |
| Notion MCP | Use MCP | none | OAuth required; API-key access is limited to integrations, not user-scoped workspace access |
| Google Drive MCP | Use MCP | none | OAuth 2.1 with refresh token rotation; cannot be replicated by a skill or CLI |
| Figma MCP | Use MCP | none | OAuth required; design file access has no CLI equivalent |
| Linear MCP | Use MCP | none | MCP handles GraphQL complexity; structured project management without raw API calls |
| Context7 MCP | Use MCP | none | No CLI equivalent for curated, version-specific doc retrieval |
The pattern: if the service has a mature CLI the model knows from training, use the CLI. If the service requires OAuth or has no CLI, use MCP. “Depends” means the decision hinges on model capability or specific workflow needs.
Take the interactive quiz (6 questions, under 1 minute): cc.bruniaux.com/mcp-or-cli/
The hybrid is the default
Most production workflows don’t choose one. They use both, with each covering the layer it handles best.
A practical example (from practitioners):
- Inner layer (local dev iteration, git, file ops, shell scripts) → CLI, fast, deterministic, no overhead
- Outer layer (CI/CD, shared infrastructure, cross-team services) → MCP Remote, observable, centralized, scalable
- Skill layer (user-facing actions, CLI tools encapsulated for non-tech users) → CLIs wrapped in skills, transparent to the end user
The mistake is applying one answer to every layer. A solo developer building a Claude Code workflow for themselves should mostly use CLIs. A team deploying an AI assistant to non-technical colleagues should mostly use MCP.
Token cost of MCP schemas — what the numbers look like
Since v2.1.7 (January 2026), Claude Code uses MCP Tool Search (lazy loading) by default. This changes the token math significantly, but does not eliminate schema cost entirely.
How lazy loading works: instead of injecting all tool schemas at session start, Claude receives only tool names in an <available-deferred-tools> block. Full schemas are fetched via ToolSearch only when Claude decides to call a specific tool. Unused tools in a session cost only their name in context (~0 schema tokens), not the full definition.
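The mechanism can be illustrated with a toy registry. This is a sketch of the idea only, not Claude Code's implementation. The class and method names are hypothetical, and character counts stand in for tokens.

```python
import json


class DeferredToolRegistry:
    """Toy model of deferred loading: names up front, schemas on first use."""

    def __init__(self, schemas: dict[str, dict]):
        self._all = schemas                  # everything the servers offer
        self._loaded: dict[str, dict] = {}   # schemas that entered the context

    def startup_context(self) -> str:
        # At session start only the tool names are listed.
        return "\n".join(sorted(self._all))

    def call(self, name: str) -> dict:
        # The full schema is pulled in the first time a tool is used.
        if name not in self._loaded:
            self._loaded[name] = self._all[name]
        return self._loaded[name]

    def context_chars(self) -> int:
        # Characters are a crude stand-in for tokens in this sketch.
        schema_chars = sum(len(json.dumps(s)) for s in self._loaded.values())
        return len(self.startup_context()) + schema_chars


if __name__ == "__main__":
    registry = DeferredToolRegistry({
        "search_docs": {"type": "object", "required": ["query"]},
        "create_issue": {"type": "object", "required": ["title"]},
    })
    print(registry.context_chars())  # names only
    registry.call("create_issue")
    print(registry.context_chars())  # names plus one schema
```

A tool that is never called contributes only its name, and calling the same tool twice does not load its schema again.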
Measured impact (Anthropic benchmarks, 5-server setup):
| Scenario | Token overhead | Note |
|---|---|---|
| Before v2.1.7 (eager loading) | ~55,000 tokens | All schemas preloaded |
| After v2.1.7 (lazy loading) | ~8,700 tokens | 85% reduction |
| CLI (no MCP) | ~0 tokens | Baseline |
The old worst-case claim of “500-2,000 tokens per server” described eager loading, which is no longer the default. With lazy loading, the cost per unused server is near zero. The cost per used server (~600 tokens per tool schema loaded on demand) remains real, but is now pay-per-use rather than always-on.
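A quick arithmetic check of these figures, using the rounded numbers quoted in this section. The 500-token overhead for tool names is an assumption chosen purely for illustration; the section does not state it.

```python
EAGER_TOKENS = 55_000     # 5-server setup before v2.1.7 (rounded figure)
LAZY_TOKENS = 8_700       # same setup with lazy loading (rounded figure)
TOKENS_PER_SCHEMA = 600   # approximate cost of one schema loaded on demand


def reduction(before: int, after: int) -> float:
    """Fraction of token overhead removed."""
    return 1 - after / before


def session_cost(names_overhead: int, tools_used: int) -> int:
    """Pay-per-use model: names at startup plus one schema per tool used."""
    return names_overhead + tools_used * TOKENS_PER_SCHEMA


if __name__ == "__main__":
    print(f"{reduction(EAGER_TOKENS, LAZY_TOKENS):.0%}")
    # Assumed 500 tokens of tool names, then 3 versus 30 distinct tools used.
    print(session_cost(500, 3), session_cost(500, 30))
```

The rounded inputs give a reduction of about 84%, in line with the roughly 85% quoted. The second line shows why long sessions that touch many tools still accumulate cost: under the assumed overhead, 30 tools come to 18,500 tokens against 2,300 for 3 tools.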
What still adds overhead even with lazy loading:
- Tool names are still injected at startup (one line per tool per server)
- Schemas load at first invocation — long sessions using many tools accumulate cost
- Connection setup per server is unchanged (latency, not tokens)
- Many connected MCP servers still means more names in context, even if schemas stay deferred
Configuration (v2.1.9+): the ENABLE_TOOL_SEARCH environment variable controls the threshold. auto:N triggers lazy loading when MCP tools exceed N% of context (default 10%).
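The threshold rule can be sketched as follows. This only models the `auto:N` form described above. The function names and the 200,000-token context window are assumptions for illustration, and the real client may compute the ratio differently.

```python
def threshold_pct(setting: str) -> float:
    """Extract N from an 'auto:N' setting (other forms are not modelled here)."""
    prefix, sep, number = setting.partition(":")
    if prefix == "auto" and sep and number:
        return float(number)
    raise ValueError(f"setting not modelled in this sketch: {setting!r}")


def defers_schemas(tool_tokens: int, context_window: int,
                   setting: str = "auto:10") -> bool:
    """True when MCP tool definitions exceed N% of the context window."""
    share = tool_tokens / context_window * 100
    return share > threshold_pct(setting)


if __name__ == "__main__":
    window = 200_000  # assumed context window size
    print(defers_schemas(55_000, window))            # 27.5% of context
    print(defers_schemas(8_000, window))             # 4% of context
    print(defers_schemas(8_000, window, "auto:3"))   # lower threshold
```

Lowering N makes deferral kick in for smaller tool sets, which is the lever to reach for when the context budget is tight.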
Mitigation strategies (still relevant, lower urgency):
- Load MCP servers selectively per project (project-level config vs global config)
- Use CLI tools for high-frequency tight loops where any overhead compounds (compile → test → fix)
- Monitor token usage per session to identify which schemas are being loaded at invocation time
- Consider a CLI wrapper for tools you use constantly but don’t need structured output from
Tooling in this space
| Tool | What it does | Status |
|---|---|---|
| RTK (Rust Token Killer) | Filters CLI output before it reaches Claude’s context — reduces response verbosity, not schema overhead | Production-ready, actively maintained |
| MCPorter (steipete) | TypeScript runtime for calling MCP servers from scripts, generating CLI wrappers, and emitting typed TS clients. Useful for testing MCP servers and writing hooks that need MCP access. | 3K stars, MIT, 2+ weeks, ready to use |
| mcp2cli (knowsuchagency) | Converts MCP/OpenAPI/GraphQL to runtime CLI, eliminating schema injection. Benchmarked at 32× token reduction on the 43-tool GitHub MCP server (44K → 1.4K tokens). | ~1.9K stars, Show HN Best of March 2026 — production-viable for remote MCP servers with 10+ tools. See full breakdown. |
Note on mcp2cli: the token savings are real for direct API use, remote MCP servers, and CI/CD pipelines — benchmarked independently by Firecrawl, Scalekit, and CircleCI. For standard Claude Code sessions where lazy loading (v2.1.7+) already defers most schemas, the gain is smaller. mcp2cli applies most clearly when you drive MCP tools from scripts or agents that don’t have deferred loading built in.
MCP vs Skills
Skills (.claude/skills/*.md) are a third integration paradigm, distinct from both MCP servers and CLI tools. Conflating them with CLIs is the most common framing error in this space.
What each layer does:
- Skills encode how the agent should behave — step-by-step workflows, decision trees, and SOPs written in markdown. They are loaded on demand into the agent’s context and guide its reasoning without injecting external tool schemas.
- MCP servers provide structured access to external systems — APIs, databases, file systems — with typed tool interfaces the agent calls directly.
- CLI tools provide command-line access to external systems — the agent constructs shell commands and parses text output.
Skills and MCP address different layers, not the same problem. A skill can describe when and how to invoke an MCP tool (check this field, then call that tool) while the MCP server handles the actual connection. Asking “should I write a skill or an MCP server?” usually means the layers are being conflated.
The OAuth boundary
This is MCP’s clearest structural advantage over skills, and it’s not a matter of convenience.
A skill can instruct an agent to “authenticate with Google Drive before proceeding.” What it cannot do is hold a refresh token, complete a browser redirect, or manage a PKCE exchange. Those operations require server-side state, which a markdown file does not have.
Enterprise SaaS APIs — Google Workspace, Salesforce, Slack, GitHub — require OAuth 2.1 with refresh token rotation. When that is the authentication mechanism, MCP is not just more convenient: it is the only option that works without asking the user to paste credentials manually at the start of every session.
Practical test: if the service authenticates via an API key in a header, a skill or CLI can handle it. If it requires a browser redirect or server-held refresh token, that belongs in MCP.
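The practical test reduces to a lookup on the authentication mechanism. A minimal sketch, with hypothetical enum names:

```python
from enum import Enum


class Auth(Enum):
    API_KEY_HEADER = "API key in a header"
    BROWSER_REDIRECT = "OAuth browser redirect"
    SERVER_HELD_REFRESH_TOKEN = "server-held refresh token"


# Mechanisms that need state a markdown skill or a stateless script lacks.
NEEDS_SERVER_STATE = {Auth.BROWSER_REDIRECT, Auth.SERVER_HELD_REFRESH_TOKEN}


def integration_for(auth: Auth) -> str:
    """Apply the practical test: a static credential, or server-side state?"""
    return "MCP" if auth in NEEDS_SERVER_STATE else "skill or CLI"


if __name__ == "__main__":
    for auth in Auth:
        print(f"{auth.value}: {integration_for(auth)}")
```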
Community consensus (2026)
The debate has largely settled on a three-layer model rather than a binary choice:
- Skills handle what to do and when — workflow orchestration, decision guidance, reproducible agent behavior
- MCP handles connectivity and auth — external systems that require structured interfaces, OAuth, or enterprise observability
- CLI handles deterministic local operations — git, file ops, test runners, anything the model can drive directly from training knowledge
The convergence is now reaching the spec process: SEP-2640 (“Skills Over MCP”) proposes distributing skills as MCP resources, so users install workflows the same way they install tool servers. The two paradigms are being unified rather than forced to compete.
What practitioners say
A few representative perspectives from experienced Claude Code users:
“I prefer CLI for deterministic actions. For GitLab interactions I use glab (the GitLab MCP is too limited) wrapped in a custom CLI — usable by both humans and AI.” — practitioner
“On Claude Code with frontier models, fewer MCPs is better. I replaced playwright-mcp with playwright-cli + skill — faster and more effective. I still use context7-mcp only because I haven’t found a CLI equivalent.” — practitioner
“The CLI vs MCP debate is only happening among devs doing dev things. But there’s one fundamental constraint: you cannot propose a CLI solution to a non-technical user who just wants to use their tool simply.” — practitioner
“For enterprise industrialization, observability is non-negotiable. CLI on a local machine is a black box. MCP Remote gives you the logging that C-levels need to attribute investment.” — practitioner
“Frontier models are strong enough to drive a CLI directly. A weaker local model will struggle — that’s where MCP schemas earn their overhead.” — practitioner