
MCP vs CLI — Decision Guide

Last updated: May 2026

Interactive version with guidance table and practitioner quotes: cc.bruniaux.com/ecosystem/mcp-vs-cli/

The debate emerged from a rapid succession of interface paradigms: browser-based AI (2022-23), then AI in the IDE with MCP connecting agents to external services (2024-25), then full CLI agents that execute commands and write files without an intermediary layer (2025-26). That progression explains why the question exists at all.

This page compares two integration patterns for giving Claude Code access to external tools and services: MCP servers and CLI tools. Neither is universally better. The right choice depends on your context — and most real workflows end up using both.


MCP servers inject tool schemas into Claude’s context at session start. Claude sees a structured list of available tools with parameters, types, and descriptions. It then calls those tools natively, receiving structured responses.

CLI tools are shell commands that Claude invokes via Bash. Claude drives them the same way a developer would: constructing command strings, parsing text output. No schema injection at startup. The shell is the interface.
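To make the contrast concrete, here is a minimal sketch of the same action seen through both interfaces. The tool name `create_issue` and its fields are hypothetical; `inputSchema` is the field MCP uses to carry a tool's JSON Schema, and `gh issue create --title … --body …` is the existing GitHub CLI command. It illustrates the two interface shapes only, and is not how Claude Code handles either internally.

```python
from typing import Any

# Hypothetical MCP-style tool definition. "inputSchema" holds a JSON Schema
# object; the tool name and its fields are invented for illustration.
CREATE_ISSUE_TOOL: dict[str, Any] = {
    "name": "create_issue",
    "description": "Create an issue in a repository.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["title"],
    },
}


def missing_arguments(tool: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Return required argument names absent from a structured tool call."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]


def build_cli_command(title: str, body: str) -> list[str]:
    """Build the argv a shell-driven agent would run for the same action."""
    return ["gh", "issue", "create", "--title", title, "--body", body]
```

With the schema, a malformed call can be rejected before anything runs; with the CLI, correctness depends on the model knowing the command's flags.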


| MCP advantage | Detail |
| --- | --- |
| Structured interface | Tool schemas guide Claude precisely: fewer hallucinated flags or arguments |
| Complex auth | OAuth, token refresh, secrets rotation handled by the server, not the prompt |
| Structured output | JSON responses are directly parseable by Claude and downstream agents |
| Observability | Remote MCP servers can log every call, which is essential for enterprise usage tracking and ROI attribution |
| Distribution at scale | Update the server once, all connected clients get the change. No per-machine package management. |
| Non-technical users | Users who never touch a terminal can access tools transparently via MCP connectors |
| Weaker models | A structured schema compensates when the model is less capable of parsing CLI help text |

| CLI advantage | Detail |
| --- | --- |
| Zero context overhead | No schema injected at startup. Since v2.1.7, lazy loading closes most of the gap, but CLI is still the absolute minimum. |
| Deterministic actions | Explicit commands with predictable output are easier to audit and test |
| Human + AI use | The same CLI wrapper works for a developer running it manually and for Claude |
| Frontier models | Claude Opus/Sonnet 4.6 can drive complex CLIs (aws-cli, glab, gh) without a structured schema |
| Speed | No connection setup, no MCP handshake: direct subprocess execution |
| Simplicity | Easier to debug, log, and reason about than a remote server call chain |
| Skills encapsulation | A CLI wrapped in a skill is transparent to the user and keeps the tool logic version-controlled |

| MCP weakness | Detail |
| --- | --- |
| Schema token cost | Since v2.1.7, lazy loading (MCP Tool Search) means unused tools inject only their name, not their full schema. Cost is still non-zero: tool names load at startup, full schemas load on first use. The pre-v2.1.7 worst case (~55K tokens for a 5-server setup) now averages ~8.7K: an 85% reduction, but not zero. |
| Connection overhead | Session startup takes longer with many MCP servers connected |
| Debugging difficulty | Failures inside an MCP server are harder to trace than a failed shell command |
| Maintenance complexity | Running, updating, and securing remote MCP servers adds infrastructure |
| Overkill for simple APIs | A GitLab MCP that surfaces 20% of glab’s functionality is worse than glab itself |

| CLI weakness | Detail |
| --- | --- |
| No observability | Shell commands on a local machine are invisible to ops/management tooling |
| Distribution problem | Keeping CLIs updated across a team requires package management discipline (brew, scoop, etc.) |
| Weaker models struggle | A less capable model may hallucinate flags or misread help text; schemas help |
| No multi-agent structure | CLI output requires parsing; structured MCP responses are more reliable across agent-to-agent handoffs |
| Non-tech user barrier | A non-technical user cannot be expected to have a configured CLI environment |
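The "CLI output requires parsing" weakness is easier to weigh with an example of what that parsing involves. The sketch below converts the text of `git status --porcelain` (two status columns, a space, then the path) into structured records. The function name is hypothetical, and the parser deliberately ignores renames and quoted paths, so treat it as an illustration rather than a complete implementation.

```python
import json


def parse_porcelain(output: str) -> list[dict[str, str]]:
    """Turn `git status --porcelain` text into structured records."""
    records = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        # Columns 0-1 hold the two status codes; the path starts at column 3.
        records.append({"status": line[:2].strip(), "path": line[3:]})
    return records


if __name__ == "__main__":
    sample = " M src/app.py\n?? notes.txt\n"
    print(json.dumps(parse_porcelain(sample), indent=2))
```

Every CLI needs its own version of this step, which is the cost a structured MCP response avoids in agent-to-agent handoffs.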

Most production MCP servers for SaaS tools sit on top of existing REST or GraphQL APIs. The server translates tool calls into HTTP requests against those APIs, processes responses, and returns structured output to the agent. It does not add backend capabilities that the underlying API lacks.

Official documentation from four major providers confirms this directly:

  • Notion: “converted MCP tool calls into HTTP API calls to Notion’s public API” (Notion engineering blog)
  • Sentry: “middleware to the upstream Sentry API, optimized for coding assistants like Cursor and Claude Code” (sentry-mcp README)
  • Slack: “a wrapper around an external API, like Slack” (Slack developer docs)
  • GitHub: “integrates with GitHub, allowing LLMs to interact with repositories via the GitHub API” (github-mcp-server README)

The practical consequence: a CLI script calling the same REST or GraphQL API has the same capabilities. MCP and CLI reach the same backend, with the same credentials, triggering the same operations. The difference is the interface layer, not what the backend can do.
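As a rough illustration of that equivalence, the script below prepares the kind of authenticated request an MCP server would issue on the agent's behalf. The host `api.example.com`, the path, and the `EXAMPLE_API_TOKEN` variable are all placeholders, not a real service; only the standard library is used, and the request is built without being sent.

```python
import os
import urllib.request


def build_request(path: str) -> urllib.request.Request:
    """Prepare an authenticated GET against a hypothetical REST API."""
    token = os.environ.get("EXAMPLE_API_TOKEN", "")  # hypothetical env var
    return urllib.request.Request(
        url=f"https://api.example.com{path}",  # placeholder host
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


if __name__ == "__main__":
    request = build_request("/issues?state=open")
    # urllib.request.urlopen(request) would send it; omitted here because
    # the host is a placeholder.
    print(request.get_method(), request.full_url)
```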

What MCP adds that a CLI cannot replicate:

  • OAuth token management: server-held auth with browser redirect flows (Slack, Google Drive, Figma, hosted Notion). A CLI can hold an API key in an env var, but not a refresh token or a PKCE exchange, which require server-side state.
  • LLM-tuned schemas: curated tool selection, not the full API surface, with parameter types and descriptions calibrated for agent reasoning.
  • Centralized hosting: one deployment serves many clients; CLIs require per-machine installation and per-user credential configuration.
  • Usage attribution: remote MCP servers associate each tool call with a user and session, feeding observability dashboards. Local CLI calls on developer machines are invisible.

This sharpens the decision criterion. If a service authenticates via API key or environment token, a CLI calling the same API is functionally equivalent to an MCP server. The question becomes whether you need OAuth, centralized observability, or cross-client standardization. If none of those apply, the CLI avoids the schema overhead and reaches the same backend.
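One way to read this criterion is as a small predicate. The function and its argument names are hypothetical, and the three auth labels are simply a convenient encoding of the cases discussed above.

```python
def needs_mcp(
    auth: str,
    central_observability: bool = False,
    cross_client_standard: bool = False,
) -> bool:
    """Return True when the criterion above points toward an MCP server.

    `auth` is one of "api_key", "env_token", or "oauth" (labels chosen for
    this sketch only).
    """
    if auth not in {"api_key", "env_token", "oauth"}:
        raise ValueError(f"unknown auth type: {auth!r}")
    return auth == "oauth" or central_observability or cross_client_standard
```

For example, `needs_mcp("api_key")` is false (a CLI reaches the same backend), while `needs_mcp("api_key", central_observability=True)` is true.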


Before asking “MCP or CLI?”, answer these four questions. They rank from most to least constraining.

1. Who is the user?

This is the dominant variable. Everything else is secondary.

  • Non-technical user (using a chat interface, no terminal) → MCP or skill-encapsulated CLI. You cannot expose a raw CLI to a non-dev user. Connectors must be MCP-based or wrapped invisibly in a skill that handles the CLI internally.
  • Technical user / developer → continue to question 2.

2. How capable is the model?

  • Frontier model (Claude Opus/Sonnet 4.6) → strong enough to drive complex CLIs directly. A structured MCP schema adds overhead without proportional benefit.
  • Smaller or local model (Qwen, Mistral, lighter deployments) → structured MCP schemas compensate for weaker CLI parsing ability. MCP is more reliable here.

3. Does your organization need observability?

  • Yes (enterprise, C-level reporting, compliance, ROI attribution on AI spend) → MCP Remote server. Local CLI calls are invisible. A remote MCP server can log every tool invocation, associate it with a user, and feed dashboards. You cannot replicate this with CLIs on local machines.
  • No (individual dev, local workflow) → observability is not a constraint. CLI is fine.

4. How stable is the tool’s interface?

  • Stable API (mature tool, versioned interface) → MCP investment pays off over time.
  • Rapidly changing or thin wrapper → CLI is cheaper to maintain. A hand-rolled glab wrapper that exposes only the 5 commands you actually use is more durable than a GitLab MCP that duplicates the full API surface.
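The four questions can be walked in order with a few lines of code. This is one possible reading, not a rule: the `Context` fields and the returned labels are invented for the sketch, and it reports the leaning of each question rather than collapsing them into a single verdict, since the guide treats these as directional defaults.

```python
from dataclasses import dataclass


@dataclass
class Context:
    non_technical_user: bool
    frontier_model: bool
    needs_observability: bool
    stable_api: bool


def leanings(ctx: Context) -> list[str]:
    """Walk the four questions in order, most constraining first."""
    if ctx.non_technical_user:
        # Question 1 settles the matter for non-technical users.
        return ["Q1: MCP or skill-encapsulated CLI"]
    result = ["Q1: technical user, continue"]
    result.append("Q2: CLI" if ctx.frontier_model else "Q2: MCP")
    result.append("Q3: MCP Remote" if ctx.needs_observability else "Q3: CLI is fine")
    result.append("Q4: MCP" if ctx.stable_api else "Q4: CLI")
    return result
```

A solo developer on a frontier model with no reporting needs and a fast-moving tool gets CLI at every step; flipping `needs_observability` changes only the third answer, which is the point at which the organization, not the developer, decides.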

Quick reference — not rules, but directional defaults.

| Situation | Lean toward | Rationale |
| --- | --- | --- |
| Non-technical user, chat interface | MCP / Skill | CLI is inaccessible; connectors must be invisible |
| Frontier model (Claude 4.x), developer workflow | CLI | Model handles it natively; schemas are overhead |
| Smaller/local model | MCP | Schema guides the model reliably |
| Enterprise, observability required | MCP Remote | Only way to log, attribute, and report on usage |
| Team distribution (10+ devs) | MCP | Central update vs per-machine CLI maintenance |
| Individual dev, local machine | CLI or skill | Simpler, faster, no infrastructure |
| Deterministic actions (git, CI, deploy) | CLI | Explicit commands, predictable output, auditable |
| Complex auth (OAuth, token refresh) | MCP | Server handles auth; CLI would require credential plumbing |
| Tight context budget / many tools loaded | CLI | Still the minimum-overhead option. Lazy loading (v2.1.7+) reduces MCP cost significantly, but CLI has zero schema cost by design. |
| Agent-to-agent structured output | MCP | JSON responses are more reliable than parsed CLI text |
| Debugging / prototyping a new integration | CLI | Easier to inspect, faster to iterate |
| Browser automation (non-frontier model) | MCP | Playwright MCP structures interaction reliably |
| Browser automation (frontier model, Claude Code) | CLI + skill | playwright-cli + skill reported faster and more efficient in practice |
| GitLab / GitHub access | CLI (glab, gh) | Official CLIs are richer than most MCP wrappers |
| Documentation lookup (Context7) | MCP | No CLI equivalent; structured doc retrieval has no shell analog |

The table below applies the four decision dimensions to the 18 most commonly discussed MCP servers. “Verdict” is the default for a developer using a frontier model on a local machine. Your context (non-technical users, enterprise observability, or a smaller model) may shift any row toward MCP.

| MCP Server | Verdict | CLI Alternative | Reason |
| --- | --- | --- | --- |
| GitHub MCP | Use CLI | gh | gh covers the full API surface; model knows it from training; official GitHub MCP was archived |
| GitLab MCP | Use CLI | glab | Official CLI is richer than the MCP wrapper; practitioner consensus confirms |
| Git MCP (Anthropic) | Use CLI | git | Git is the CLI the model knows best; MCP schema adds cost without structural benefit on frontier models |
| Filesystem MCP | Use CLI | cat, ls, find | Shell commands are universal; no benefit from schema overhead |
| Docker MCP | Use CLI | docker | Docker CLI is universally known; no widely adopted MCP adds comparable value |
| AWS MCP | Use CLI | aws-cli | aws-cli v2 covers the full surface; model drives it natively from training knowledge |
| Terraform MCP | Use CLI | terraform | Deterministic plan/apply workflow; CLI output is structured and auditable |
| Semgrep MCP | Use CLI | semgrep | Mature CLI, well-documented; MCP adds value mainly in CI/CD observability contexts |
| Playwright MCP | Depends | playwright-cli + skill | Frontier model: CLI + skill is faster. Smaller model: MCP structures browser interaction reliably |
| Kubernetes MCP | Depends | kubectl | Auth complexity and multi-cluster setups favor MCP; simple operations favor kubectl |
| Vercel MCP | Depends | vercel CLI | CLI for deploy, env, and domains; MCP for dashboard integration and team workflow comments |
| Sentry MCP | Use MCP | sentry-cli (CI/CD scoped) | sentry-cli handles releases, source map uploads, and CI/CD ops, but has no equivalent for interactive issue querying. MCP provides structured error triage for coding agents. |
| Slack MCP | Use MCP | none | OAuth required; no practical CLI for workspace access from an agent |
| Notion MCP | Use MCP | none | OAuth required; API-key access is limited to integrations, not user-scoped workspace access |
| Google Drive MCP | Use MCP | none | OAuth 2.1 with refresh token rotation; cannot be replicated by a skill or CLI |
| Figma MCP | Use MCP | none | OAuth required; design file access has no CLI equivalent |
| Linear MCP | Use MCP | none | MCP handles GraphQL complexity; structured project management without raw API calls |
| Context7 MCP | Use MCP | none | No CLI equivalent for curated, version-specific doc retrieval |

The pattern: if the service has a mature CLI the model knows from training, use the CLI. If the service requires OAuth or has no CLI, use MCP. “Depends” means the decision hinges on model capability or specific workflow needs.

Take the interactive quiz (6 questions, under 1 minute): cc.bruniaux.com/mcp-or-cli/


Most production workflows don’t choose one. They use both, with each covering the layer it handles best.

A practical example (from practitioners):

  • Inner layer (local dev iteration, git, file ops, shell scripts) → CLI, fast, deterministic, no overhead
  • Outer layer (CI/CD, shared infrastructure, cross-team services) → MCP Remote, observable, centralized, scalable
  • Skill layer (user-facing actions, CLI tools encapsulated for non-tech users) → CLIs wrapped in skills, transparent to the end user

The mistake is applying one answer to every layer. A solo developer building a Claude Code workflow for themselves should mostly use CLIs. A team deploying an AI assistant to non-technical colleagues should mostly use MCP.


Token cost of MCP schemas — what the numbers look like


Since v2.1.7 (January 2026), Claude Code uses MCP Tool Search (lazy loading) by default. This changes the token math significantly, but does not eliminate schema cost entirely.

How lazy loading works: instead of injecting all tool schemas at session start, Claude receives only tool names in an <available-deferred-tools> block. Full schemas are fetched via ToolSearch only when Claude decides to call a specific tool. Unused tools in a session cost only their name in context (~0 schema tokens), not the full definition.
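The mechanism can be modelled in a few lines. The class below is a toy illustration, not Claude Code's implementation: the class and method names are invented, and character counts stand in as a crude proxy for tokens.

```python
class DeferredToolRegistry:
    """Toy model of lazy loading: names up front, schemas on first use."""

    def __init__(self, schemas: dict[str, str]) -> None:
        self._schemas = schemas         # full definitions, kept out of context
        self._loaded: set[str] = set()  # tools whose schema has entered context

    def startup_context(self) -> list[str]:
        """Only tool names are visible at session start."""
        return sorted(self._schemas)

    def use(self, name: str) -> str:
        """Fetch a schema on first use; repeat calls add no further cost."""
        schema = self._schemas[name]  # raises KeyError for unknown tools
        self._loaded.add(name)
        return schema

    def context_chars(self) -> int:
        """Characters currently in context: all names plus loaded schemas."""
        names = sum(len(name) for name in self._schemas)
        loaded = sum(len(self._schemas[name]) for name in self._loaded)
        return names + loaded
```

A registry holding ten tools costs only ten names until `use()` is called; each distinct tool used afterwards adds its schema once, which is the pay-per-use behaviour described above.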

Measured impact (Anthropic benchmarks, 5-server setup):

| Scenario | Token overhead | Note |
| --- | --- | --- |
| Before v2.1.7 (eager loading) | ~55,000 tokens | All schemas preloaded |
| After v2.1.7 (lazy loading) | ~8,700 tokens | 85% reduction |
| CLI (no MCP) | ~0 tokens | Baseline |

The old worst-case claim of “500-2,000 tokens per server” described eager loading, which is no longer the default. With lazy loading, the cost per unused server is near zero. The cost per used server (~600 tokens per tool schema loaded on demand) remains real, but is now pay-per-use rather than always-on.

What still adds overhead even with lazy loading:

  • Tool names are still injected at startup (one line per tool per server)
  • Schemas load at first invocation — long sessions using many tools accumulate cost
  • Connection setup per server is unchanged (latency, not tokens)
  • Many connected MCP servers still means more names in context, even if schemas stay deferred

Configuration (v2.1.9+): the ENABLE_TOOL_SEARCH environment variable controls the threshold. auto:N triggers lazy loading when MCP tools exceed N% of context (default 10%).
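The threshold rule can be sketched as follows. Only the `auto:N` form and the 10% default come from the description above; the function name, the treatment of a bare `auto` as the default, and the 200,000-token context window used in the example are assumptions made for illustration, and this is not Claude Code's actual parsing logic.

```python
def should_defer(tool_tokens: int, context_window: int, setting: str = "auto:10") -> bool:
    """Decide whether tool schemas would be deferred under an auto:N setting."""
    mode, _, value = setting.partition(":")
    if mode != "auto":
        raise ValueError(f"this sketch only models auto:N, got {setting!r}")
    percent = float(value) if value else 10.0  # assumed: bare "auto" means 10%
    return tool_tokens > context_window * percent / 100


if __name__ == "__main__":
    # Assumed 200,000-token window: 55,000 tokens of schemas is 27.5% of it.
    print(should_defer(55_000, 200_000))
    # Lowering the threshold makes a smaller tool set qualify for deferral.
    print(should_defer(8_000, 200_000, "auto:3"))
```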

Mitigation strategies (still relevant, lower urgency):

  • Load MCP servers selectively per project (project-level config vs global config)
  • Use CLI tools for high-frequency tight loops where any overhead compounds (compile → test → fix)
  • Monitor token usage per session to identify which schemas are being loaded at invocation time
  • Consider a CLI wrapper for tools you use constantly but don’t need structured output from

| Tool | What it does | Status |
| --- | --- | --- |
| RTK (Rust Token Killer) | Filters CLI output before it reaches Claude’s context; reduces response verbosity, not schema overhead | Production-ready, actively maintained |
| MCPorter (steipete) | TypeScript runtime for calling MCP servers from scripts, generating CLI wrappers, and emitting typed TS clients. Useful for testing MCP servers and writing hooks that need MCP access. | 3K stars, MIT, 2+ weeks, ready to use |
| mcp2cli (knowsuchagency) | Converts MCP/OpenAPI/GraphQL to runtime CLI, eliminating schema injection. Benchmarked at 32× token reduction on the 43-tool GitHub MCP server (44K → 1.4K tokens). | ~1.9K stars, Show HN Best of March 2026; production-viable for remote MCP servers with 10+ tools. See full breakdown. |

Note on mcp2cli: the token savings are real for direct API use, remote MCP servers, and CI/CD pipelines — benchmarked independently by Firecrawl, Scalekit, and CircleCI. For standard Claude Code sessions where lazy loading (v2.1.7+) already defers most schemas, the gain is smaller. mcp2cli applies most clearly when you drive MCP tools from scripts or agents that don’t have deferred loading built in.
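Schema cost is only half of the token picture; the other half is how much command output flows back into context. The general idea of filtering CLI output before the model sees it can be sketched as below. This is not how RTK works internally; the function name and the filtering rules (drop blank lines, collapse immediate repeats, cap the line count) are assumptions chosen to keep the example short.

```python
def compact_output(text: str, max_lines: int = 20) -> str:
    """Shrink command output before handing it to the model."""
    kept: list[str] = []
    for line in text.splitlines():
        stripped = line.rstrip()
        if not stripped:
            continue  # drop blank lines
        if kept and kept[-1] == stripped:
            continue  # drop a repeat of the previously kept line
        kept.append(stripped)
    if len(kept) > max_lines:
        omitted = len(kept) - max_lines
        kept = kept[:max_lines] + [f"... ({omitted} more lines omitted)"]
    return "\n".join(kept)
```

A wrapper script can pipe a noisy build or test command through a filter like this, which fits the compile → test → fix loops where overhead compounds.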


Skills (.claude/skills/<name>/SKILL.md) are a third integration paradigm, distinct from both MCP servers and CLI tools. Conflating them with CLIs is the most common framing error in this space.

What each layer does:

  • Skills encode how the agent should behave — step-by-step workflows, decision trees, and SOPs written in markdown. They are loaded on demand into the agent’s context and guide its reasoning without injecting external tool schemas.
  • MCP servers provide structured access to external systems — APIs, databases, file systems — with typed tool interfaces the agent calls directly.
  • CLI tools provide command-line access to external systems — the agent constructs shell commands and parses text output.

Skills and MCP address different layers, not the same problem. A skill can describe when and how to invoke an MCP tool (check this field, then call that tool) while the MCP server handles the actual connection. Asking “should I write a skill or an MCP server?” usually means the layers are being conflated.

OAuth handling is MCP’s clearest structural advantage over skills, and it’s not a matter of convenience.

A skill can instruct an agent to “authenticate with Google Drive before proceeding.” What it cannot do is hold a refresh token, complete a browser redirect, or manage a PKCE exchange. Those operations require server-side state, which a markdown file does not have.

Enterprise SaaS APIs — Google Workspace, Salesforce, Slack, GitHub — require OAuth 2.1 with refresh token rotation. When that is the authentication mechanism, MCP is not just more convenient: it is the only option that works without asking the user to paste credentials manually at the start of every session.

Practical test: if the service authenticates via an API key in a header, a skill or CLI can handle it. If it requires a browser redirect or server-held refresh token, that belongs in MCP.

The debate has largely settled on a three-layer model rather than a binary choice:

  • Skills handle what to do and when — workflow orchestration, decision guidance, reproducible agent behavior
  • MCP handles connectivity and auth — external systems that require structured interfaces, OAuth, or enterprise observability
  • CLI handles deterministic local operations — git, file ops, test runners, anything the model can drive directly from training knowledge

The convergence is now being proposed at the spec level: SEP-2640 (“Skills Over MCP”) proposes distributing skills as MCP resources, so users install workflows the same way they install tool servers. The two paradigms are being unified rather than forced to compete.


A few representative perspectives from experienced Claude Code users:

“I prefer CLI for deterministic actions. For GitLab interactions I use glab (the GitLab MCP is too limited) wrapped in a custom CLI — usable by both humans and AI.” — practitioner

“On Claude Code with frontier models, fewer MCPs is better. I replaced playwright-mcp with playwright-cli + skill — faster and more effective. I still use context7-mcp only because I haven’t found a CLI equivalent.” — practitioner

“The CLI vs MCP debate is only happening among devs doing dev things. But there’s one fundamental constraint: you cannot propose a CLI solution to a non-technical user who just wants to use their tool simply.” — practitioner

“For enterprise industrialization, observability is non-negotiable. CLI on a local machine is a black box. MCP Remote gives you the logging that C-levels need to attribute investment.” — practitioner

“Frontier models are strong enough to drive a CLI directly. A weaker local model will struggle — that’s where MCP schemas earn their overhead.” — practitioner

