4. Agents
📌 Section 4 TL;DR (60 seconds)
What are Agents: Specialized AI personas for specific tasks (think “expert consultants”)
When to create one:
- ✅ Task repeats often (security reviews, API design)
- ✅ Requires specialized knowledge domain
- ✅ Needs consistent behavior/tone
- ❌ One-off tasks (just ask Claude directly)
Quick Start:
- Create .claude/agents/my-agent.md
- Add YAML frontmatter (name, description, tools, model)
- Write instructions
- Use: @my-agent "task description"
Popular agent types: Security auditor, Test generator, Code reviewer, API designer
Read this section if: You have repeating tasks or need domain expertise
Skip if: All your tasks are one-off exploratory work
Reading time: 20 minutes · Skill level: Week 1-2 · Goal: Create specialized AI assistants
4.1 What Are Agents
Agents are specialized sub-processes that Claude can delegate tasks to.
Why Use Agents?
| Without Agents | With Agents |
|---|---|
| One Claude doing everything | Specialized experts for each domain |
| Context gets cluttered | Each agent has focused context |
| Generic responses | Domain-specific expertise |
| Manual tool selection | Pre-configured tool access |
Agent vs Direct Prompt
Direct Prompt:
You: Review this code for security issues, focusing on OWASP Top 10, checking for SQL injection, XSS, CSRF, and authentication vulnerabilities...

With Agent:
You: Use the security-reviewer agent to audit this code

The agent encapsulates all that expertise.
Built-in vs Custom Agents
| Type | Source | Example |
|---|---|---|
| Built-in | Claude Code default | Explore, Plan |
| Custom | Your .claude/agents/ | Backend architect, Code reviewer |
4.2 Creating Custom Agents
Agents are markdown files in .claude/agents/ with YAML frontmatter.
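To scaffold a new agent from the shell, create the directory and an empty agent file (the agent name here is just an example):

```shell
# Create the project-level agents directory and an empty agent file
mkdir -p .claude/agents
touch .claude/agents/security-reviewer.md
```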
Agent File Structure
```markdown
---
name: agent-name
description: Clear activation trigger (50-100 chars)
model: sonnet
tools: Read, Write, Edit, Bash, Grep, Glob
---

[Markdown instructions for the agent]
```
Frontmatter Fields
All official fields supported by Claude Code (source):
| Field | Required | Description |
|---|---|---|
| name | ✅ | Kebab-case identifier |
| description | ✅ | When to activate this agent (use “PROACTIVELY” for auto-invocation) |
| model | ❌ | sonnet (default), opus, haiku, or inherit |
| tools | ❌ | Allowed tools (comma-separated). Supports Task(agent_type) syntax to restrict spawnable subagents |
| disallowedTools | ❌ | Tools to deny, removed from inherited or specified list |
| permissionMode | ❌ | default, acceptEdits, dontAsk, bypassPermissions, or plan |
| maxTurns | ❌ | Maximum agentic turns before the subagent stops |
| skills | ❌ | Skills to preload into agent context at startup (full content injected, not just available) |
| mcpServers | ❌ | MCP servers for this subagent — server name strings or inline configs |
| hooks | ❌ | Lifecycle hooks scoped to this subagent (PreToolUse, PostToolUse, Stop) |
| memory | ❌ | Persistent memory scope: user, project, or local |
| background | ❌ | true to always run as a background task (default: false) |
| isolation | ❌ | worktree to run in a temporary git worktree (auto-cleaned if no changes) |
| color | ❌ | CLI output color for visual distinction (e.g., green, magenta) |
Memory scopes — choose based on how broadly the knowledge should apply:
| Scope | Storage | Use when |
|---|---|---|
| user | ~/.claude/agent-memory/<name>/ | Cross-project learning |
| project | .claude/agent-memory/<name>/ | Project-specific, shareable via git |
| local | .claude/agent-memory-local/<name>/ | Project-specific, not committed |
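As a sketch of how these fields combine, here is a hypothetical frontmatter (the agent name and field values are illustrative, not taken from the official docs):

```yaml
---
name: migration-runner
description: Use PROACTIVELY when running or reviewing database migrations
model: sonnet
tools: Read, Grep, Bash
disallowedTools: Write
permissionMode: default
maxTurns: 15
memory: project      # persists to .claude/agent-memory/migration-runner/
background: false
---
```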
Model Selection
| Model | Best For | Speed | Cost |
|---|---|---|---|
| haiku | Quick tasks, simple changes | Fast | Low |
| sonnet | Most tasks (default) | Balanced | Medium |
| opus | Complex reasoning, architecture | Slow | High |
4.3 Agent Template
Copy this template to create your own agent:
```markdown
---
name: your-agent-name
description: Use this agent when [specific trigger description]
model: sonnet
tools: Read, Write, Edit, Bash, Grep, Glob
skills: []
---

# Your Agent Name

## Role Definition

You are an expert in [domain]. Your responsibilities include:
- [Responsibility 1]
- [Responsibility 2]
- [Responsibility 3]

## Activation Triggers

Use this agent when:
- [Trigger 1]
- [Trigger 2]
- [Trigger 3]

## Methodology

When given a task, you should:
1. [Step 1]
2. [Step 2]
3. [Step 3]
4. [Step 4]

## Output Format

Your deliverables should include:
- [Output 1]
- [Output 2]

## Constraints

- [Constraint 1]
- [Constraint 2]

## Examples

### Example 1: [Scenario Name]

**User**: [Example prompt]

**Your approach**:
1. [What you do first]
2. [What you do next]
3. [Final output]
```
4.4 Best Practices
Do’s and Don’ts
| ✅ Do | ❌ Don’t |
|---|---|
| Make agents specialists | Create generalist agents |
| Define clear triggers | Use vague descriptions |
| Include concrete examples | Leave activation ambiguous |
| Limit tool access | Give all tools to all agents |
| Compose via skills | Duplicate expertise |
Specialization Over Generalization
Good: An agent for each concern
```
backend-architect → API design, database, performance
security-reviewer → OWASP, auth, encryption
test-engineer    → Test strategy, coverage, TDD
```
Bad: One agent for everything
```
full-stack-expert → Does everything (poorly)
```
Explicit Activation Triggers
Good description:

```yaml
description: Use when designing APIs, reviewing database schemas, or optimizing backend performance
```

Bad description:

```yaml
description: Backend stuff
```

Skill Composition
Instead of duplicating knowledge:
```yaml
# security-reviewer.md
skills:
  - security-guardian  # Inherits OWASP knowledge
```
Agent Validation Checklist
Before deploying a custom agent, validate against these criteria:
Efficacy (Does it work?)
- Tested on 3+ real use cases from your project
- Output matches expected format consistently
- Handles edge cases gracefully (empty input, errors, timeouts)
- Integrates correctly with existing workflows
Efficiency (Is it cost-effective?)
- <5000 tokens per typical execution
- <30 seconds for standard tasks
- Doesn’t duplicate work done by other agents/skills
- Justifies its existence vs. native Claude capabilities
Security (Is it safe?)
- Tools restricted to minimum necessary
- No Bash access unless absolutely required
- File access limited to relevant directories
- No credentials or secrets in agent definition
Maintainability (Will it last?)
- Clear, descriptive name and description
- Explicit activation triggers documented
- Examples show common usage patterns
- Version compatibility noted if framework-dependent
💡 Rule of Three: If an agent doesn’t save significant time on at least 3 recurring tasks, it’s probably over-engineering. Start with skills, graduate to agents only when complexity demands it.
Automated audit: Run /audit-agents-skills for a comprehensive quality audit across all agents, skills, and commands. Scores each file on 16 criteria with weighted grading (32 points for agents/skills, 20 for commands). See examples/skills/audit-agents-skills/ for the full scoring methodology.
Background Subagents
Subagents can run in the background without blocking the main session. This is useful for fire-and-forget tasks like running tests, linting, or notifications.
| Mode | Behavior | Use when |
|---|---|---|
| Default | Parent waits for agent output | Need result before continuing |
| Background | Agent runs in parallel, parent continues | Fire-and-forget (tests, linting, notifications) |
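For example, a hypothetical background test-runner might combine the background and isolation fields from §4.2 (the agent name and values are illustrative):

```yaml
---
name: test-runner
description: Use PROACTIVELY after code changes to run the test suite
model: haiku
tools: Read, Bash, Grep
background: true      # parent session continues while this runs
isolation: worktree   # runs in a temporary git worktree, auto-cleaned if no changes
---
```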
Managing background agents:
```
# List running agents + kill overlay
ctrl+f   # Opens agent manager overlay

# Cancel main thread only (background agents keep running)
ESC
ctrl+c
```
4.5 Agent Examples
Example 1: Code Reviewer Agent
```markdown
---
name: code-reviewer
description: Use for code quality reviews, security audits, and performance analysis
model: sonnet
tools: Read, Grep, Glob
skills:
  - security-guardian
---

# Code Reviewer

## Scope Definition

Perform comprehensive code reviews with isolated context, focusing on:
- Code quality and maintainability
- Security best practices (OWASP Top 10)
- Performance optimization
- Test coverage analysis

Scope: Code review analysis only. Provide findings without implementing fixes.

## Activation Triggers

Use this agent when:
- Completing a feature before PR (need fresh eyes on code)
- Reviewing someone else's code (isolated review context)
- Auditing security-sensitive code (security-focused scope)
- Analyzing performance bottlenecks (performance-focused scope)

## Methodology

1. **Understand Context**: Read the code and understand its purpose
2. **Check Quality**: Evaluate readability, maintainability, DRY principles
3. **Security Scan**: Look for OWASP Top 10 vulnerabilities
4. **Performance Review**: Identify potential bottlenecks
5. **Provide Feedback**: Structured report with severity levels

## Output Format

### Code Review Report

**Summary**: [1-2 sentence overview]

**Critical Issues** (Must Fix):
- [Issue with file:line reference]

**Warnings** (Should Fix):
- [Issue with file:line reference]

**Suggestions** (Nice to Have):
- [Improvement opportunity]

**Positive Notes**:
- [What was done well]
```
Example 2: Debugger Agent
```markdown
---
name: debugger
description: Use when encountering errors, test failures, or unexpected behavior
model: sonnet
tools: Read, Bash, Grep, Glob
---

# Debugger

## Scope Definition

Perform systematic debugging with isolated context:
- Investigate root causes, not symptoms
- Use evidence-based debugging approach
- Verify rather than assume (always review output—LLMs can make mistakes)

Scope: Debugging analysis only. Focus on root cause identification without context pollution from previous debugging attempts.

## Methodology

1. **Reproduce**: Confirm the issue exists
2. **Isolate**: Narrow down to smallest reproducible case
3. **Analyze**: Read code, check logs, trace execution
4. **Hypothesize**: Form theories about the cause
5. **Test**: Verify hypothesis with minimal changes
6. **Fix**: Implement the solution
7. **Verify**: Confirm fix works and doesn't break other things

## Output Format

### Debug Report

**Issue**: [Description]
**Root Cause**: [What's actually wrong]
**Evidence**: [How you know]
**Fix**: [What to change]
**Verification**: [How to confirm it works]
```
Example 3: Backend Architect Agent
```markdown
---
name: backend-architect
description: Use for API design, database optimization, and system architecture decisions
model: opus
tools: Read, Write, Edit, Bash, Grep
skills:
  - backend-patterns
---

# Backend Architect

## Scope Definition

Analyze backend architecture with isolated context, focusing on:
- API design (REST, GraphQL, tRPC)
- Database modeling and optimization
- System scalability
- Clean architecture patterns

Scope: Backend architecture analysis only. Focus on design decisions without frontend or DevOps considerations.

## Activation Triggers

Use this agent when:
- Designing new API endpoints (need architecture-focused analysis)
- Optimizing database queries (database scope isolation)
- Planning system architecture (system design scope)
- Refactoring backend code (backend-only scope)

## Methodology

1. **Requirements Analysis**: Understand the business need
2. **Architecture Review**: Check current system state
3. **Design Options**: Propose 2-3 approaches with trade-offs
4. **Recommendation**: Suggest best approach with rationale
5. **Implementation Plan**: Break down into actionable steps

## Constraints

- Follow existing project patterns
- Prioritize backward compatibility
- Consider performance implications
- Document architectural decisions
```
4.6 Advanced Agent Patterns
Tool SEO - Optimizing Agent Descriptions
The description field determines when Claude auto-activates your agent. Optimize it like SEO:
```yaml
# ❌ Bad description
description: Reviews code

# ✅ Good description (Tool SEO)
description: |
  Security code reviewer - use PROACTIVELY when:
  - Reviewing authentication/authorization code
  - Analyzing API endpoints
  - Checking input validation
  - Auditing data handling
  Triggers: security, auth, vulnerability, OWASP, injection
```
Tool SEO Techniques:
- “use PROACTIVELY”: Encourages automatic activation
- Explicit triggers: Keywords that trigger the agent
- Listed contexts: When the agent is relevant
- Short nicknames: sec-1, perf-a, doc-gen
Agent Weight Classification
| Category | Tokens | Init Time | Optimal Use |
|---|---|---|---|
| Lightweight | <3K | <1s | Frequent tasks, workers |
| Medium | 10-15K | 2-3s | Analysis, reviews |
| Heavy | 25K+ | 5-10s | Architecture, full audits |
Golden Rule: A lightweight agent used 100x > A heavy agent used 10x
The 7-Parallel-Task Method
Launch 7 scope-focused sub-agents in parallel for complete features:
```
┌─────────────────────────────────────────────────────────────┐
│ PARALLEL FEATURE IMPLEMENTATION                             │
│                                                             │
│ Task 1: Components  → Create React components               │
│ Task 2: Styles      → Generate Tailwind styles              │
│ Task 3: Tests       → Write unit tests                      │
│ Task 4: Types       → Define TypeScript types               │
│ Task 5: Hooks       → Create custom hooks                   │
│ Task 6: Integration → Connect with API/state                │
│ Task 7: Config      → Update configurations                 │
│                                                             │
│ All in parallel → Final consolidation                       │
└─────────────────────────────────────────────────────────────┘
```
Example Prompt:
```
Implement the "User Profile" feature using 7 parallel sub-agents:

1. COMPONENTS: Create UserProfile.tsx, UserAvatar.tsx, UserStats.tsx
2. STYLES: Define Tailwind classes in a styles file
3. TESTS: Write tests for each component
4. TYPES: Create types in types/user-profile.ts
5. HOOKS: Create useUserProfile and useUserStats hooks
6. INTEGRATION: Connect with existing tRPC router
7. CONFIG: Update exports and routing

Launch all agents in parallel.
```
Split Role Sub-Agents
Concept: Multi-perspective analysis in parallel.
Process:
```
┌─────────────────────────────────────────────────────────────┐
│ SPLIT ROLE ANALYSIS                                         │
│                                                             │
│ Step 1: Setup                                               │
│ └─ Activate Plan Mode (thinking enabled by default)         │
│                                                             │
│ Step 2: Role Suggestion                                     │
│ └─ "What expert roles would analyze this code?"             │
│    Claude suggests: Security, Performance, UX, etc.         │
│                                                             │
│ Step 3: Selection                                           │
│ └─ "Use: Security Expert, Senior Dev, Code Reviewer"        │
│                                                             │
│ Step 4: Parallel Analysis                                   │
│ ├─ Security Agent: [Vulnerability analysis]                 │
│ ├─ Senior Agent: [Architecture analysis]                    │
│ └─ Reviewer Agent: [Readability analysis]                   │
│                                                             │
│ Step 5: Consolidation                                       │
│ └─ Synthesize 3 reports into recommendations                │
└─────────────────────────────────────────────────────────────┘
```
Code Review Prompt (scope-focused):
```
Analyze this PR with isolated scopes:

1. Architecture Scope: Design patterns, SOLID principles, modularity
2. Security Scope: Vulnerabilities, injection risks, auth/authz flaws
3. Performance Scope: Database queries, algorithmic complexity, caching
4. Maintainability Scope: Code clarity, documentation, naming conventions
5. Testing Scope: Test coverage, edge cases, testability

Context: src/**, tests/**, only files changed in PR
```
UX Review Prompt (scope-focused):
```
Evaluate this interface with isolated scopes:

1. Visual Design Scope: Consistency with design system, spacing, typography
2. Usability Scope: Discoverability, user flow, cognitive load
3. Efficiency Scope: Keyboard shortcuts, power user features, quick actions
4. Accessibility Scope: WCAG 2.1 AA compliance, screen reader, keyboard nav
5. Responsive Scope: Mobile breakpoints, touch targets, viewport handling

Context: src/components/**, styles/**, only UI-related files
```
Production Example: Multi-Agent Code Review (Pat Cullen, Jan 2026):
Scope-focused agents for comprehensive PR review:
- Consistency Scope: Duplicate logic, pattern violations, DRY compliance (context: full PR diff)
- SOLID Scope: SRP violations, nested conditionals (>3 levels), cyclomatic complexity >10 (context: changed classes/functions)
- Defensive Code Scope: Silent catches, swallowed exceptions, hidden fallbacks (context: error handling code)
Key patterns (beyond generic Split Role):
- Pre-flight check: git log --oneline -10 | grep "Co-Authored-By: Claude" to detect follow-up passes and avoid repeating suggestions
- Anti-hallucination: Use Grep/Glob to verify patterns before recommending them (occurrence rule: >10 = established, <3 = not established)
- Reconciliation: Prioritize existing project patterns over ideal patterns, skip suggestions with documented reasoning
- Severity classification: 🔴 Must Fix (blockers) / 🟡 Should Fix (improvements) / 🟢 Can Skip (nice-to-haves)
- Convergence loop: Review → fix → re-review → repeat (max 3 iterations) until only optional improvements remain
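The occurrence rule can be checked mechanically before recommending a pattern. A minimal shell sketch (the useQuery pattern and the src/ path are placeholders for your own project):

```shell
# Count how often a candidate pattern already appears in the codebase.
# Occurrence rule: >10 = established, <3 = not established.
count=$(grep -rE "useQuery\(" src/ --include='*.ts' 2>/dev/null | wc -l)
if [ "$count" -gt 10 ]; then
  echo "established"
elif [ "$count" -lt 3 ]; then
  echo "not established"
else
  echo "inconclusive"
fi
```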
Production safeguards:
- Read full file context (not just diff lines)
- Conditional context loading based on diff content (DB queries → check indexes, API routes → check auth middleware)
- Protected files skip list (package.json, migrations, .env)
- Quality gates: tsc && lint validation before each iteration
Source: Pat Cullen’s Final Review
Implementation: See /review-pr advanced section, examples/agents/code-reviewer.md, guide/workflows/iterative-refinement.md (Review Auto-Correction Loop)
Named Perspective Agents
The guide lists “roleplaying expertise personas” as a bad reason to use agents (see §3.x, When NOT to use agents). Named Perspective Agents are a different pattern and should not be confused with it.
The distinction:
| Pattern | What it is | Problem |
|---|---|---|
| Persona roleplay (anti-pattern) | “You are a senior backend developer with 10 years of experience” | Generic role, adds nothing over a good prompt |
| Named Perspective | “Review from DHH’s perspective” | Encodes a specific, recognizable set of engineering opinions |
A Named Perspective Agent uses a well-known engineering name as a compressed prompt. Naming an agent “DHH” bundles the following without spelling it out: fat models, thin controllers, REST conventions over configuration, skepticism of premature abstraction, Rails pragmatism. The name is a shortcut to a distinct opinionated style, not a costume.
When it works: Only for engineers whose views Claude has been trained on and whose opinions map to a stable, recognizable style. DHH (Rails), Kent Beck (TDD, simplicity), Martin Fowler (refactoring, patterns) are good candidates. Random names are not.
Example (from Every.to compound-engineering plugin):
```yaml
---
name: dhh-reviewer
description: Review code from DHH's perspective. Prioritize Rails conventions, fat models, thin controllers, pragmatic REST, and skepticism of unnecessary abstraction.
allowed-tools: Read, Grep
---
```
The agent’s value is in surfacing a coherent perspective that might disagree with your default approach, not in simulating a person.
Caveat: Named Perspective Agents can drift as Claude’s training evolves. Treat the name as a convenient shorthand, not a guarantee that the agent will track a real person’s current opinions.
Source: Every.to compound-engineering plugin (2026)
Parallelization Decision Matrix
```
┌─────────────────────────────────────────────────────────────┐
│ PARALLELIZABLE?                                             │
│                                                             │
│              Non-destructive      Destructive               │
│              (read-only)          (write)                   │
│                                                             │
│ Independent  ✅ PARALLEL          ⚠️ SEQUENTIAL              │
│              Max efficiency       Plan Mode first           │
│                                                             │
│ Dependent    ⚠️ SEQUENTIAL        ❌ CAREFUL                 │
│              Order matters        Risk of conflicts         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
✅ Perfectly parallelizable:
```
"Search 8 different GitHub repos for best practices on X"
"Analyze these 5 files for vulnerabilities (without modifying)"
"Compare 4 libraries and produce a comparative report"
```
⚠️ Sequential recommended:
```
"Refactor these 3 files (they depend on each other)"
"Migrate DB schema then update models then update routers"
```
❌ Needs extra care:
```
"Modify these 10 files in parallel"
→ Risk: conflicts if files share imports/exports
→ Solution: Plan Mode → Identify dependencies → Sequence if needed
```
Multi-Agent Orchestration Pattern
```
┌─────────────────────────────────────────────────────────────┐
│ ORCHESTRATION PATTERN                                       │
│                                                             │
│                   ┌──────────────┐                          │
│                   │  Sonnet 4.5  │                          │
│                   │ Orchestrator │                          │
│                   └──────┬───────┘                          │
│                          │                                  │
│             ┌────────────┼────────────┐                     │
│             │            │            │                     │
│             ▼            ▼            ▼                     │
│        ┌─────────┐  ┌─────────┐  ┌─────────┐                │
│        │  Haiku  │  │  Haiku  │  │  Haiku  │                │
│        │ Worker1 │  │ Worker2 │  │ Worker3 │                │
│        └────┬────┘  └────┬────┘  └────┬────┘                │
│             │            │            │                     │
│             └────────────┼────────────┘                     │
│                          │                                  │
│                          ▼                                  │
│                   ┌──────────────┐                          │
│                   │  Sonnet 4.5  │                          │
│                   │  Validator   │                          │
│                   └──────────────┘                          │
│                                                             │
│ Cost: 2-2.5x cheaper than Opus everywhere                   │
│ Quality: Equivalent for most common tasks                   │
└─────────────────────────────────────────────────────────────┘
```
Tactical Model Selection Matrix
See Section 2.5 Model Selection & Thinking Guide for the canonical decision table with effort levels and cost estimates.
Cost Optimization Example:
Scenario: Refactoring 100 files
❌ Naive approach:
- Opus for everything
- Cost: ~$50-100
- Time: 2-3h

✅ Optimized approach:
- Sonnet: Analysis and plan (1x)
- Haiku: Parallel workers (100x)
- Sonnet: Final validation (1x)
- Cost: ~$5-15
- Time: 1h (parallelized)

Estimated savings: significant (varies by project)
The Self-Evolving Agent Pattern
An agent that updates its own skills after each execution. Instead of manually maintaining documentation, the agent reads the current state of its domain and rewrites the knowledge injected into itself.
When to use: Long-lived agents whose domain evolves — presentation editors, API clients tracking schema changes, agents managing living documents.
Core mechanism (in agent system prompt):
```markdown
### Step N: Self-Evolution (after every execution)

After completing your main task, update your preloaded skills to stay in sync:

1. Read the current state of [the domain you modified]
2. Update `.claude/skills/<your-skill>/SKILL.md` to reflect reality
3. Log what changed and why in a "## Learnings" section of this agent file

This prevents knowledge drift between what you know and what is.
```
Full example — a presentation curator agent that keeps its own layout/weight knowledge fresh:
```markdown
---
name: presentation-curator
description: PROACTIVELY use when updating slides, structure, or weights
tools: Read, Write, Edit, Grep, Glob
model: sonnet
color: magenta
skills:
  - presentation/slide-structure
  - presentation/styling
---

## Step 5: Self-Evolution (after every execution)

Read presentation/index.html and update your skills:
- slide-structure skill: update section ranges, weight table, slide count
- styling skill: update CSS patterns if new ones were introduced
- Append new findings to the "## Learnings" section below

## Learnings

_Each run appends findings here. Future invocations start informed._
- Slide badges are JS-injected — never hardcode them in HTML.
```
Why it works: The skills: frontmatter injects skill content at agent startup. By writing back to those files after each run, the agent’s next invocation starts with current knowledge. No human maintenance required.
Key constraints:
- Scope updates narrowly — only update what actually changed
- Keep a ## Learnings log so the agent builds cumulative knowledge over sessions
- Pair with memory: project for cross-session persistence of broader context
5. Skills
Quick jump: Two Kinds of Skills · Understanding Skills · Creating Skills · Skill Lifecycle · Skill Evals · Skill Template · Skill Examples
Note (January 2026): Skills and Commands are being unified. Both now use the same invocation mechanism (/skill-name or /command-name), share YAML frontmatter syntax, and can be triggered identically. The conceptual distinction (skills = knowledge modules, commands = workflow templates) remains useful for organization, but technically they’re converging. Create new ones based on purpose, not mechanism.
Reading time: 20 minutes · Skill level: Week 2 · Goal: Create, test, and manage reusable knowledge modules