Agent Teams Quick Start Guide
Practical guide for using agent teams in your projects
Reading time: 8-10 min | Full documentation: Agent Teams (30 min overview)
What is This?
You know agent teams exist. You’ve read the theory. But when should you actually use them in your projects?
This guide gives you:
- ✅ 5-minute setup (environment → first test)
- ✅ 4 copy-paste patterns for real projects (Guide + RTK)
- ✅ Decision matrix (when YES, when NO)
- ✅ Metrics to measure ROI
- ✅ Red flags to avoid waste
Skip if: You want theory → Read Agent Teams full doc instead.
Table of Contents
- 5-Minute Setup
- Patterns for Your Projects
- Decision Matrix: When to Use
- Minimal Workflow Template
- Success Metrics
- Limitations & Red Flags
1. 5-Minute Setup
Step 1: Prerequisites Check
```bash
# Check Claude Code version (v2.1.32+ required)
claude --version

# Check model availability
claude
> /model opus
# Should show: "Model changed to opus (claude-opus-4-6-20250624)"
```
Minimum requirements:
- Claude Code v2.1.32+
- Opus 4.6 model
- Git repository (agent teams use git for coordination)
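
Agent teams coordinate through git, so it’s worth confirming you’re actually inside a repository before enabling the feature; a minimal check:

```bash
# Agent teams use git for coordination, so confirm you're inside a repository
git rev-parse --is-inside-work-tree 2>/dev/null \
  || echo "Not in a git repo - run 'git init' or cd into your project first"
```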
Step 2: Enable Feature
```bash
# Set environment variable (add to ~/.bashrc or ~/.zshrc for persistence)
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

# Launch Claude Code
claude
```
Step 3: Verification
```
> Are agent teams enabled?
```
Expected response:
```
Yes, agent teams are enabled in this session. I can create teams
of agents to work in parallel on complex tasks using:
- Multi-agent coordination
- Git-based task claiming
- Autonomous team coordination
```
If disabled: Check the environment variable is set (echo $CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS).
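
If the answer is negative, a quick shell check (a minimal sketch using the flag from Step 2) confirms whether the variable survived into the current session:

```bash
# Confirm the feature flag is visible in the current shell; re-export if it is missing
if [ -z "$CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS" ]; then
  export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
  echo "Flag was missing - exported for this session; restart claude to pick it up"
fi
```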
Step 4: First Test (2 agents)
```
> Create a simple test team to analyze this README:
> - Agent 1: Check structure (sections, TOC, badges)
> - Agent 2: Check content quality (clarity, examples, completeness)
```
What happens:
- Claude spawns 2 agents (you’ll see “Creating team…” message)
- Agents work in parallel (you may see “Idle” messages - this is normal)
- Claude presents consolidated findings (convergence + unique insights)
Navigation:
- Shift+Down: Cycle through teammate outputs (in-process mode)
- Main view: Consolidated synthesis
Duration: 1-2 min for a simple 2-agent task.
2. Patterns for Your Projects
2.1 Claude Code Guide - Pre-Release Review
Use case: Systematic audit before a version bump to catch consistency issues (broken links, desynced counts, wrong versions)
Trigger: Before every /release command execution
Duration: 3-5 min
Team composition:
```
Team: pre-release-audit (3 agents)
├─ accuracy-auditor    → Verify claims, stats, versions, external links
├─ consistency-checker → Check template count, eval count, guide lines match across files
└─ breaking-checker    → Identify breaking changes vs last release
```
Copy-paste prompt:
```
> Create a pre-release audit team:
> - Accuracy: Check all claims, stats, version numbers, external links in CHANGELOG.md, README.md, and guide/ultimate-guide.md
> - Consistency: Verify template count (count examples/ files), eval count (count docs/resource-evaluations/ files), guide lines (wc -l guide/ultimate-guide.md) match across README.md, guide/cheatsheet.md, machine-readable/reference.yaml
> - Breaking: Identify breaking changes vs v3.23.0 by analyzing CHANGELOG.md [Unreleased] section
```
ROI: Catches bugs like:
- LinkedIn URL corruption (caught in real audit today)
- Desync between guide/landing (template counts)
- Wrong version numbers in multiple files
- Broken external links
When to skip: Simple typo fixes (no version/count/link changes).
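
Before spinning up the team, a quick manual sanity check can already hint at version desyncs. The sketch below assumes the version string lives in a VERSION file (as referenced in pattern 2.2); adjust paths to your layout:

```bash
# Hypothetical spot-check: is the release version present in the files the agents will audit?
VERSION=$(cat VERSION)   # assumes the version string lives in a VERSION file, as in pattern 2.2
grep -n "$VERSION" README.md CHANGELOG.md guide/cheatsheet.md machine-readable/reference.yaml
# Any file missing from the output is a candidate for a version desync
```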
Real example (test from 2026-02-08):
```
Findings:
- Accuracy: 1 critical (LinkedIn URL malformed), 3 warnings (external links to verify)
- Consistency: 2 criticals (template count mismatch README vs landing, line numbers outdated in reference.yaml)
- Breaking: 0 breaking changes detected vs v3.23.0 (clean release)

Convergence: 2 agents flagged template count issue (high confidence)
Time: 4 min 12 sec
Verdict: ✅ High value, found 3 criticals that would've shipped
```
2.2 Claude Code Guide - Landing Sync
Use case: Verify guide/landing synchronization (version, counts, content) without running the manual script
Trigger: After significant guide modifications (version bump, template additions, FAQ changes)
Duration: 2-3 min
Team composition:
```
Team: landing-sync (2 agents)
├─ guide-scanner   → Extract version, template count, eval count, guide lines, FAQ content
└─ landing-scanner → Compare with index.html, examples.html, check sync status
```
Copy-paste prompt:
```
> Validate guide/landing synchronization:
> - Guide scanner: Extract version from VERSION file, template count (find examples/ -type f | wc -l), eval count (find docs/resource-evaluations/ -name "*.md" | wc -l), guide lines (wc -l guide/ultimate-guide.md), FAQ entries from README.md
> - Landing scanner: Check /Users/florianbruniaux/Sites/perso/claude-code-ultimate-guide-landing/index.html and examples.html for version in footer+FAQ, template count in badges, eval count, guide lines approximation (~9800+)
>
> Report: Synced ✅ / Mismatches with line numbers
```
ROI:
- Zero desync shipped to production
- Avoids manual ./scripts/check-landing-sync.sh execution
- Faster than script (2 min vs 5 min to run script + fix)
When to skip: Pure code changes (no docs/counts affected).
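
To sanity-check what the guide-scanner reports (or to spot-check by hand), the same shell commands embedded in the prompt can be run directly from the repo root:

```bash
# Manual equivalents of what the guide-scanner extracts (commands taken from the prompt above)
cat VERSION                                            # current guide version
find examples/ -type f | wc -l                         # template count
find docs/resource-evaluations/ -name "*.md" | wc -l   # eval count
wc -l guide/ultimate-guide.md                          # guide line count
```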
Success criteria:
```
✅ All synced: Version, template count, eval count match
⚠️ Mismatch: Specific line numbers in index.html to fix
```
2.3 Claude Code Guide - Multi-File Doc Update
Use case: Add new feature documentation across multiple files (ultimate-guide.md, reference.yaml, README.md) with cross-reference consistency
Trigger: New feature section (>50 lines), breaking changes, architecture updates
Duration: 5-8 min
Team composition:
```
Team: doc-update (3 agents)
├─ content-writer       → Write main section in ultimate-guide.md with examples
├─ index-updater        → Update TOC, reference.yaml entries, README navigation
└─ consistency-checker  → Verify cross-references, line numbers, anchors work
```
Copy-paste prompt:
```
> Update documentation for new "[FEATURE NAME]" feature:
> - Content Scope: Write section [X.Y] in guide/ultimate-guide.md with:
>   - Overview (what/why/when)
>   - 2-3 concrete examples
>   - Best practices + gotchas
>   - Links to related sections
>   Context: guide/ultimate-guide.md section [X.Y] only
> - Index Scope: Update:
>   - guide/ultimate-guide.md TOC (add section [X.Y])
>   - machine-readable/reference.yaml (add entry with line numbers)
>   - README.md navigation (add link if major feature)
>   Context: Index files only (TOC, reference.yaml, README.md)
> - Consistency Scope: Verify:
>   - All cross-references resolve correctly
>   - Line numbers in reference.yaml match actual content
>   - Anchors in README point to correct sections
>   - No broken internal links
>   Context: All modified files for cross-reference validation
```
ROI:
- Zero broken links (consistency-checker catches all)
- 60% faster than sequential (write → index → verify)
- Parallel work reduces waiting time
When to skip: Single-file edits (<50 lines), no cross-references needed.
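
For a rough manual fallback on the consistency scope, a couple of greps surface the spots that typically drift after a multi-file update. These are illustrative commands, not the agents’ actual method:

```bash
# Illustrative spot-checks for the consistency scope (not the agents' actual method)
grep -nF "](#" README.md guide/ultimate-guide.md             # internal anchor links to verify by hand
grep -n "ultimate-guide.md" machine-readable/reference.yaml  # entries whose line numbers may need a refresh
```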
Real example (Agent Teams section addition):
```
Task: Add "Agent Teams" as section 9.20 (300 lines)
Files touched: 3 (ultimate-guide.md, reference.yaml, README.md)
Sequential estimate: 12-15 min (write → index → verify)
Agent Teams: 7 min 30 sec (3 agents parallel)
Savings: 40% time + zero manual cross-ref checks
```
2.4 RTK - Security PR Review
Use case: Review an external contributor’s PR for security issues (injection, token leaks), Rust idioms, and performance
Trigger: PR opened by non-core contributor, PR touches sensitive code (auth, external commands, regex)
Duration: 5-8 min
Team composition:
```
Team: security-pr-review (3 agents)
├─ rust-expert       → Check ownership patterns, error handling (anyhow/thiserror), idiomatic code
├─ security-auditor  → Scan for injection risks, token leaks, input sanitization
└─ perf-analyzer     → Review allocations, async patterns, compiled regex
```
Copy-paste prompt:
```
> Review PR #[NUMBER] with scope-focused analysis:
> - Rust Scope: Check:
>   - Ownership patterns (prefer &str over String, minimize clones)
>   - Error handling (anyhow::Result with .context(), no unwrap outside tests)
>   - Idiomatic code (impl after type, #[cfg(test)] mod tests)
>   - Clippy compliance (zero warnings)
>   Context: All modified .rs files
> - Security Scope: Scan for:
>   - Command injection (shell escapes, argument sanitization)
>   - Token/credential leaks (hardcoded secrets, logs, error messages)
>   - Input sanitization (path traversal, regex DoS)
>   - File operations (path validation, permissions)
>   Context: Input handling, auth, file I/O code
> - Performance Scope: Review:
>   - Unnecessary allocations (String::from vs &str)
>   - Async patterns (spawn_blocking for CPU-bound work)
>   - Compiled regex (lazy_static! for hot paths)
>   - Algorithm complexity (O(n) vs O(n²))
>   Context: Hot paths, loops, async functions
```
ROI:
- Blind-spot detection: Security + Rust + Perf = areas a solo reviewer would miss
- Consistent review quality (not dependent on reviewer mood/focus)
- Faster than sequential (3 agents parallel vs 3-pass review)
When to skip: Internal PRs from trusted contributors, trivial changes (docs, comments, tests only).
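
Alongside the team run, a few cheap local checks cover the mechanical parts of the Rust and security scopes. These are generic commands (cargo clippy plus greps), a complement to the review rather than a substitute:

```bash
# Cheap local pre-checks for an external Rust PR - a complement to, not a replacement for, the team
cargo clippy --all-targets -- -D warnings            # the zero-warning gate from the Rust scope
grep -rn --include="*.rs" "\.unwrap()" src/          # unwrap() calls to review outside #[cfg(test)]
grep -rniE '(api_key|secret|token) *= *"' src/       # naive scan for hardcoded credentials
```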
Success criteria:
```
✅ Convergence: 2+ agents flag same critical issue (high confidence)
✅ Unique insights: Each agent finds domain-specific issues (Rust/Security/Perf)
❌ False positives: <20% of findings are invalid
```
Real example (hypothetical external PR):
```
PR: Add new git filter command
Agent findings:
- Rust: 3 issues (unwrap in production code, missing .context(), non-idiomatic error handling)
- Security: 2 criticals (shell injection via user input, token leak in error message)
- Perf: 1 issue (regex compiled on every call, not lazy_static)

Convergence: Security + Rust both flagged missing input sanitization (high confidence)
Time: 6 min 40 sec
Verdict: ✅ Critical security issues caught, PR requires revision
```
3. Decision Matrix: When to Use
| Situation | Agent Teams? | Reason |
|---|---|---|
| Pre-release review (Guide) | ✅ YES | Multi-layer audit (accuracy + consistency + breaking) requires parallel perspectives |
| Simple typo fix | ❌ NO | Overkill, 1 agent = 10 sec, 3 agents = cost bloat |
| External PR (RTK) | ✅ YES | Security + Rust + Perf = blind spots detection, high-stakes review |
| Multi-file doc update (Guide) | ✅ YES | Content + Index + Consistency = zero broken links, parallel work |
| Landing sync check | ⚠️ MAYBE | Use agent teams if desync suspected, else ./scripts/check-landing-sync.sh faster |
| CHANGELOG update | ❌ NO | Sequential task (linear writing), no parallelization benefit |
| Single file edit (RTK) | ❌ NO | No coordination needed, sequential OK |
| Small README tweak (<50 lines) | ❌ NO | No cross-references, no complexity, single agent faster |
| Architecture design | ✅ YES | Multiple perspectives (frontend, backend, infra, security) reveal blind spots |
| Bug investigation | ⚠️ MAYBE | Simple bugs → NO, complex multi-component failures → YES |
Rule of Thumb
Use Agent Teams when:
- ✅ You’d naturally think “I should check X, Y, and Z”
- ✅ High stakes (production release, external contributor, security-sensitive)
- ✅ Multi-scope analysis needed (Rust scope + Security scope + Performance scope)
- ✅ Cross-file consistency matters (links, counts, versions sync)
- ✅ Parallel work possible (independent tasks, no sequential dependency)
Don’t use Agent Teams when:
- ❌ Simple task (<5 files, <100 lines, 1 domain)
- ❌ Sequential workflow (step B depends on step A result)
- ❌ Budget tight (3x tokens, reserve for high-value tasks)
- ❌ Write-heavy (many edits to same files = merge conflicts)
4. Minimal Workflow Template
Bash Template (reusable)
```bash
# 1. Setup (once per session)
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

# 2. Launch Claude
claude

# 3. Create team (prompt template)
> Create a team to [TASK]:
> - Agent 1 ([SCOPE/CONTEXT]): [SPECIFIC MISSION]
> - Agent 2 ([SCOPE/CONTEXT]): [SPECIFIC MISSION]
> - Agent 3 ([SCOPE/CONTEXT]): [SPECIFIC MISSION]
>
> [FILES/DIRECTORIES TO ANALYZE]

# 4. Observe (optional)
# Shift+Down to cycle through teammate outputs (in-process mode)

# 5. Synthesis
# Claude presents consolidated findings automatically

# 6. Act
# Fix critical findings, skip minor ones
```
Example Prompts (copy-paste ready)
Pre-Release Guide Audit
```
> Create a pre-release audit team:
> - Accuracy: Check all claims, stats, version numbers, external links in CHANGELOG.md, README.md, and guide/ultimate-guide.md
> - Consistency: Verify template count (count examples/ files), eval count (count docs/resource-evaluations/ files), guide lines (wc -l guide/ultimate-guide.md) match across README.md, guide/cheatsheet.md, machine-readable/reference.yaml
> - Breaking: Identify breaking changes vs v3.23.0 by analyzing CHANGELOG.md [Unreleased] section
```
Security PR Review (RTK)
```
> Review PR #42 with scope-focused analysis:
> - Rust Scope: Check ownership patterns, error handling (anyhow/thiserror), idiomatic code in modified files (context: src/**/*.rs)
> - Security Scope: Scan for injection risks, token leaks, input sanitization (context: auth, input handling code)
> - Performance Scope: Review allocations, async patterns, compiled regex (context: hot paths, loops)
```
Multi-File Doc Update
```
> Update documentation for new "Agent Teams Quick Start" feature:
> - Content Scope: Write guide/workflows/agent-teams-quick-start.md with overview, 4 patterns, decision matrix, metrics
> - Index Scope: Update guide/ultimate-guide.md (add reference section 9.20), machine-readable/reference.yaml (add entry), CHANGELOG.md (add "Added" entry)
> - Consistency Scope: Verify all cross-refs work, line numbers match, no broken links (context: all modified files)
```
Landing Sync Validation
```
> Validate guide/landing synchronization:
> - Guide scanner: Extract version from VERSION file, template count (find examples/ -type f | wc -l), eval count (find docs/resource-evaluations/ -name "*.md" | wc -l), guide lines (wc -l guide/ultimate-guide.md)
> - Landing scanner: Check /Users/florianbruniaux/Sites/perso/claude-code-ultimate-guide-landing/index.html and examples.html for version, counts, guide lines approximation
```
5. Success Metrics
How to Measure Agent Teams ROI
| Metric | Target | How to Measure |
|---|---|---|
| Convergence rate | >50% | Count findings flagged by 2+ agents / total findings. High convergence = high confidence. |
| Unique insights | Each agent ≥1 | Each agent must find at least 1 unique issue in their domain. If agent finds 0 unique = wasted tokens. |
| False positive rate | <20% | Count invalid findings / total findings. Too many false positives = poor prompts. |
| Time saving | 60-70% | Compare agent teams time vs sequential (estimate 3x single-agent time for 3 tasks). |
| Bug catch rate | >80% | Count critical bugs found by agents / total bugs found post-ship. High = effective prevention. |
Real Example: Pre-Release Review Test (2026-02-08)
```
Task: Pre-release audit for v3.23.1
Agents: 3 (accuracy-auditor, consistency-checker, breaking-checker)
Duration: 4 min 12 sec

Findings (raw): 45 total
├─ Accuracy: 12 (1 critical: LinkedIn URL malformed, 11 warnings)
├─ Consistency: 18 (2 criticals: template count desync, line numbers outdated)
└─ Breaking: 15 (0 breaking changes, 15 informational notes)

Findings (deduplicated): ~30 unique issues

Convergence analysis:
├─ High confidence (2-3 agents): 4 issues
│  ├─ Template count mismatch (consistency + accuracy)
│  ├─ Line numbers outdated (consistency + accuracy)
│  ├─ External link verification needed (accuracy + breaking)
│  └─ Version sync across files (all 3 agents)
└─ Unique insights:
   ├─ Accuracy: LinkedIn URL corruption (only this agent caught it)
   ├─ Consistency: TOC structure deviation (only this agent)
   └─ Breaking: Changelog format improvement suggestion (only this agent)

Metrics:
├─ Convergence rate: 4/30 = 13% (lower than target, but 4 criticals flagged by multiple agents = high confidence on what matters)
├─ Unique insights: 3/3 agents = 100% (each agent found unique issues in their domain)
├─ False positive rate: 2/30 = 6.6% (below 20% target ✅)
├─ Time saving: 4 min vs estimated 12 min sequential = 66% savings ✅
└─ Bug catch rate: 3 critical bugs caught that would've shipped = prevented production issues ✅

Verdict: ✅ High value for pre-release audits
```
How to Track Your Metrics
After each agent teams task:
- Count findings: Note raw findings per agent, then deduplicate
- Mark convergence: Which issues were flagged by 2+ agents?
- Check unique insights: Did each agent find at least 1 domain-specific issue?
- Verify false positives: How many findings were invalid/noise?
- Time comparison: Agent teams duration vs estimated sequential time
- Post-ship validation: Did agent teams catch bugs that would’ve shipped?
Keep a log (Markdown table in project docs):
| Date | Task | Agents | Duration | Findings | Convergence | Unique | False+ | Time Saved | Bugs Caught |
|---|---|---|---|---|---|---|---|---|---|
| 2026-02-08 | Pre-release v3.23.1 | 3 | 4m12s | 30 | 13% (4 critical) | 3/3 | 6.6% | 66% | 3 |

Adjust prompts if metrics fail:
- Low convergence (<30%) → Scopes too narrow, overlap context boundaries more
- No unique insights → Scopes too similar, diversify analysis angles
- High false positives (>20%) → Prompts too vague, add concrete criteria
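
One low-friction way to keep the metrics log current is appending a row from the shell right after each run. The log path below is hypothetical; point it at wherever your project keeps the table:

```bash
# Append one row per agent-teams run (column order matches the log table above)
LOG=docs/agent-teams-metrics.md   # hypothetical location - point this at your own log file
echo "| $(date +%F) | Pre-release v3.23.1 | 3 | 4m12s | 30 | 13% (4 critical) | 3/3 | 6.6% | 66% | 3 |" >> "$LOG"
```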
6. Limitations & Red Flags
What Agent Teams DON’T Do
| Limitation | What It Means | Mitigation |
|---|---|---|
| 3x tokens | Each agent = separate model call = 3x cost | Reserve for high-stakes tasks (pre-release, security PRs, not typo fixes) |
| Idle spam | Agents show “Idle” messages during coordination | Normal behavior, not a bug, ignore the spam |
| Experimental | Research preview = stability not guaranteed | Expect bugs, don’t rely on agent teams for production-critical workflows |
| Coordination overhead | 3-5 agents max, not 10 (coordination complexity grows) | Stick to 2-4 agents, avoid “team of 10” prompts |
| Context isolation | Agents don’t see each other’s discoveries (work independently) | Claude synthesizes findings, but agents can’t build on each other’s work mid-task |
Red Flags: When NOT to Use
❌ Simple task (<5 files, <100 lines, 1 domain)
- Example: Fix typo in README.md
- Why avoid: 3x tokens for 10-second task = waste
❌ Sequential workflow (step B depends on step A result)
- Example: Implement feature → Write tests → Deploy
- Why avoid: Agents work in parallel, can’t handle dependencies
❌ Budget tight (3x tokens, optimize cost)
- Example: Personal project with limited API credits
- Why avoid: Agent teams = luxury for high-stakes tasks, not daily workflow
❌ Write-heavy (many edits to same files)
- Example: Refactor entire codebase structure
- Why avoid: Merge conflicts, coordination overhead, agents stepping on each other
❌ Low-stakes review (internal PR, trusted contributor, simple change)
- Example: Team member fixes small bug
- Why avoid: Overkill, single-agent review faster + cheaper
When You SHOULD Use (despite costs)
✅ High-stakes (production release, security-sensitive, external contributor)
✅ Multi-domain (Rust + Security + Performance = blind spots)
✅ Parallel-friendly (independent tasks, no sequential dependencies)
✅ Consistency-critical (cross-file sync, counts, versions)
✅ Learning opportunity (understand blind spots, improve prompts)
Summary: Quick Reference
Setup (5 min once)
```bash
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
claude
> Are agent teams enabled?  # Verify
```
When to Use (decision rule)
YES if:
- High stakes + Multi-domain + Parallel-friendly
NO if:
- Simple task + Sequential + Budget tight
Patterns (4 ready-to-use)
- Pre-Release Review (Guide) → Accuracy + Consistency + Breaking
- Landing Sync (Guide) → Guide scanner + Landing scanner
- Multi-File Doc Update (Guide) → Content + Index + Consistency
- Security PR Review (RTK) → Rust + Security + Performance
Metrics (track ROI)
- Convergence: >50% (findings by 2+ agents)
- Unique insights: Each agent ≥1
- False positives: <20%
- Time saving: 60-70%
- Bug catch: >80%
Red Flags (avoid waste)
- ❌ Simple task (<5 files)
- ❌ Sequential workflow
- ❌ Budget tight
- ❌ Write-heavy (merge conflicts)
Next Steps
- Try first test (5 min setup + simple 2-agent task)
- Pick 1 pattern from your project (Guide or RTK)
- Measure metrics (convergence, unique insights, time saved)
- Adjust prompts based on results
- Read full doc for advanced patterns: Agent Teams
Got questions? Check the full documentation for:
- Architecture deep-dive (how git coordination works)
- Advanced use cases (15+ production scenarios)
- Troubleshooting (common issues + solutions)
- Best practices (team size, prompt design, conflict resolution)