Memory Systems
Memory Systems
Section titled “Memory Systems”Confidence: Tier 1 (native stack, well-documented tools) / Tier 2 (newer tools, vendor benchmarks) / Tier 3 (emerging patterns, unverified claims)
Last updated: May 2026
Related: Context Engineering | Architecture | Settings Reference | Agent Teams
Memory in Claude Code has no single canonical source — it spans native CC features, MCP servers, hooks, and coordination protocols. This page consolidates everything.
Table of Contents
Section titled “Table of Contents”- TL;DR: Three-Track Model
- Native Claude Code Memory Stack
- Cross-Session Tools (Single User)
- Team Sharing
- Multi-Agent Shared Memory
- Architecture Patterns
- Risks and Security
- Decision Frameworks
- Benchmarks and Evaluation
- Open Problems
1. TL;DR: Three-Track Model
Section titled “1. TL;DR: Three-Track Model”Memory for Claude Code splits into three tracks. The native stack (CLAUDE.md, MEMORY.md, Auto Memory, Auto Dream) covers 80% of solo-dev needs with zero external tooling. Cross-session tools (claude-mem, agentmemory, ICM) handle compression, semantic recall, and multi-tool portability for individuals. Team sharing has no dominant solution — the gap is structural, not a maturity question, because every leading tool was built single-user-first.
┌──────────────────────────────────────────────────────────────────────────┐│ CLAUDE CODE MEMORY — 3-TRACK MODEL │├──────────────────┬───────────────────────┬───────────────────────────────┤│ NATIVE STACK │ CROSS-SESSION TOOLS │ TEAM / MULTI-AGENT │├──────────────────┼───────────────────────┼───────────────────────────────┤│ CLAUDE.md │ claude-mem │ CLAUDE.md + .mcp.json (static)││ MEMORY.md │ agentmemory │ Notion MCP (zero infra) ││ Auto Memory │ ICM │ Mem0 cloud (free tier) ││ Auto Dream │ OpenMemory MCP │ mcp-memory-service + CF ││ │ Kairn │ Zep / Graphiti (temporal) ││ │ doobidoo (local) │ agentmemory (shared server) │├──────────────────┼───────────────────────┼───────────────────────────────┤│ Covers 80% of │ Cross-device recall, │ Shared context, multi-agent ││ solo needs. │ multi-tool portable, │ coordination, temporal graph. ││ Zero infra. │ semantic search, │ Gap: structurally unfilled. ││ │ knowledge graph. │ │├──────────────────┼───────────────────────┼───────────────────────────────┤│ WRITE: Auto │ WRITE: hooks (best) │ WRITE: shared MCP server ││ READ: linear │ READ: MCP search │ READ: semantic + tag + graph ││ DECAY: 200 lines │ DECAY: importance wt │ DECAY: TTL / temporal edges ││ TEAM: No │ TEAM: No │ TEAM: partial / evolving │└──────────────────┴───────────────────────┴───────────────────────────────┘Three findings that change how you should think about memory:
Hook-driven writes beat MCP-driven writes as a default. Automatic extraction at zero token cost (hook mode) consistently outperforms 20-50 tokens per voluntary store call (MCP mode). agentmemory wires this correctly on install; ICM ships the hook but leaves it unwired in settings.json.
Shared memory is an unguarded write surface. Any team member, or their compromised dependency, can inject instructions that every future agent session in the fleet reads. No tool documented here has a mitigation for this. See Section 7.1.
Hybrid retrieval (BM25 + Vector + Graph via RRF fusion) delivers 9 percentage points more R@5 than pure vector search on reproducible benchmarks. Most tools in this ecosystem still default to pure vector.
2. Native Claude Code Memory Stack
Section titled “2. Native Claude Code Memory Stack”2.1 CLAUDE.md: Three Levels of Memory
Section titled “2.1 CLAUDE.md: Three Levels of Memory”CLAUDE.md files are persistent instructions Claude reads at the start of every session — long-term memory of your preferences, conventions, and project context.
┌─────────────────────────────────────────────────────────┐│ MEMORY HIERARCHY │├─────────────────────────────────────────────────────────┤│ ││ ~/.claude/CLAUDE.md (Global - All projects) ││ │ ││ ▼ ││ /project/CLAUDE.md (Project - This repo) ││ │ ││ ▼ ││ /project/.claude/CLAUDE.md (Local - Personal prefs) ││ ││ All files merged additively. ││ On conflict: more specific file wins. ││ │└─────────────────────────────────────────────────────────┘In monorepos, parent directory CLAUDE.md files are automatically included, and child directory files are loaded on demand when Claude works with files there. For personal instructions not committed to Git: /project/.claude/CLAUDE.md (add to .gitignore) or /project/CLAUDE.md.local (auto-gitignored by convention).
Minimum viable CLAUDE.md:
# Project Name
Brief one-sentence description.
## Commands- `pnpm dev` - Start development server- `pnpm test` - Run tests- `pnpm lint` - Check code styleClaude automatically detects the tech stack, directory structure, and existing conventions from the code. Add more only when needed: non-standard package manager, custom commands, architecture decisions not obvious from the code, or project-specific gotchas that conflict with common patterns.
The discoverability filter: before adding any line, ask “Can the agent find this by reading the codebase?” If yes, don’t add it. Tech stack and testing conventions are discoverable. What earns a line: tooling gotchas (use uv, not pip), operational landmines (legacy/ is deprecated but imported by prod), and conventions that conflict with standard patterns.
The anchoring risk: every entry loads every session regardless of task. A stale entry referencing a deprecated library biases the agent toward it on every prompt. Treat periodic CLAUDE.md pruning as maintenance, not cleanup.
Research note (Feb 2026): ETH Zürich evaluated agent context files across 138 benchmarks and 12 repositories. Developer-written files improve task success by ~4%, but LLM-generated files (
/initoutput) reduce it by ~3%. Both add 20-23% inference cost. Mechanism: agents follow every instruction, including those irrelevant to the current task. Source: Gloaguen et al., arXiv 2602.11988
When your project grows, structure around three layers:
## WHAT — Stack & Structure- Runtime: Node.js 20, pnpm 9- Framework: Next.js 14 App Router- DB: PostgreSQL via Prisma ORM
## WHY — Architecture Decisions- App Router for RSC + streaming support- No Redux: server state via React Query, local state via useState
## HOW — Working Conventions- Run: `pnpm dev` | Test: `pnpm test` | Lint: `pnpm lint --fix`- Commits: conventional format (feat/fix/chore)CLAUDE.md as compounding memory: Boris Cherny (creator of Claude Code) described the pattern — you should never correct Claude twice for the same mistake. CLAUDE.md grows through actual errors caught during development, not preemptive documentation. 2.5K tokens of accumulated context built over months means new team members benefit from tribal knowledge instantly.
Full documentation: Memory Files (CLAUDE.md)
2.2 Auto Memory (v2.1.59+)
Section titled “2.2 Auto Memory (v2.1.59+)”Not to be confused with Claude.ai memory: Claude.ai’s memory feature (Aug 2025 for Teams, Oct 2025 for Pro/Max) stores preferences in your claude.ai account. Claude Code’s auto-memory is a local, per-project feature managed via
/memory.
Claude Code automatically saves useful context across sessions without manual CLAUDE.md editing.
How it works: Claude identifies key context during conversations (decisions, patterns, preferences) and stores it in .claude/memory/MEMORY.md (project) or ~/.claude/projects/<path>/memory/MEMORY.md (global). Automatically recalled in future sessions for the same project. Manage with /memory: view, edit, or delete stored entries.
File limits (enforced at read time):
| Limit | Value | Behavior when exceeded |
|---|---|---|
MEMORY.md max lines | 200 lines | Truncated at line 200, warning appended |
MEMORY.md max size | 25 KB | Truncated at last newline before 25 KB |
| Memory directory | 200 files | Oldest files pruned when limit is reached |
Line truncation applies first; byte truncation applies afterward if still over 25 KB. Both truncations append a warning comment. The Auto Dream consolidation process keeps MEMORY.md under 200 lines as part of its Phase 4 pruning.
What gets remembered: architectural decisions (“We use Prisma for database access”), preferences (“This team prefers functional components”), project-specific patterns (“API routes follow RESTful naming in /api/v1/”), known issues (“Don’t use package X due to version conflict with Y”).
Difference from CLAUDE.md:
| Aspect | CLAUDE.md | Auto-Memories |
|---|---|---|
| Management | Manual editing | Automatic via /memory |
| Source | Explicit documentation | Conversation analysis |
| Visibility | Git-tracked, team-shared | Local per-user, gitignored |
| Worktrees | Shared (v2.1.63+) | Shared across same repo (v2.1.63+) |
| Best for | Team conventions | Personal workflow patterns, discovered insights |
Recommended workflow: CLAUDE.md for team-level conventions everyone must follow. Auto-memories for personal discoveries and session context. When in doubt, document in CLAUDE.md for team visibility.
2.3 Auto Dream: Memory Consolidation
Section titled “2.3 Auto Dream: Memory Consolidation”Community-discovered feature, not in official Anthropic release notes. Sourced from reverse-engineering by Piebald-AI/claude-code-system-prompts. Controlled by a server-side feature flag (
tengu_onyx_plover). Rolling out gradually as of v2.1.83+.
After 20+ sessions without curation, auto-memory degrades: stale context, contradictory facts, relative dates that lose meaning (“yesterday’s refactor” means nothing two weeks later). Auto Dream runs as a background sub-agent between sessions to consolidate and prune. The system prompt says literally: “You are performing a dream — a reflective pass over your memory files.”
Built on Auto Memory (v2.1.59+). Theoretical foundation: “Sleep-time Compute” (UC Berkeley + Letta, April 2025), which showed pre-computing during idle periods reduces test-time compute by ~5x. The biological parallel is deliberate.
Trigger conditions (both must be met):
| Condition | Default |
|---|---|
| Time since last consolidation | ≥ 24 hours |
| Sessions since last consolidation | ≥ 5 |
A lock file prevents concurrent runs on the same project.
The 4 phases:
| Phase | Name | What happens |
|---|---|---|
| 1 | Orient | Lists memory directory, reads index, skims existing topic files |
| 2 | Gather Signal | Targeted grep of session JSONL transcripts — not exhaustive reads. “Look only for things you already suspect matter.” |
| 3 | Consolidate | Merges new signal, converts relative dates to absolute, removes contradicted facts, deduplicates |
| 4 | Prune & Index | Rebuilds MEMORY.md under 200-line cap, removes stale pointers, enforces index entry format |
Observed performance: One documented run consolidated 913 sessions in ~9 minutes. Typical result: MEMORY.md goes from 280+ lines to ~140 lines.
Safety constraints: Read-only on project source code. Write access limited to memory files only.
How to access: /memory shows AutoDream status and toggle. The /dream command is referenced in the UI but returns “Unknown skill: dream” on most installations (issues #38461, #38426 — fix tracked in PR #39299). Manual trigger via natural language instead:
"dream""auto dream""consolidate my memory files"Known quality gaps (issue #38493, March 2026):
| Gap | Problem | Example |
|---|---|---|
| Identity | Names memory files from session content, not project path | Rename my-old-project/ → orphaned files undetected |
| Accuracy | Writes unverified facts without reading source files | ”18 of 21 items resolved” written without checking |
| Transparency | No audit trail | Must compare folders before/after to understand a run |
When Auto Dream matters: projects where memory is written but never manually curated — active teams, long-running projects with 50+ sessions, or any context where MEMORY.md exceeds 150 lines with no cleanup. If you actively manage memory files, Auto Dream is largely redundant.
Community implementations: dream-skill (open-source replication) and ai-dream (alternative implementation).
Full documentation: Auto Dream in the Ultimate Guide
2.4 Agent Memory Frontmatter
Section titled “2.4 Agent Memory Frontmatter”Introduced in Claude Code v2.1.33 (February 2026), the memory frontmatter field gives subagents persistent, markdown-based knowledge that survives across sessions.
Each memory system in Claude Code serves a distinct purpose:
| System | Written by | Read by | Scope | Persists |
|---|---|---|---|---|
| CLAUDE.md | You (manually) | Main Claude + all agents | Project or global | Git-tracked |
| Auto-memory | Main Claude (automatic) | Main Claude only | Per-project per-user | Gitignored |
| Agent memory | The agent itself | That specific agent only | Configurable | Depends on scope |
Memory scopes:
| Scope | Storage | Version controlled | Best for |
|---|---|---|---|
user | ~/.claude/agent-memory/<name>/ | No | Cross-project learning |
project | .claude/agent-memory/<name>/ | Yes | Team-shared conventions |
local | .claude/agent-memory-local/<name>/ | No | Personal, project-specific |
Activate with one frontmatter line:
---name: code-reviewerdescription: Reviews code for quality and consistencytools: Read, Grep, Globmemory: user---When an agent starts, Claude Code reads the first 200 lines of MEMORY.md in the agent’s memory directory and injects them into the system prompt automatically.
Full documentation: Agent Memory Frontmatter (4.5)
2.5 Session vs Persistent Memory
Section titled “2.5 Session vs Persistent Memory”| Aspect | Session Memory | Auto-Memory | Persistent Memory |
|---|---|---|---|
| Scope | Current conversation | Across sessions, per-project | Across all sessions |
| Managed by | /compact, /clear | /memory (automatic) | write_memory() via Serena MCP |
| Lost when | Session ends | Explicitly deleted | Explicitly deleted |
| Requires | Nothing | Nothing (v2.1.59+) | Serena MCP server |
| Use case | Immediate context | Key decisions for next session | Structured architectural decisions |
Auto-compact and memory capture conflict: Claude Code auto-compacts when remaining context drops below ~6-7% of the context window. If you use a hook-based memory capture tool (claude-mem, agentmemory) that saves via PostToolUse, auto-compact can fire and discard conversation history before the save pipeline captures it.
Two mitigation paths:
// Option 1: disable auto-compact (you own the timing){ "autoCompactEnabled": false }# Option 2: configure your tool's save threshold below 80%# so capture runs before auto-compact would trigger2.6 Limits of the Native Stack
Section titled “2.6 Limits of the Native Stack”Five gaps that external tooling addresses:
- Local per machine: MEMORY.md and Auto-memories are not shared across devices or people.
- No semantic retrieval: Memory is loaded linearly from the top of the file. There is no search by meaning.
- Cross-project aggregation: What you learned about connection pooling in project A is not available in project B.
- Auto Dream triggers slowly: Requires ≥5 sessions and ≥24 hours. Fresh projects get no consolidation benefit.
- Agent teams don’t share session history: In dispatch mode, sub-agents have no shared context. CLAUDE.md is the only shared layer.
3. Cross-Session Tools (Single User)
Section titled “3. Cross-Session Tools (Single User)”3.1 claude-mem
Section titled “3.1 claude-mem”Repo: github.com/thedotmack/claude-mem | Stars: ~26.5K | License: AGPL-3.0 + PolyForm Noncommercial
Hooks into Claude Code lifecycle events (SessionStart, PostToolUse, Stop, SessionEnd). Records observations during sessions, semantically compresses them using an LLM worker (Bun, port 37777), stores results in SQLite plus optional Chroma vector search. Injects relevant context back at session start or when the agent faces a relevant task.
Key differentiator: compression and relevance filtering, not transcript storage. Distilled semantic summaries, injected only when relevant. Local-first. Uses Claude Haiku for summarization by default; configurable to Gemini 2.5 Flash Lite for cost reduction (up to 86% cheaper for heavy users).
Installation:
/plugin marketplace add thedotmack/claude-mem/plugin install claude-mem# Restart Claude CodeObservation types captured:
| Type | When | Example |
|---|---|---|
DISCOVERY | Reading/exploring code | ”Explored auth module, found JWT in validateToken()“ |
CHANGE | File edits | ”Modified session.middleware.ts: added refresh logic” |
FEATURE | New functionality | ”Implemented OAuth2 flow in auth.service.ts” |
BUGFIX | Bug corrections | ”Fixed null pointer in UserController.getById()” |
Progressive disclosure (3 layers to save tokens):
Layer 1: Search (50-100 tokens) → 5 relevant session summariesLayer 2: Timeline (500-1000 tokens) → chronological observation listLayer 3: Details (full context) → complete tool call + result~10x token reduction vs loading full session history.
Security warning: GET /api/settings returns API keys in plain text. Set host: "127.0.0.1" (not "0.0.0.0"). Never run on a shared machine.
Cost: ~$0.15 per 100 observations (AI summarization). Typical: $5-15/month for heavy users (100+ sessions). Switch to Gemini 2.5 Flash Lite to cut to ~$14/month at 400 sessions.
Hooks coexistence gotcha: claude-mem will overwrite your existing settings.json hooks arrays, not merge with them. Back up settings.json before installing, then manually verify both your existing hooks and the new claude-mem hooks are present.
Fail-open architecture (v9.1.0+): if the worker process is down, Claude Code continues normally — sessions simply aren’t captured until the worker restarts.
Limitations: CLI only, no cloud sync, AGPL-3.0 license requires compliance review for commercial use.
3.2 agentmemory
Section titled “3.2 agentmemory”Repo: github.com/rohitg00/agentmemory | Stars: 16,167 (May 2026, trending on Trendshift) | License: Apache 2.0 | Language: TypeScript
Memory server running on port 3111 with a real-time viewer on port 3113. Zero external dependencies — SQLite plus the in-house iii engine. No Cloudflare, no Neo4j, no Docker.
Installation:
npm install -g @agentmemory/agentmemoryagentmemoryagentmemory connect claude-code # auto-wires 12 hooks into settings.jsonHybrid search (the key differentiator): BM25 + Vector + Graph fused via Reciprocal Rank Fusion. Four-tier memory consolidation with decay and auto-forget. The agentmemory connect claude-code command auto-wires 12 hooks (PostToolUse, SessionStart, SessionEnd, and others) into settings.json — the hook problem that ICM leaves unsolved.
Benchmarks (reproducible via eval/README.md, published corpus coding-agent-life-v1):
| System | R@5 (LongMemEval-S) | R@10 | MRR |
|---|---|---|---|
| agentmemory | 95.2% | 98.6% | 88.2% |
| BM25-only | 86.2% | 94.6% | 71.5% |
| mem0 (their harness) | 68.5% | — | — |
| Letta (their harness) | 83.2% | — | — |
The corpus and adapter code are published so numbers can be verified independently. This is more than most tools offer. Note that competitor comparisons (mem0, Letta) are run by the tool’s author — methodology disclosed but not independently audited.
Token cost: 170K tokens/year ($10) vs ~650K for LLM-summarized summaries.
Multi-agent coordination: MCP + REST + leases + signals. Leases allow agents to lock a memory region while working on it. Signals allow agents to notify each other of state changes. “All agents share the same memory server.”
For team use: the server listens on port 3111. If deployed on a shared host and exposed internally, all team members’ agents point to the same instance. Not explicitly documented in the README for the team scenario, but the architecture supports it.
Agent support: Claude Code (native plugin + 12 hooks + MCP), Codex CLI (6 hooks + MCP), OpenCode (22 hooks), Cursor, Gemini CLI, Claude Desktop, Windsurf, Cline, Goose, Roo Code (MCP), Aider (REST API).
Honest caveats: Benchmarks use an in-house corpus alongside LongMemEval-S. The iii engine is not a third-party project. Production REX is limited given the tool’s recent release.
3.3 ICM (Infinite Context Memory)
Section titled “3.3 ICM (Infinite Context Memory)”Install: brew tap rtk-ai/tap && brew install icm | Version: 0.10.49 (May 2026) | License: Source-Available (free for teams ≤20)
DB location (macOS): ~/Library/Application Support/dev.icm.icm/memories.db
ICM has three distinct operating modes that are easy to conflate:
MCP mode (icm init --mode mcp): Exposes 31 MCP tools via stdio. Every tool call costs 20-50 tokens. The default configuration. Claude calls icm_memory_recall at session start and icm_memory_store when triggered by the MCP instructions (5 standard triggers: resolved error, architecture decision, discovered preference, completed significant task, ~20 tool calls without a store).
Hook mode (icm init --mode hook): Direct CLI, ~30ms latency, zero token overhead. Extraction is rule-based, fires automatically every N tool calls (default 15). The hook file ships with ICM at ~/.claude/hooks/icm-post-tool.sh after installation, but must be manually registered in ~/.claude/settings.json to activate. If not in settings, it never fires.
To activate hook mode, add to ~/.claude/settings.json:
{ "hooks": { "PostToolUse": [ {"matcher": "*", "hooks": [{"type": "command", "command": "~/.claude/hooks/icm-post-tool.sh"}]} ] }}This is the single highest-ROI configuration change for ICM users. Hook mode at zero tokens vs. 20-50 tokens per MCP call is the difference between memory that runs automatically and memory that only persists when Claude explicitly decides to call icm_memory_store.
Skills mode (icm init --mode skill): Installs /recall and /remember slash commands.
Two memory types:
Memories are episodic with temporal decay. Importance levels control decay rate: critical never decays, high decays slowly, medium follows normal decay, low fades quickly. Frequently recalled memories also decay more slowly. Consolidation triggers when a topic exceeds seven entries.
Memoirs are a permanent knowledge graph — concepts linked by typed relations (depends_on, contradicts, superseded_by, plus 6 others). Unlike memories, memoirs do not decay. This is the part of ICM that answers the “shared graph for multi-agent communication” use case: multiple agents writing to the same memoir create a persistent relationship map for the project.
icm memoir create -n "project-arch"icm memoir add-concept -m "project-arch" -n "auth-service"icm memoir add-concept -m "project-arch" -n "user-service"icm memoir link -m "project-arch" --from "auth-service" --to "user-service" -r depends-onicm memoir export -m "project-arch" -f asciiCross-tool reach: the same SQLite database is shared across 17 tools after icm init — Claude Code, Gemini CLI, Cursor, Codex, Windsurf, VS Code, Zed, Amp, Continue.dev, Aider, and others.
Benchmarks (vendor-claimed, unverified independently): 100% LongMemEval recall. 5% factual accuracy without ICM vs. 68% with ICM in their evaluation. 29-40% fewer turns in multi-session tests.
Team use: not designed for it. Each machine has its own local SQLite. No shared server mode exists in ICM.
3.4 Kairn
Section titled “3.4 Kairn”Repo: github.com/kairn-ai/kairn | License: MIT | Language: Python
Long-term project memory organized as a knowledge graph with automatic biological decay — stale information expires on its own, preventing context pollution.
| Feature | doobidoo | Kairn |
|---|---|---|
| Storage model | Semantic embeddings | Knowledge graph |
| Memory decay | No | Yes (biological) |
| Typed relationships | Tags only | depends-on / resolves / causes |
| Auto-pruning stale info | No | Yes |
Solutions persist ~200 days; workarounds persist ~50 days. 18 MCP tools covering graph ops, project tracking, experience management, and an intelligence layer (full-text search, confidence routing, cross-workspace patterns).
When Kairn makes sense: long-running projects where workarounds from months ago become noise; when causality matters (“this breaks because of that”); teams wanting automatic knowledge hygiene without manual cleanup.
pip install kairn# or: git clone https://github.com/kairn-ai/kairn && pip install -e .3.5 doobidoo mcp-memory-service
Section titled “3.5 doobidoo mcp-memory-service”Repo: github.com/doobidoo/mcp-memory-service | Version: v10.0.2 | Stars: ~1.6K (May 2026) | License: MIT
Semantic memory with cross-session search and multi-client support (13+ AI tools). Moved from ChromaDB to SQLite-vec at v8.0.0 (breaking change). Default backend is sqlite_vec.
pip install mcp-memory-servicepython -m mcp_memory_service.scripts.installation.install --quickKey difference from Serena: Serena uses key-value memory (requires knowing the key). doobidoo uses semantic search (retrieve_memory("what did we decide about auth?")) — finds by meaning.
Storage backends:
| Backend | Usage | Best for |
|---|---|---|
sqlite_vec (default) | Local, lightweight | Solo dev, single machine |
cloudflare | Cloud, multi-device sync | Team sharing, multi-device |
hybrid | Local fast + cloud background sync | Best of both |
Known issues (from GitHub history, May 2026):
SQLite concurrent access: When multiple clients access the same database simultaneously, the default busy_timeout=5000ms is too short. Produces intermittent errors in team scenarios. Fix: MCP_MEMORY_SQLITE_PRAGMAS=busy_timeout=15000,cache_size=20000 in your .env. The v8.9.0 installer sets this for new installs; upgrades require manual configuration.
ChromaDB migration (v8.0.0): Breaking change when upgrading from v7.x. Must migrate rather than upgrade directly.
Consolidation was blocked: The consolidation system was non-functional for a period due to missing update_memory() implementations. Fixed in October 2025 commits. If consolidation seems not to run, verify you are on a post-October 2025 build.
Backend mismatch: When MCP server and HTTP dashboard use different MCP_MEMORY_STORAGE_BACKEND values, they access different databases. Always verify /api/health/detailed shows the expected backend.
OAuth scope errors: Users report 403 Forbidden after OAuth flow due to token scope issues. OAuth is disabled by default (MCP_OAUTH_ENABLED=false).
Full documentation (team config): See Section 4.2
3.6 OpenMemory MCP
Section titled “3.6 OpenMemory MCP”Repo: github.com/mem0ai/mem0/tree/main/openmemory | Dashboard: http://localhost:3000
User-owned, local-first, private memory layer. Standardized 4-tool interface:
add_memoriessearch_memorylist_memoriesdelete_all_memories
Works across Claude Desktop, Cursor, Windsurf, and Cline. The design goal: a single portable personal memory layer across all AI tools. If you use multiple AI assistants, OpenMemory MCP provides shared persistence without per-tool setup.
The 4-tool surface is the correct architectural answer to the “53-tool memory MCP” problem (see agentmemory). Every tool schema loads into context each turn — a minimal surface costs minimal overhead.
3.7 Other Notable Tools
Section titled “3.7 Other Notable Tools”| Tool | Stars | Key feature | Limitation |
|---|---|---|---|
| claude-memory-compiler | ~1.1K | Human-readable daily logs + concept KB | PostSession only, no real-time |
| mcp-memory (Puliczek) | — | Cloudflare D1 + Vectorize, cross-device | Network latency per retrieval |
| Claude Continuity | — | Zero-config, full-state fidelity (not compression) | [UNVERIFIED — repo handle not confirmed] |
| MemPalace | ~52.6K [UNVERIFIED] | Wings/rooms/drawers hierarchical index | 96.6% R@5 claim [UNVERIFIED] |
| Memori | 14.7K | Memory neighborhoods, team design goal | CC adapter (memori-mcp) at 3 stars |
| codebase-memory-mcp | 2.5K | AST tree-sitter graph, 155 languages, structural | Code structure only, not episodic |
| Pieces for Developers | — | 9-month rolling capture, IDE + browser + terminal | Individual-only, commercial |
| claude-session-continuity-mcp | — | 24 tools, auto error-to-solution pipeline | [UNVERIFIED — not confirmed by internal sources] |
Memori (MemoriLabs) deserves special attention: 14,730 stars, LLM-agnostic, converts execution history into structured persistent state via a graph + vector hybrid. Team-scoped “memory neighborhoods” are a design goal, not an afterthought. The gap is the CC adapter — memori-mcp is a separate repo with 3 stars and sparse documentation. Worth tracking.
codebase-memory-mcp (DeusData) solves a different problem: not “what did we discuss” but “what is the structure of this codebase.” Claims sub-millisecond queries. Can index the Linux kernel (~28M lines, 75K files) in ~3 minutes. Zero config Claude Code integration via MCP. The “99% fewer tokens” claim needs independent verification; the structural approach is sound.
3.8 Master Comparison Table
Section titled “3.8 Master Comparison Table”| Tool | Storage | Search | Auto hooks | Team | Token cost/yr |
|---|---|---|---|---|---|
| claude-mem | SQLite + Chroma (opt) | Semantic | Yes (auto-register) | No | ~$60-180 |
| agentmemory | SQLite + iii engine | BM25+Vec+Graph RRF | Yes (12, auto-wired) | Shared server | ~$10 |
| ICM | SQLite | Vector + BM25 | Ships unwired | No (local only) | 20-50 t/call |
| Kairn | Knowledge graph | Full-text + semantic | No | No | Low |
| doobidoo | SQLite-vec / CF D1 | Semantic | No | CF backend required | Low |
| OpenMemory MCP | Local SQLite | Vector | No | No | Minimal |
| Memori | Graph + Vec hybrid | Graph + Vec | No | Design goal | — |
| codebase-memory-mcp | AST graph | Structural | No | Filesystem share | Minimal |
| Pieces | Local proprietary | ML | No | No (privacy-first) | Daemon overhead |
4. Team Sharing
Section titled “4. Team Sharing”The native CLAUDE.md provides shared static context (versioned, zero infra). For shared dynamic memory that evolves during sessions, no single dominant solution exists as of May 2026.
4.1 The Trinity: CLAUDE.md + .mcp.json + /skills
Section titled “4.1 The Trinity: CLAUDE.md + .mcp.json + /skills”The most adopted team pattern in 2025-2026. Three files in the repo root, all versioned in Git:
<repo>/├── CLAUDE.md # coding standards, guardrails, agent behavior├── .mcp.json # MCP server configs (DBs, ticketing, memory servers)└── .claude/ └── skills/ # shared workflows as markdown skillsEvery developer who clones and runs Claude Code in that repo inherits the full stack with no per-developer setup. Rules like “never modify CI files,” “always run tests after changes,” and “use read-only DB access” are centralized here.
CLAUDE.md best practice: keep under 2,000 words, link to details rather than embed them, prioritize signal density.
This covers shared standards and conventions with zero infrastructure. For shared dynamic memory (decisions made during sessions, context discovered by agents), you need an additional layer.
4.2 doobidoo + Cloudflare (Team Mode)
Section titled “4.2 doobidoo + Cloudflare (Team Mode)”The recommended production path for teams using doobidoo is the Cloudflare backend (Vectorize + D1 + Workers AI), which requires a Cloudflare account with appropriate access enabled.
{ "mcpServers": { "memory": { "command": "memory", "args": ["server"], "env": { "MCP_MEMORY_STORAGE_BACKEND": "hybrid", "MCP_HTTP_ENABLED": "true", "MCP_HTTP_PORT": "8000", "CLOUDFLARE_API_TOKEN": "your-token", "CLOUDFLARE_ACCOUNT_ID": "your-account-id", "MCP_MEMORY_SQLITE_PRAGMAS": "busy_timeout=15000,cache_size=20000", "MCP_OAUTH_ENABLED": "false" } } }}Gotchas for team deployment: SQLite shared across machines requires WAL mode configuration and the busy_timeout fix above. Sharing a plain SQLite file over a network filesystem (NFS, SMB) will corrupt the database — SQLite locking assumes local fcntl. If multiple developers write from different machines, use the Cloudflare backend. This is not free and not zero-config.
4.3 Mem0 Cloud MCP
Section titled “4.3 Mem0 Cloud MCP”Repo: github.com/mem0ai/mem0 | Stars: ~55K (full repo) | Free tier: yes
Cloud-hosted MCP server. No local installation, no infrastructure to manage. One-line setup:
npx mcp-add --name mem0-mcp --type http \ --url https://mcp.mem0.ai/mcp \ --clients "claude code,cursor,windsurf"Each team member adds the same URL to their .mcp.json. Memory scope (individual vs. shared) is controlled by user_id: use a project-scoped ID to give the team a common pool.
11 MCP tools: add_memory, search_memories, get_memories, update_memory, delete_memory, delete_all_memories, plus entity management and event tracking. Wildcards (user_id: "*") search across all users.
When to use: quickest path to working shared memory layer. Zero configuration gap between team members.
When not to use: codebases containing proprietary logic or client information. Data lives on Mem0’s infrastructure — real consideration for privacy-sensitive projects.
4.4 Zep / Graphiti
Section titled “4.4 Zep / Graphiti”Repo: github.com/getzep/graphiti | Stars: ~24.5K | Pricing: self-hosted (Neo4j, free) or cloud ($25-$475/month)
If the requirement is not just “remember context” but “understand how context changed over time,” Graphiti is the only option with a clear answer. It builds a temporal knowledge graph on top of Neo4j: nodes are entities, edges are relationships, and every edge carries a validity window. A query like “what did the team decide about the auth service in March, before the direction change in April?” is answerable. Standard semantic vector search cannot do this.
9 MCP tools. Graph traversal enables entity-centric retrieval, relationship chains, and temporal constraints.
Bitemporal modeling: the technique comes from data warehousing (Snodgrass, 1999). Every other tool in this survey treats memory as a flat snapshot — they cannot answer historical questions about superseded decisions.
Setup requires Neo4j: a full database service. Self-hosting is reasonable for teams with existing infrastructure. The cloud tier removes that constraint at $25-475/month.
4.5 Notion MCP
Section titled “4.5 Notion MCP”If your team already uses Notion, Claude can read and write pages via MCP tools (mcp__notion__*). Decisions, notes, and context are stored as human-readable pages. Zero new infrastructure, works today, has a web UI for non-Claude access.
This is Pattern B (Option 1) from the implementation patterns: no additional configuration beyond the Notion MCP server already in your .mcp.json.
4.6 Team-Native Tools (2026)
Section titled “4.6 Team-Native Tools (2026)”Three tools designed specifically for multi-user scenarios, all released in 2026. Insufficient community feedback for confident recommendations.
Memlord (memlord.com): self-hosted, full user isolation, shared workspaces with invite links. Multi-user is a first-class architectural feature, not a configuration option. Stars unknown, recently launched.
Pindoc (community listing, PulseMCP): “Code-pinned team memory for AI coding agents.” Typed artifacts, MCP-native, self-hosting. Released April 2026. [UNVERIFIED — repo handle not independently confirmed]
Artel (NicolasPrimeau): “Self-hosted shared memory and coordination mesh for AI agent fleets, with semantic search, task management, and async coordination.” Released May 2026, 210 stars at listing time. [UNVERIFIED — too new for community feedback]
Memlord has the clearest multi-user model from available information. Artel targets agent fleet coordination most directly. Neither is ready for production CC use without personal evaluation.
4.7 Why the Team Gap Is Structural
Section titled “4.7 Why the Team Gap Is Structural”Section 10 documents the gap explicitly. The conventional read is “the market will mature.” Six architectural barriers explain why iteration on existing tools will not close it:
Single-tenant DNA: Tools start as solo-developer projects. Their core abstraction is a SQLite file on a laptop. Retrofitting multi-user requires rewriting the storage model, auth model, and deployment model simultaneously. Cheaper to start a new tool, which is exactly what Memlord, Pindoc, and Artel did.
OAuth 2.1 is months of engineering: Implementing PKCE, refresh tokens, scoped permissions, and multi-IdP support is 3-6 months for a memory tool. Most authors don’t have that runway. doobidoo has the flag (MCP_OAUTH_ENABLED) but ships it disabled.
Privacy and sharing can’t coexist cheaply: Privacy-first tools mean local SQLite, which means no sharing. Cloud-shared tools mean vendor data residency. End-to-end encrypted shared memory with client-side key management — the answer to both requirements — is not implemented by any tool here.
No team taxonomy standard: Mem0 uses user_id with wildcards. Memlord uses workspaces. ICM has no team primitive. Zep has graph-scoped permissions. No interoperability is possible without a standard.
No enterprise buyer yet: Memory tools are bought by individuals at $0-25/month. Team memory needs SOC 2, SSO, audit logs, and an admin console. Only Zep cloud at $475/month is attempting this.
Claude Code’s multi-agent model is young: Agent teams are a recent feature. Building shared memory on an evolving substrate has been correctly deferred by teams who could have done it.
The correct bet today: CLAUDE.md + .mcp.json (static, zero infra) for shared standards, plus Notion MCP or Mem0 cloud (zero new infra) for dynamic shared memory. Plan a migration to agentmemory on a shared host or Memori when those tools’ team stories mature.
5. Multi-Agent Shared Memory
Section titled “5. Multi-Agent Shared Memory”5.1 MCP as Blackboard
Section titled “5.1 MCP as Blackboard”Classical AI blackboard architecture applied to agent swarms. Multiple agents read and write a shared semantic store via MCP tool calls — each agent deposits observations, other agents query by semantic search or tag.
Tools implementing this pattern: shared-memory-mcp (evalops), Agent-MCP (rinadelph), agentmemory (port 3111 as shared server), Mem0 cloud (shared user_id).
Limitation: MCP was designed for user-to-agent context access, not agent-to-agent coordination. The A2A protocol (Section 5.5) exists precisely because MCP alone is insufficient for this use case. Using MCP as a coordination bus is a workaround.
5.2 Neo4j Agent Memory
Section titled “5.2 Neo4j Agent Memory”Three-layer graph architecture with Python package:
| Layer | Content | Purpose |
|---|---|---|
| Short-term | Conversation session state, working memory | Active task context |
| Long-term | Extracted entities, relationships, facts | Persistent knowledge |
| Reasoning | Tool call traces, decision steps, why | Audit trail for handoffs |
The reasoning layer is the key differentiator. When agent team B takes over from agent team A, they can read exactly which tools were called, what was tried, and why decisions were made. Background entity extraction jobs continuously transform short-term into long-term. [Demo: youtube.com/watch?v=qMV64p-4Deo — UNVERIFIED]
Security note: the reasoning memory layer stores exactly what an attacker who breaches the memory server wants — the why behind every action. No tool documents encryption-at-rest for reasoning traces. See Section 7.4.
5.3 Leases and Signals (agentmemory)
Section titled “5.3 Leases and Signals (agentmemory)”Beyond shared storage, agentmemory implements coordination primitives:
- Leases: an agent can lock a memory region while working on it, preventing write conflicts with other agents on the same task.
- Signals: agents can notify each other of state changes asynchronously — effectively pub/sub on top of the memory server.
The architecture is a correct direction: distributed locks + pub/sub. Real concerns not yet documented: lease expiry policy when an agent crashes mid-lease, signal delivery guarantees (at-most-once vs. at-least-once), and behavior when agents run on different machines with clock skew. The primitives are sound; the implementation needs the same scrutiny a distributed-systems library would receive.
5.4 ICM Memoirs for Inter-Agent Graphs
Section titled “5.4 ICM Memoirs for Inter-Agent Graphs”In a local multi-agent setup where all sub-agents share the same host, ICM Memoirs create a persistent, typed relationship graph that survives session boundaries:
# Agent A writes during its sessionicm memoir create -n "project-arch"icm memoir add-concept -m "project-arch" -n "auth-service"icm memoir link -m "project-arch" --from "auth-service" --to "api-gateway" -r depends-on
# Agent B reads in its session (same host, same ICM DB)icm memoir export -m "project-arch" -f asciiThis works because ICM’s SQLite database is shared across all 17 configured tools on the same machine. Cross-machine sharing requires copying the database file manually — there is no sync protocol.
5.5 A2A Protocol
Section titled “5.5 A2A Protocol”Source: developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability
Complement to MCP. MCP = user to agent (context access). A2A = agent to agent (collaboration). Agents request help, share findings, and coordinate via function calls. The handoff mechanism allows one agent to transfer control to another without user intervention.
Adoption accelerated across frameworks in 2025-2026. The relationship: MCP handles tool access to external systems; A2A handles agent-to-agent communication and task delegation. Memory sharing between agents should eventually be designed around A2A primitives rather than bolted on top of MCP.
6. Architecture Patterns
Section titled “6. Architecture Patterns”Five patterns crystallize from this survey. They don’t compose well together — most tools implement exactly one.
6.1 Hook-Driven Lifecycle Compression
Section titled “6.1 Hook-Driven Lifecycle Compression”Tools: claude-mem, claude-memory-compiler, agentmemory
The agent session is treated as an event stream. Hooks fire at SessionStart/PostToolUse/Stop; an extractor pulls signal from noise; a compressed representation is injected at the next session. This is event-sourcing with LLM-based projection.
The winning pattern in 2026 because it requires zero agent cooperation — the agent doesn’t need to decide to remember. Automatic extraction at zero token cost beats voluntary MCP store calls for any high-frequency scenario. For the write path, this should be the default.
6.2 MCP as Blackboard
Section titled “6.2 MCP as Blackboard”Tools: shared-memory-mcp, Agent-MCP, agentmemory, Mem0 cloud
Multiple agents read and write a shared semantic store via MCP tool calls. Classical AI blackboard pattern resurrected. The architectural limit: MCP was designed for user-to-agent context access, not agent-to-agent coordination. Works in practice; the design is a workaround.
6.3 Hierarchical Episodic + Permanent Graph
Section titled “6.3 Hierarchical Episodic + Permanent Graph”Tools: ICM (Memories + Memoirs), Neo4j Agent Memory (3-layer: short-term / long-term / reasoning)
Two-tier or three-tier separation: ephemeral data decays on importance and recency, structural knowledge persists permanently. ICM’s Memoirs with typed relations (depends_on, contradicts, superseded_by) are the cleanest minimal implementation. The Neo4j three-layer model with dedicated reasoning memory is the most complete — but also the most operationally expensive.
6.4 Temporal Knowledge Graph
Section titled “6.4 Temporal Knowledge Graph”Tools: Zep / Graphiti (only)
Every edge carries a validity window. A query like “what did we believe before the April pivot?” is answerable. Every other tool in this survey treats memory as a flat snapshot — they cannot answer historical questions about superseded decisions.
The bitemporal model from data warehousing (Snodgrass, 1999) applied to agent memory. Required when organizations pivot and need a point-in-time view of past decisions. Overkill for straightforward session continuity.
6.5 Hybrid Retrieval Fusion (RRF)
Section titled “6.5 Hybrid Retrieval Fusion (RRF)”Tools: agentmemory (BM25 + Vector + Graph)
Three indexes fused via Reciprocal Rank Fusion: score = sum(1 / (k + rank_i)) across each retriever, sorted by fused score. The lift from BM25 + Vector + Graph over BM25-only: 9 percentage points R@5 (95.2% vs 86.2%) on reproducible benchmarks. That translates to a 65% reduction in retrieval misses.
RRF is ~20 lines of code. The cost is maintaining three indexes in sync. Most tools optimized for install simplicity skip this. Pure vector search loses to hybrid in practice — the embedding smears exact-match signal (function names, error strings) across the vector space.
6.6 What’s Missing Architecturally
Section titled “6.6 What’s Missing Architecturally”No tool implements write-ahead conflict resolution for multi-writer scenarios. No tool implements memory provenance (which agent wrote this, based on what evidence, linked to which session). No tool implements causal consistency across distributed agents. The field is at “shared key-value with timestamps” sophistication.
The tools winning in 2026 (claude-mem, agentmemory) won because they solved lifecycle integration via hooks, not because they solved coordination. The next generation will solve coordination.
6.7 Storage Backend Right-Sizing
Section titled “6.7 Storage Backend Right-Sizing”| Scenario | Correct backend | Avoid |
|---|---|---|
| Solo, single machine | SQLite-vec | ChromaDB (fragile write locking) |
| Solo, multi-device | Cloudflare D1 + Vectorize | SQLite over network filesystem |
| Small team 2-10, self-hosted | Postgres + pgvector | SQLite with multiple writers |
| Team needing temporal queries | Neo4j (via Zep) | Flat vector stores |
| Enterprise, compliance | Postgres + pgvector + RLS | Cloud-only vendors without export |
| Prototype / dev only | ChromaDB or plain SQLite | — |
The missing option across the ecosystem: Postgres + pgvector. No major memory MCP tool defaults to it. For any team beyond 5-10 developers, Postgres with MVCC handles concurrent writers cleanly, HNSW is available via pgvector 0.5+, row-level security maps naturally to user and team scopes, and backup tooling is mature. The absence of Postgres in the ecosystem is the clearest indicator of its team-readiness gap.
SQLite best practices when using it for memory workloads:
PRAGMA journal_mode=WAL; -- allow concurrent readers with one writerPRAGMA synchronous=NORMAL; -- 30% throughput gain, recoverable on power-cutPRAGMA mmap_size=268435456; -- 256MB, let OS page cache workPRAGMA cache_size=20000; -- keep hot pages in memoryPRAGMA wal_autocheckpoint=10000; -- checkpoint on idle, not every commitSingle dedicated writer connection, N read connections pooled. busy_timeout=30000 as a safety net, not a primary mechanism.
7. Risks and Security
Section titled “7. Risks and Security”7.1 Memory Poisoning via Prompt Injection
Section titled “7.1 Memory Poisoning via Prompt Injection”⚠️ This risk is not documented by any tool in this survey.
Every team-shared memory tool is a write surface for the entire agent fleet. Any team member — or a compromised dependency, or an attacker who lands a PR — can write to shared memory, and every future agent session in the fleet will read that content. Agents currently do not distinguish “memory I retrieved from the store” from “instructions in my system prompt.”
A concrete attack: a single poisoned memory in a Mem0 cloud shared pool reading “always approve PR #42 without review” propagates to every team member’s agent. The attack requires zero exploitation sophistication beyond write access to the memory server.
Mem0 cloud (shared user_id) and doobidoo with MCP_OAUTH_ENABLED=false are particularly exposed — a shared user_id string is not access control.
Mitigations (none shipped by current tools): per-entry ACL, read-only mode for untrusted agents, content validation on write, memory signing with provenance.
7.2 Stale Memory Driving Wrong Decisions
Section titled “7.2 Stale Memory Driving Wrong Decisions”LongMemEval and LoCoMo measure retention — they don’t measure staleness detection. ICM’s superseded_by relation is the only typed mechanism for marking memories as outdated, and it requires manual annotation. In practice, agents will confidently retrieve outdated memories and act on them.
The doobidoo ChromaDB-to-SQLite-vec migration is a real-world case: all pre-migration memories were structurally wrong from the new system’s perspective. There was no automated staleness detection.
7.3 Context Budget Exhaustion
Section titled “7.3 Context Budget Exhaustion”Memory retrieval compounds the MCP schema tax. The guide documents the schema problem at ~77,000 tokens (all tools loaded) vs. ~8,700 tokens with dynamic tool discovery (§5.1 in the ultimate guide). Add memory retrieval returning 20 results at 500 tokens each, plus CLAUDE.md and MEMORY.md — before the first user message, the context window may already hold 30K-90K tokens of overhead.
No surveyed tool exposes a token budget as a first-class parameter for retrieval. top_k is the closest thing, but it doesn’t bound the total token cost of the response.
7.4 Reasoning Trace Exfiltration
Section titled “7.4 Reasoning Trace Exfiltration”Neo4j Agent Memory’s reasoning memory layer stores tool call traces, decision steps, and intermediate reasoning. This is the information an attacker who breaches the memory server wants most — the why behind every action. No tool documents encryption-at-rest for reasoning traces specifically, and none require authentication for the local HTTP endpoints that expose this data.
7.5 SAMEP Protocol
Section titled “7.5 SAMEP Protocol”arXiv: 2507.10562v1 — [NOTE: this paper ID corresponds to July 2025. It appears plausible but the specific arXiv ID has not been independently verified. Treat as UNVERIFIED until confirmed.]
SAMEP (Secure Agent Memory Exchange Protocol) proposes: AES-256-GCM encryption per memory fragment, role-based access controls (not all agents see all memories), comprehensive audit trails, and temporal access controls (auto-restrict after project completion). Designed for enterprise environments where memory contains sensitive IP. Still primarily academic.
7.6 Risk Matrix
Section titled “7.6 Risk Matrix”| Risk | Likelihood | Impact | Documented? | Mitigation |
|---|---|---|---|---|
| Memory poisoning via prompt injection | High | Critical | No | Per-entry ACL, read-only mode for untrusted agents |
| Stale memory driving wrong decisions | High | High | Partial | superseded_by relations, TTL policies, manual curation |
| Context budget blown by memory overhead | Medium | High | Partial | Token-budgeted retrieval, top-K limits, dynamic tool discovery |
| SQLite corruption on concurrent multi-machine writes | High (if naive) | High | Yes (doobidoo §3.5) | WAL mode + single-writer thread, or switch to Postgres |
| Vendor lock-in (Mem0 cloud, Zep cloud) | Low short-term | Medium | Partial | OpenMemory 4-tool standard as abstraction layer |
| Reasoning trace exfiltration | Low | Critical | No | Encryption-at-rest per entry, auth-gated read for reasoning layer |
| Benchmark gaming — adoption on unaudited claims | High | Medium | Yes | Require reproducible eval corpus before adopting |
| settings.json hook overwrite on claude-mem install | High (on install) | Medium | Yes | Backup settings.json before install, verify hook arrays after |
8. Decision Frameworks
Section titled “8. Decision Frameworks”8.1 Decision Flowchart
Section titled “8.1 Decision Flowchart”8.2 Decision Matrix
Section titled “8.2 Decision Matrix”| Scenario | Recommended tool | Notes |
|---|---|---|
| Solo, cross-session recall, auto-hooks | agentmemory | 16K stars, 12 hooks auto-wired, 0 external deps |
| Solo, cross-session recall, proven | claude-mem | ~26.5K stars, hooks-based, AGPL-3.0 |
| Solo, want human-readable KB | claude-memory-compiler | Daily logs + concept articles |
| Solo, portable cross-tool memory | OpenMemory MCP | mem0ai, local dashboard, 4 tools |
| Solo, cross-tool + knowledge graph | ICM (+ wire the hook) | 17 tools, local SQLite, typed relations |
| Solo, code structure memory | codebase-memory-mcp | AST tree-sitter, 155 languages |
| Solo, capture everything everywhere | Pieces | 9-month rolling context, individual-only |
| Team, shared rules and standards | CLAUDE.md + .mcp.json + /skills | Versioned in repo, zero infra |
| Team, simplest shared memory | Mem0 cloud MCP | 1 command, free tier, no infra |
| Team, shared memory + existing Notion | Notion MCP | Already collaborative, no new infra |
| Team, semantic memory, full control | mcp-memory-service + Cloudflare | Paid infra, most capable, SQLite issues with concurrent writes |
| Team, temporal knowledge graph | Zep / Graphiti | Neo4j, $25-475/mo cloud or self-hosted |
| Multi-agent, shared state | shared-memory-mcp or Mem0 cloud | Tag-based or semantic retrieval |
| Multi-agent, coordination (leases+signals) | agentmemory on shared host | Port 3111, no external deps |
| Multi-agent, reasoning traces | Neo4j Agent Memory | 3-layer graph, Python, open-source |
| Multi-agent, inter-agent graph | ICM Memoirs or Zep/Graphiti | Permanent typed relations |
| Enterprise, compliance, security | SAMEP + Zep cloud | AES-256-GCM, audit trails [SAMEP UNVERIFIED] |
| Long-term memory framework | Letta or LangMem | Memory-first architectures |
8.3 Implementation Patterns
Section titled “8.3 Implementation Patterns”Pattern A: Solo Developer, Local Only
~/.claude/CLAUDE.md # global preferences<repo>/CLAUDE.md # project-specific<repo>/MEMORY.md # auto-written by Claude (maintained by Auto Dream)+ claude-mem plugin # hooks-based compression and injectionCovers 95% of cross-session recall needs. Zero external infrastructure.
Pattern B: Team, Shared Rules + Shared Memory
Static layer (versioned in repo):
<repo>/CLAUDE.md # team coding standards + guardrails<repo>/.mcp.json # MCP server configs (shared, versioned)<repo>/.claude/skills/ # shared workflow skillsDynamic layer — three options by increasing infrastructure cost:
- Notion MCP: zero new infra if team already uses Notion. Human-readable pages.
- Mem0 Cloud MCP: one line per developer in
.mcp.json. Free tier. Data lives on Mem0’s servers. - doobidoo + Cloudflare: full semantic search + cloud persistence. Requires Cloudflare Vectorize, D1, and Workers AI. Add
MCP_MEMORY_SQLITE_PRAGMAS=busy_timeout=15000if multiple clients hit the same DB.
Pattern C: Multi-Agent, Shared Knowledge Base
Shared MCP memory server (agentmemory on shared host, or Mem0 cloud)+ Agent team lead assigns tasks with memory keys in task descriptions+ Sub-agents query memory at task start via semantic search+ Neo4j Agent Memory for reasoning traces (tool call audit trail)Pattern D: Cross-Tool Portable Memory (Individual)
OpenMemory MCP (local dashboard at localhost:3000)Compatible: Claude Code + Claude Desktop + Cursor + Windsurf + Cline4-tool standard API, user-owned, never leaves the machinePattern E: Cross-Tool Memory with Knowledge Graph
ICM (icm serve or hook mode)- Memories: episodic, importance-weighted decay- Memoirs: permanent typed relations (depends_on, contradicts, superseded_by)- Transcripts: verbatim session replay for search- 17 tools share the same local SQLiteActivate hook mode by adding ~/.claude/hooks/icm-post-tool.sh to PostToolUse in ~/.claude/settings.json. This enables automatic extraction every 15 tool calls at zero token cost.
9. Benchmarks and Evaluation
Section titled “9. Benchmarks and Evaluation”Academic Evaluation Frameworks
Section titled “Academic Evaluation Frameworks”| Benchmark | What it measures |
|---|---|
| LongMemEval | Retention over time, graceful degradation when capacity is constrained |
| LoCoMo | Coherence across very long conversation histories (thousands of messages) |
| BEAM | Efficiency: performance vs. computational overhead tradeoff |
| MemoryAgentBench | Multi-turn: retention, recall, contextual adaptation, reasoning trace fidelity |
Numbers reported by tool vendors against these benchmarks should be read skeptically. Evaluation methodology varies. Ask for the corpus, adapter code, and reproduction steps.
Representative Results (May 2026)
Section titled “Representative Results (May 2026)”| System | R@5 (LongMemEval-S) | Source | Reproducible? |
|---|---|---|---|
| agentmemory | 95.2% | Tool author, own harness | Yes (eval/README.md) |
| BM25-only baseline | 86.2% | agentmemory harness | Yes |
| Letta | 83.2% | agentmemory harness | Partial |
| mem0 | 68.5% | agentmemory harness | Partial |
| MemPalace | 96.6% | Vendor | [UNVERIFIED] |
| mem0 (vendor-claimed) | >94% | Vendor | [APPROXIMATE] |
SimpleMem (arXiv:2601.02553) reports 43.24% F1 on LoCoMo with 30x fewer tokens. EvolveMem (arXiv:2605.13941) reports 54.3% F1 on LoCoMo via an evaluate-diagnose-propose-guard loop. Both IDs correspond to 2026; treat as recent preprints.
Memory Decay Models
Section titled “Memory Decay Models”Three documented approaches with different theoretical grounding:
ICM’s importance-weighted decay maps to Anderson’s ACT-R base-level activation formula: B_i = ln(sum(t_j^{-d})) where t_j is time since each retrieval and d is a decay constant (typically 0.5). ICM’s “frequently recalled memories decay more slowly” is this formula simplified.
Auto Memory’s 200-line hard limit is a UX heuristic, not a memory model. Works for solo use; fails when a load-bearing memory was written three months ago.
Zep’s temporal edge windows implement the bitemporal model (Snodgrass, 1999). Right answer when point-in-time queries are a genuine requirement. Overhead is real: every retrieval becomes a temporal range scan.
The production-correct model combines all three: importance score at write time, exponential recency decay at read time, reinforcement on retrieval, and consolidation triggered by token budget rather than a fixed entry count. The MemGPT/Letta paging model (hot memories in context, warm in working store, cold in archive) is the most engineering-useful framing.
10. Open Problems
Section titled “10. Open Problems”These gaps have no tooling answer as of May 2026:
Cross-project memory: “What did I learn about connection pooling in project A that applies to project B?” Current solutions scope memory to a single repo or user. ICM gets closest (shared DB across tools) but remains single-user.
Memory versioning: Zep/Graphiti is the only tool with bitemporal modeling. Every other tool’s store is a snapshot. There is no Git equivalent for memory — no branches, no merges, no rollback. If a major refactor invalidates a year of accumulated memory, the only recovery path is delete_all_memories.
Conflict resolution between agents: When two agents store contradictory memories about the same entity, the implicit policy is last-write-wins. CRDTs, vector clocks, and operational transforms are standard in collaborative editing and absent from every memory system here.
Memory provenance: Which agent wrote this memory? Based on what observation? Linked to which session and tool call? No tool surfaces this. When memory leads to a bad decision, post-mortem analysis is impossible.
Cost attribution per feature: Mem0 reports ~10-15% token overhead per turn. agentmemory reports ~170K tokens/year. These are aggregate figures. No tool provides a per-feature cost breakdown. Teams cannot answer “is the memory layer worth its token cost on this project?”
Schema migration: codebase-memory-mcp depends on tree-sitter parsers across 155 languages; ICM Memoirs use user-defined typed relations. Neither tool documents what happens when a relation type is renamed or a parser updates its AST output.
The core diagnosis: the industry is treating memory as a storage problem when it is a coordination problem. Storage is largely solved — SQLite-vec, pgvector, Neo4j, and Cloudflare D1 all work at the required scale. The unsolved problems are concurrent writes, conflict resolution, causal consistency, prompt-injection-resistant writes, and auditable provenance. The tools winning today won because they solved lifecycle integration via hooks. The tools that win next will solve coordination.
Sources:
- Internal research document:
claudedocs/research/memory-management-multi-sessions-2026.md(May 2026, with Perplexity deep search and GitHub analysis)- claude-mem GitHub
- agentmemory GitHub
- ICM GitHub + ICM README
- doobidoo/mcp-memory-service GitHub
- Zep/Graphiti GitHub
- mem0ai/mem0 GitHub
- OpenMemory MCP
- Sleep-time Compute paper (arXiv:2504.13171, UC Berkeley + Letta, April 2025)
- CLAUDE.md impact study (Gloaguen et al., arXiv 2602.11988, Feb 2026)
- Auto Dream system prompts
- Claude Code release notes: v2.1.33, v2.1.59, v2.1.63, v2.1.83+