memtomem v0.3.0 · memtomem-stm v0.1.29 · Apache 2.0

AI That Never Forgets —
Memory Infrastructure for Agents

Stop re-explaining your project to your AI every session. memtomem turns your notes, docs, and code into a searchable memory that any MCP-compatible agent can use — across sessions, across agents, all on your machine.

$ uv tool install 'memtomem[all]'
87 LTM MCP Tools
8 Compression Strategies
13 STM Tools
Why Do Agents Forget?
Tool integration (MCP), safety (Guardrails), and observability (Langfuse) are mature — but the memory layer still has no standard.
01

No Memory Between Sessions

All context is lost when a session ends. Architecture decisions, coding patterns, and debugging history must be re-explained every time.

02

Memory Silos Between Agents

Knowledge from Claude Code can't be carried over to Cursor. Each agent is trapped in its own isolated memory silo.

03

Limitations of Existing Solutions

Existing memory systems return results only when the agent explicitly searches, are locked to a specific runtime, and offer a single LTM layer.

memtomem Solves This
memtomem applies the working-memory / long-term-memory model from cognitive science to agents: short-term compression and long-term search run as independent MCP servers.

Searchable Long-Term Memory

Index your notes, docs, and code with mm index, then find them with hybrid search — BM25 keyword and dense-vector semantic search fused via RRF, so exact identifiers and meaning-based queries both land. Markdown, code, and structured files are chunked by structure, and re-indexing only re-embeds the chunks that changed.
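The fusion step can be illustrated with a minimal Reciprocal Rank Fusion sketch. This is not memtomem's implementation: the function name rrf_fuse, the constant k = 60 (a commonly used default), and the sample chunk ids are assumptions chosen for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids with Reciprocal Rank Fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for the same query from each retriever.
bm25_hits = ["auth.py#3", "README.md#1", "notes.md#7"]
dense_hits = ["notes.md#7", "auth.py#3", "design.md#2"]

fused = rrf_fuse([bm25_hits, dense_hits])
print(fused)
```

A chunk that both retrievers rank well rises above one that only a single retriever found, which is how keyword matches and meaning-based matches can share one result list.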

Memory Across Sessions & Agents

Memory doesn't vanish when a session ends. Namespaces split each agent's private space from a shared space, and the session workflow lets one agent pick up what another already worked out. Claude Code, Cursor, and Codex all share one memory store.
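The private/shared split can be pictured with a toy in-memory store. Everything here is hypothetical: the MemoryStore class, the "agent:<name>" and "shared" namespace names, and the sample notes are illustration only and say nothing about memtomem's actual schema or tool names.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy store keyed by namespace (hypothetical layout)."""
    items: dict = field(default_factory=dict)

    def add(self, namespace, text):
        self.items.setdefault(namespace, []).append(text)

    def visible_to(self, agent):
        # An agent reads its own private namespace plus the shared one.
        private = self.items.get(f"agent:{agent}", [])
        shared = self.items.get("shared", [])
        return private + shared


store = MemoryStore()
store.add("agent:claude-code", "scratch: retry the flaky test before bisecting")
store.add("shared", "decision: paginate the orders endpoint with cursors")

print(store.visible_to("cursor"))       # shared note only
print(store.visible_to("claude-code"))  # private scratch note plus shared note
```

In this sketch a second agent sees the shared decision without seeing the first agent's scratch notes, which is the behavior the namespace split is meant to provide.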

Context Gateway

Sync skills, commands, and subagents from one canonical .memtomem/ source out to every AI runtime. Handle per-row Sync/Import in the mm web Simple view, move artifacts across projects and tiers with mm context copy/move, and bulk-sync many projects with mm context sync --all-projects.

Proactive Surfacing

Your agent doesn't have to ask. STM observes the MCP calls it proxies and surfaces relevant memories at the right moment. Each surfaced memory carries an id, so your agent can rate or invalidate individual items.

Token-Aware Compression

Every MCP tool response passes through STM before it reaches your agent. When a response exceeds the context budget, one of 8 strategies is auto-selected by content type to cut tokens. The active query shapes the budget — relevant sections get more room, so the information your agent needs is preserved.
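One way to picture a query-shaped budget is to give every section a small floor and share the rest by relevance. The sketch below swaps in a plain word-overlap score as a stand-in for whatever relevance measure STM actually uses; allocate_budget, the floor parameter, and the sample sections are all assumptions.

```python
def allocate_budget(sections, query, total_tokens, floor=20):
    """Split a token budget across sections by query-term overlap.

    Every section gets `floor` tokens; the remainder is shared in
    proportion to how many query terms each section contains.
    """
    terms = set(query.lower().split())
    overlaps = [len(terms & set(text.lower().split())) for text in sections.values()]
    remainder = max(total_tokens - floor * len(sections), 0)
    total_overlap = sum(overlaps)
    budget = {}
    for name, overlap in zip(sections, overlaps):
        if total_overlap:
            share = remainder * overlap // total_overlap
        else:
            # No section matches the query: split the remainder evenly.
            share = remainder // len(sections)
        budget[name] = floor + share
    return budget


sections = {
    "auth": "token refresh logic and login retry handling",
    "billing": "invoice totals and tax rounding",
    "changelog": "minor fixes",
}
budget = allocate_budget(sections, "login token refresh", total_tokens=360)
print(budget)
```

With this query the auth section receives nearly all of the budget while the unrelated sections keep only their floor, so the response stays within the limit without dropping any section entirely.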

Fully Local & Private

SQLite + ONNX under the hood: no GPU, no external API, no cloud. SQLite files are kept at mode 0600 (readable and writable by the owner only), secret-looking responses are never cached, and secrets are never pushed out. The STM proxy is fully reversible with mms eject, so there's no lock-in.
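For readers unfamiliar with the 0600 convention, here is a minimal POSIX-only sketch of creating a SQLite file that only its owner can read or write. It is not taken from memtomem's code: open_private_db, the file name memory.db, and the chunks table are assumptions.

```python
import os
import sqlite3
import stat
import tempfile


def open_private_db(path):
    """Create the database file with owner-only permissions, then open it."""
    # O_CREAT with mode 0o600 sets permissions at creation time (POSIX);
    # the umask can only remove bits, so the result is never wider than 0600.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.close(fd)
    # Tighten a file that already existed with looser permissions.
    os.chmod(path, 0o600)
    return sqlite3.connect(path)


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "memory.db")
    conn = open_private_db(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, body TEXT)")
    conn.commit()
    mode = stat.S_IMODE(os.stat(db_path).st_mode)
    conn.close()
    print(oct(mode))
```

Creating the file with the restrictive mode before SQLite touches it avoids a window in which the database exists with default, wider permissions.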

Two-Layer Architecture
The STM proxy and the LTM server connect over MCP and provide surfacing and compression to agents transparently.
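[Architecture diagram. Core AI runtimes and others (Claude Code, Codex CLI, Antigravity CLI, other MCP clients) connect over MCP to memtomem-stm, the STM Proxy, whose pipeline is CLEAN → COMPRESS → SURFACE → (INDEX). The proxy connects over MCP to memtomem, the LTM Server (87 MCP Tools), for surfacing, and to upstream MCP servers (filesystem, GitHub, …).]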
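The proxy's stage order, CLEAN → COMPRESS → SURFACE → (INDEX), can be sketched as a chain of small functions. Only the stage names and their order come from this page; what each stage does below (whitespace cleanup, plain truncation, word-overlap matching) is a simplified stand-in, and every function name and data shape is an assumption.

```python
def clean(resp):
    # Assumed behaviour: normalise whitespace in the upstream response.
    return {**resp, "text": " ".join(resp["text"].split())}


def compress(resp, max_chars=40):
    # Stand-in for token-aware compression: plain truncation.
    text = resp["text"]
    if len(text) > max_chars:
        text = text[:max_chars].rstrip() + " [truncated]"
    return {**resp, "text": text}


def surface(resp, memories):
    # Attach memories that share a word with the query; each keeps its id.
    words = set(resp["query"].lower().split())
    hits = [m for m in memories if words & set(m["text"].lower().split())]
    return {**resp, "surfaced": hits}


def run_pipeline(resp, memories, index=None):
    resp = surface(compress(clean(resp)), memories)
    if index is not None:  # INDEX is optional, as the parentheses suggest
        index.append(resp["text"])
    return resp


memories = [
    {"id": "m1", "text": "deploy uses blue-green rollout"},
    {"id": "m2", "text": "lint runs on commit"},
]
response = {
    "query": "deploy status",
    "text": "  service   api\n deployed  at 12:00  with 3 replicas and no errors reported ",
}
index = []
result = run_pipeline(response, memories, index=index)
print(result["text"])
print([m["id"] for m in result["surfaced"]])
```

Because the agent only ever sees the return value of the last stage, cleanup, compression, and surfacing stay invisible to it, which matches the "transparent" framing of the proxy.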
Core Runtimes & Compatibility
Optimized for the main CLI runtimes, while staying MCP-native for the rest.
Claude Code
Codex CLI
Antigravity CLI
Other MCP clients
Framework adapters
Docs & Tutorials
From getting started to advanced usage.

Get Started Now

No GPU. No external services. One uv install is all you need.