# engram-rs

Persistent memory for AI agents, organized by time and space.
Memory engine for AI agents. Two axes: time (three-layer decay & promotion) and space (self-organizing topic tree). Important memories get promoted, noise fades, related knowledge clusters automatically.
Most agent memory is a flat store — dump everything in, keyword search to get it back. No forgetting, no organization, no lifecycle. engram-rs adds the part that makes memory actually useful: the ability to forget what doesn't matter and surface what does.
Single Rust binary, one SQLite file, zero external dependencies. No Python, no Redis, no vector DB — curl | bash and it runs. ~10 MB binary, ~100 MB RSS, single-digit ms search latency.
## Quick Start
```sh
# Install (interactive; prompts for embedding provider config)
curl -fsSL https://raw.githubusercontent.com/kael-bit/engram-rs/main/install.sh | bash

# Store a memory
curl -X POST http://localhost:3917/memories \
  -d '{"content": "Always run tests before deploying", "tags": ["deploy"]}'

# Recall by meaning
curl -X POST http://localhost:3917/recall \
  -d '{"query": "deployment checklist"}'

# Restore full context (session start)
curl http://localhost:3917/resume
```
## What It Does
### Three-Layer Lifecycle
Inspired by the Atkinson–Shiffrin memory model, engram-rs manages memories across three layers by importance:

```
Buffer (short-term) → Working (active knowledge) → Core (long-term identity)
      ↓                        ↓                          ↑
  eviction             importance decay           LLM quality gate
```
- **Buffer**: Entry point for all new memories. Temporary staging; evicted when importance falls below a threshold
- **Working**: Promoted via consolidation. Never deleted; importance decays at different rates by kind
- **Core**: Promoted through the LLM quality gate. Never deleted
### LLM Quality Gate
Promotion isn't rule-based guesswork — an LLM evaluates each memory in context and decides whether it genuinely warrants long-term retention.
```
Buffer  → [LLM gate: "Is this a decision, lesson, or preference?"] → Working
Working → [sustained access + LLM gate] → Core
```
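A minimal sketch of what such a gate could look like. The prompt wording and the `ask_llm` callback are hypothetical stand-ins, not engram-rs's actual prompts or API:

```python
def llm_gate(memory_text: str, ask_llm) -> bool:
    """Hypothetical promotion gate: the LLM judges whether a memory is a
    decision, lesson, or preference worth long-term retention."""
    prompt = (
        "Is the following memory a decision, lesson, or preference "
        "that warrants long-term retention? Answer yes or no.\n\n"
        + memory_text
    )
    # Treat any answer starting with "yes" as approval to promote.
    return ask_llm(prompt).strip().lower().startswith("yes")
```

The point is that promotion is a contextual judgment call by the model, not a keyword or length heuristic.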
### Automatic Decay
Decay is activity-driven — it only fires during active consolidation cycles, not wall-clock time. If the system is idle, memories stay intact.
Exponential decay follows the Ebbinghaus forgetting curve — fast at first, then long-tail. Memories never fully vanish (floor = 0.01), remaining retrievable under precise queries. When a memory is recalled, it gets an activation boost, strengthening frequently-used knowledge.
| Kind | Decay rate | Half-life | Use case |
|---|---|---|---|
| `episodic` | Fastest | ~35 epochs | Events, experiences, time-bound context |
| `semantic` | Medium | ~58 epochs | Knowledge, preferences, lessons (default) |
| `procedural` | Slowest | ~173 epochs | Workflows, instructions, how-to |
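The decay behavior above can be sketched as a simple exponential with a retrieval floor. This is an illustration only; `FLOOR`, `HALF_LIFE`, and the function name are assumptions, not the engine's actual internals:

```python
# Per-epoch retention derived from the half-lives in the table above.
HALF_LIFE = {"episodic": 35, "semantic": 58, "procedural": 173}
FLOOR = 0.01  # memories never fully vanish


def decay(importance: float, kind: str, epochs: int = 1) -> float:
    """Ebbinghaus-style exponential decay: fast at first, long-tail later,
    clamped at a floor so the memory stays retrievable."""
    rate = 0.5 ** (1 / HALF_LIFE[kind])
    return max(FLOOR, importance * rate ** epochs)


# After 35 epochs an episodic memory has halved, while a procedural
# memory of the same age retains most of its importance.
decay(1.0, "episodic", 35)    # 0.5
decay(1.0, "procedural", 35)  # still well above 0.5
```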
## Algorithm Visualizations
- **Sigmoid score compression.** Raw scores are mapped through a sigmoid function, approaching 1.0 asymptotically. High-relevance results remain distinguishable instead of being crushed into the same value.
- **Ebbinghaus forgetting curve.** Exponential decay with kind-differentiated rates: episodic memories fade fastest, procedural slowest. The floor at 0.01 means memories never fully vanish; they remain retrievable under precise queries.
- **Kind × layer weight bias.** Additive biases adjust memory weight by type and layer. Procedural+core memories rank highest, episodic+buffer lowest, but the spread stays bounded so no single combination dominates.
- **Reinforcement signals.** Repetition and access bonuses follow logarithmic saturation. Early interactions matter most; later ones contribute diminishing returns, discriminating between "used occasionally" and "used daily".
## Tools (3)

- `store_memory`: Stores a new memory with associated content and tags.
- `recall_memory`: Recalls memories based on semantic meaning and query relevance.
- `resume_context`: Restores full context for a session start.

## Environment Variables

- `EMBEDDING_PROVIDER` (required): Configures the embedding provider for memory processing.

## Configuration

```json
{
  "mcpServers": {
    "engram-rs": {
      "command": "engram-rs",
      "args": []
    }
  }
}
```