# Alaya
A memory engine for AI agents that remembers, forgets, and learns.
Alaya (Sanskrit: alaya-vijnana, "storehouse consciousness") is an embeddable Rust library. One SQLite file. No external services. Your agent stores conversations, retrieves what matters, and lets the rest fade. The graph reshapes through use, like biological memory.
```rust
let store = AlayaStore::open("memory.db")?;
store.store_episode(&episode)?;            // store
let results = store.query(&query)?;        // retrieve
store.consolidate(&provider)?;             // distill knowledge
store.transform()?;                        // dedup, LTD, discover categories
store.forget()?;                           // decay what's stale
let cats = store.categories(None)?;        // emergent ontology
store.purge(PurgeFilter::Session("s1"))?;  // cascade delete + tombstones
```
## The Problem
Most AI agents treat memory as flat files. OpenClaw writes to `MEMORY.md`. Claudesidian writes to Obsidian. Hand-rolled systems write to JSON or Markdown. It works at first.
Then the files grow. Context windows fill. The agent dumps everything into the prompt and hopes the LLM finds what matters.
The cost is measurable. OpenClaw injects ~35,600 tokens of workspace files into every message, 93.5% of which is irrelevant (#9157). Heavy users report $3,600/month in token costs. Community tools like QMD and memsearch cut 70-96% of that waste by replacing full-context injection with ranked retrieval (Levine, 2026).
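A back-of-envelope model makes the gap concrete. It reuses the ~35,600 tokens/message and 93.5%-irrelevant figures from the report above; the message volume and per-token price are illustrative assumptions, not measurements:

```rust
// Rough monthly token cost of injecting context into every message.
// Assumptions (not from the report): 200 messages/day, $3 per million
// input tokens. Ranked retrieval keeps only the ~6.5% relevant share.
fn monthly_cost(tokens_per_msg: f64, msgs_per_day: f64, usd_per_mtok: f64) -> f64 {
    tokens_per_msg * msgs_per_day * 30.0 * usd_per_mtok / 1_000_000.0
}

fn main() {
    let full = monthly_cost(35_600.0, 200.0, 3.0);
    let ranked = monthly_cost(35_600.0 * 0.065, 200.0, 3.0);
    println!("full-context: ${full:.0}/mo, ranked: ${ranked:.0}/mo");
}
```

Under these assumptions, full-context injection runs roughly $640/month against about $42 for ranked retrieval, consistent with the 70-96% reductions community tools report.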
The structure problem compounds the cost. `MEMORY.md` conflates decisions, preferences, and knowledge into one unstructured blob. Users independently invent `decision.md` files, `working-context.md` snapshots, and 12-layer memory architectures to compensate. Monday you mention "Alice manages the auth team." Wednesday you ask "who handles auth permissions?" The agent retrieves both memories by text similarity but cannot connect them (Chawla, 2026).
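Connecting those two memories takes an association graph rather than text similarity. A toy spreading-activation sketch illustrates the idea (node names, edge weights, and the decay constant are invented for illustration; this is not Alaya's API):

```rust
use std::collections::HashMap;

// Toy spreading activation: activation injected at a retrieved node
// flows along weighted edges, surfacing indirectly related memories.
fn spread(
    edges: &HashMap<&str, Vec<(&str, f64)>>,
    seed: &str,
    decay: f64,
    hops: usize,
) -> HashMap<String, f64> {
    let mut activation: HashMap<String, f64> = HashMap::new();
    activation.insert(seed.to_string(), 1.0);
    let mut frontier = vec![(seed.to_string(), 1.0)];
    for _ in 0..hops {
        let mut next = Vec::new();
        for (node, a) in frontier {
            for &(nbr, w) in edges.get(node.as_str()).into_iter().flatten() {
                let contrib = a * w * decay; // attenuate per hop
                *activation.entry(nbr.to_string()).or_insert(0.0) += contrib;
                next.push((nbr.to_string(), contrib));
            }
        }
        frontier = next;
    }
    activation
}

fn main() {
    let mut edges = HashMap::new();
    // "Alice" links to the auth team; the team links to permissions.
    edges.insert("alice", vec![("auth-team", 0.9)]);
    edges.insert("auth-team", vec![("auth-permissions", 0.8)]);
    let act = spread(&edges, "alice", 0.5, 2);
    // Two hops connect Alice to auth permissions even though no
    // single memory mentions both.
    println!("{:?}", act.get("auth-permissions"));
}
```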
## How Alaya Solves It
| Problem | File-based memory | Alaya |
|---|---|---|
| Token waste | Full-context injection (~35K tokens/message) | Ranked retrieval returns only top-k relevant memories |
| No structure | Everything in one file (users invent `decision.md` workarounds) | Three typed stores: episodes, knowledge, preferences |
| No forgetting | Files grow until you manually curate | Bjork dual-strength decay: weak memories fade, strong ones persist |
| No associations | Flat files, no links between memories | Hebbian graph strengthens through co-retrieval; spreading activation finds indirect connections |
| Brittle preferences | Agent-authored summary, easily drifts | Preferences emerge from accumulated impressions, crystallize at threshold |
| LLM required | Can't function without one | Optional. No embeddings? BM25-only. No LLM? Episodes accumulate. Every feature works independently |
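The dual-strength decay row can be sketched in a few lines. This is a toy model in the spirit of Bjork's storage/retrieval-strength distinction; the formulas and constants are illustrative, not Alaya's internals:

```rust
// Toy dual-strength memory: slow-moving storage strength plus a
// fast-decaying retrieval strength, per Bjork's "new theory of disuse".
struct Memory {
    storage: f64,   // grows on each recall, never decays
    retrieval: f64, // decays with time; high storage slows the decay
}

impl Memory {
    fn tick(&mut self, days: f64) {
        // Higher storage strength => slower forgetting.
        let rate = 0.3 / (1.0 + self.storage);
        self.retrieval *= (-rate * days).exp();
    }
    fn recall(&mut self) {
        self.storage += 1.0;
        self.retrieval = 1.0;
    }
    fn forgotten(&self) -> bool {
        self.retrieval < 0.05
    }
}

fn main() {
    let mut used = Memory { storage: 0.0, retrieval: 1.0 };
    let mut stale = Memory { storage: 0.0, retrieval: 1.0 };
    for _ in 0..5 {
        used.recall(); // frequently retrieved => storage strength builds
    }
    for _ in 0..30 {
        used.tick(1.0);
        stale.tick(1.0);
    }
    // After 30 days the often-recalled memory stays retrievable;
    // the never-recalled one drops below the forgetting threshold.
    println!("used: {:.3}, stale: {:.3}", used.retrieval, stale.retrieval);
}
```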
## Getting Started
### MCP Server (recommended for agents)
The fastest way to add Alaya memory to any MCP-compatible agent (Claude Desktop, Claude Code, Cursor, Cline, etc.):
#### Via npm (no Rust toolchain needed)
Add to your Claude Code config (`~/.claude/claude_code_config.json`):

```json
{
  "mcpServers": {
    "alaya": {
      "command": "npx",
      "args": ["-y", "alaya-mcp"]
    }
  }
}
```

#### Tools (4)

- `store_episode`: Stores a conversation episode into the local memory database.
- `query`: Retrieves relevant memories based on a query.
- `consolidate`: Distills knowledge from stored episodes.
- `forget`: Applies decay to stale memories.