codewatch-memory
Code-aware observational memory MCP server for AI coding assistants.
AI coding assistants (Claude Code, Cursor, Windsurf) suffer from session amnesia — context is lost on compaction or between sessions. Existing solutions are either framework-locked (Mastra's Observational Memory requires their agent framework) or simplistic (mcp-memory-keeper stores key-value pairs without intelligent compression).
codewatch-memory is an MCP-native server that implements observational memory specifically for coding workflows. It uses cheap LLMs (Groq, Gemini Flash) as Observer/Reflector agents to compress conversation context into a structured observation log, stored in SQLite, scoped per git branch.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Claude Code / Cursor / Windsurf │
│ (MCP Client) │
└──────┬──────────────────┬──────────────────────┬────────────────┘
│ │ │
[Stdout/Stdin] [Hook: Stop/PreCompact] [Hook: UserPromptSubmit
(MCP Tools) (save observations) /SessionStart]
│ │ (recall context)
v v │
┌────────────┐ ┌──────────────────┐ v
│ MCP Server │ │ Hook Subprocess │ ┌──────────────────┐
│(stdio mode)│ │ --hook mode │ │ Recall Subprocess│
└─────┬──────┘ └────────┬─────────┘ │ --recall mode │
│ │ └────────┬─────────┘
│ 5 Tools │ Transcript │ FTS5 search
│ observe/recall/ │ parsing & │ keyword extraction
│ reflect/ │ observation │ (no LLM, ~50ms)
│ get_session_info│ extraction │
│ switch_context │ │
v v v
┌──────────────────────────────────────────────────────┐
│ Agents Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ Observer │ │Reflector │ │ Categorizer │ │
│ │ extract │ │ compress │ │ classify │ │
│ └────┬─────┘ └────┬─────┘ └──────┬───────┘ │
└───────┼──────────────┼───────────────┼───────────────┘
│ │ │
v v v
┌──────────────┐ ┌────────────────────────────┐
│ LLM Provider │ │ SQLite Database │
│ Groq (free) │ │ sessions / observations │
│ Google │ │ reflections / tasks │
│ OpenAI │ │ FTS5 full-text search │
└──────────────┘ └────────────────────────────┘
How It Works
The Core Loop
Capture → Categorize → Store → Search → Compress
Three modes of operation:
- Hook mode (--hook) — Fires automatically after every AI response (Stop event) and before context compaction (PreCompact). Reads the last 20 messages from the conversation transcript, sends them to a cheap LLM, and extracts structured observations. No manual tool calls needed.
- Recall mode (--recall) — Fires on UserPromptSubmit (every user prompt) and SessionStart (session start/resume/compact). Searches stored observations via FTS5 keyword extraction and injects relevant context. Pure database queries, no LLM — adds ~50ms latency.
- MCP server mode — Runs as an MCP server with 5 tools that the AI assistant can call directly for manual observation, recall, and compression.
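To use the hook and recall modes, the two subprocess entry points have to be registered with the client's hook system. As a rough, unverified sketch, a Claude Code `.claude/settings.json` fragment wiring up the four events named above might look like this (the exact settings schema belongs to Claude Code and may differ):

```json
{
  "hooks": {
    "Stop": [
      { "hooks": [{ "type": "command", "command": "npx -y codewatch-memory --hook" }] }
    ],
    "PreCompact": [
      { "hooks": [{ "type": "command", "command": "npx -y codewatch-memory --hook" }] }
    ],
    "UserPromptSubmit": [
      { "hooks": [{ "type": "command", "command": "npx -y codewatch-memory --recall" }] }
    ],
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "npx -y codewatch-memory --recall" }] }
    ]
  }
}
```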
Observation Flow
Hook fires on Stop/PreCompact
→ Read last 20 transcript messages (JSONL)
→ Skip if < 50 tokens (trivial turn)
→ Skip if already processed (hash dedup)
→ Send to Observer LLM agent
→ Extract observations with priority + category
→ Store each in SQLite (FTS5 auto-indexed)
→ Update session stats
→ Check auto-reflect threshold (default 40K tokens)
→ If over: run Reflector with escalating compression
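The two skip checks at the top of this flow (token threshold and hash dedup) can be sketched as a small gate in front of the Observer call. This is an illustrative sketch, not the server's code: the function names, the 4-characters-per-token estimate behind the 50-token cutoff, and the in-memory seen set are assumptions (the real server persists its state in SQLite).

```typescript
import { createHash } from "node:crypto";

// Rough token estimate: ~4 characters per token (assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Stable hash of a transcript window, used to skip already-processed turns.
function turnHash(messages: string[]): string {
  return createHash("sha256").update(messages.join("\n")).digest("hex");
}

const seen = new Set<string>();

// Returns true only when the turn is worth sending to the Observer LLM.
function shouldObserve(messages: string[]): boolean {
  const text = messages.join("\n");
  if (estimateTokens(text) < 50) return false; // trivial turn, skip
  const hash = turnHash(messages);
  if (seen.has(hash)) return false;            // already processed, skip
  seen.add(hash);
  return true;
}
```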
Recall Flow (Automatic)
User types a prompt
→ UserPromptSubmit hook fires
→ Extract keywords from prompt (stop-word filtering, file paths, quoted phrases)
→ FTS5 search with OR query for broad recall
→ Fallback: individual keyword search → category heuristic
→ Inject matching observations as context (max ~1K tokens)
→ Claude sees relevant memories before processing the prompt
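The keyword-extraction step can be sketched roughly as follows. The stop-word list, regexes, and function names here are hypothetical, but the shape matches the flow above: quoted phrases and file paths are kept verbatim, stop words are dropped, and the surviving terms are OR-ed together into one broad FTS5 query.

```typescript
const STOP_WORDS = new Set([
  "the", "a", "an", "and", "or", "but", "is", "are", "was", "to",
  "of", "in", "on", "for", "with", "how", "can", "you", "this", "that",
]);

// Pulls search terms out of a user prompt: quoted phrases and file
// paths are kept verbatim; remaining words are stop-word filtered.
function extractKeywords(prompt: string): string[] {
  const keywords: string[] = [];

  // Quoted phrases, e.g. "rate limiter"
  for (const m of prompt.matchAll(/"([^"]+)"/g)) keywords.push(m[1]);

  // File paths, e.g. src/auth/login.ts
  for (const m of prompt.matchAll(/[\w./-]+\.\w{1,4}\b/g)) keywords.push(m[0]);

  // Plain words, minus stop words
  for (const word of prompt.toLowerCase().match(/[a-z]{3,}/g) ?? []) {
    if (!STOP_WORDS.has(word)) keywords.push(word);
  }
  return [...new Set(keywords)];
}

// Broad-recall FTS5 query: quote each term and OR them together.
function toFtsQuery(keywords: string[]): string {
  return keywords.map((k) => `"${k.replace(/"/g, "")}"`).join(" OR ");
}
```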
Recall Flow (Manual)
AI calls recall(query="authentication")
→ FTS5 full-text search on observations
→ Filter by category / priority / files / branch
→ Group by date with priority emojis
→ Include compressed reflections if requested
→ Include current task context
→ Return formatted observation log
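The final grouping-and-formatting step might look like the sketch below. The `Observation` shape, the emoji mapping, and the date-heading style are illustrative assumptions, not the server's actual output format.

```typescript
interface Observation {
  content: string;
  category: string;
  priority: "high" | "medium" | "low";
  createdAt: string; // ISO timestamp
}

// Priority markers; the specific emoji set is an assumption.
const PRIORITY_EMOJI = { high: "🔴", medium: "🟡", low: "🟢" } as const;

// Group matched observations by calendar date (newest first) and render
// one emoji-tagged line per observation.
function formatObservationLog(observations: Observation[]): string {
  const byDate = new Map<string, Observation[]>();
  for (const obs of observations) {
    const date = obs.createdAt.slice(0, 10);
    if (!byDate.has(date)) byDate.set(date, []);
    byDate.get(date)!.push(obs);
  }
  const lines: string[] = [];
  for (const date of [...byDate.keys()].sort((a, b) => b.localeCompare(a))) {
    lines.push(`## ${date}`);
    for (const obs of byDate.get(date)!) {
      lines.push(`${PRIORITY_EMOJI[obs.priority]} [${obs.category}] ${obs.content}`);
    }
  }
  return lines.join("\n");
}
```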
Three Agents
Observer Agent
Extracts facts and decisions from AI-developer conversations. Runs frequently (every Stop event via hooks).
- Temperature: 0.3 (low enough to stay factual, with some freedom in phrasing)
- Input: Last 20 conversation messages
- Output: structured observations, each tagged with a priority and category
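The Observer's actual prompt is internal to codewatch-memory; the sketch below only illustrates the contract implied above (last 20 messages in, categorized observations out). The function name, prompt wording, and category taxonomy are all hypothetical.

```typescript
interface TranscriptMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical prompt builder for the Observer agent: takes the last 20
// transcript messages and asks the LLM for JSON observations that carry
// a priority and a category.
function buildObserverPrompt(messages: TranscriptMessage[]): string {
  const transcript = messages
    .slice(-20) // only the most recent window is sent
    .map((m) => `${m.role.toUpperCase()}: ${m.content}`)
    .join("\n");
  return [
    "Extract durable facts and decisions from this coding conversation.",
    'Return JSON: [{"content", "category", "priority"}], where category is',
    "one of decision|bugfix|architecture|preference (illustrative taxonomy)",
    "and priority is high|medium|low. Ignore small talk and tool noise.",
    "",
    transcript,
  ].join("\n");
}
```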
Tools (5)
- observe: Manually record an observation into the memory store.
- recall: Search and retrieve stored observations based on a query.
- reflect: Trigger the reflection agent to compress memory logs.
- get_session_info: Retrieve metadata about the current coding session.
- switch_context: Change the active project context or branch.
Environment Variables
- LLM_PROVIDER (required): The LLM provider to use for the Observer/Reflector agents (groq, google, or openai).
- LLM_API_KEY (required): API key for the selected LLM provider.
Configuration
{
  "mcpServers": {
    "codewatch": {
      "command": "npx",
      "args": ["-y", "codewatch-memory"],
      "env": {
        "LLM_PROVIDER": "groq",
        "LLM_API_KEY": "your-api-key"
      }
    }
  }
}