Codewatch Memory MCP Server


Add it to Claude Code

Run this in a terminal:

claude mcp add -e "LLM_PROVIDER=${LLM_PROVIDER}" -e "LLM_API_KEY=${LLM_API_KEY}" codewatch-memory -- npx -y codewatch-memory

Required environment variables: LLM_PROVIDER, LLM_API_KEY

codewatch-memory

Code-aware observational memory MCP server for AI coding assistants.

AI coding assistants (Claude Code, Cursor, Windsurf) suffer from session amnesia — context is lost on compaction or between sessions. Existing solutions are either framework-locked (Mastra's Observational Memory requires their agent framework) or simplistic (mcp-memory-keeper stores key-value pairs without intelligent compression).

codewatch-memory is an MCP-native server that implements observational memory specifically for coding workflows. It uses cheap LLMs (Groq, Gemini Flash) as Observer/Reflector agents to compress conversation context into a structured observation log, stored in SQLite, scoped per git branch.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│               Claude Code / Cursor / Windsurf                   │
│                       (MCP Client)                              │
└──────┬──────────────────┬──────────────────────┬────────────────┘
       │                  │                      │
  [Stdout/Stdin]   [Hook: Stop/PreCompact]  [Hook: UserPromptSubmit
  (MCP Tools)       (save observations)      /SessionStart]
       │                  │                  (recall context)
       v                  v                      │
┌────────────┐  ┌──────────────────┐             v
│ MCP Server │  │ Hook Subprocess  │   ┌──────────────────┐
│(stdio mode)│  │   --hook mode    │   │ Recall Subprocess│
└─────┬──────┘  └────────┬─────────┘   │  --recall mode   │
      │                  │             └────────┬─────────┘
      │  5 Tools         │ Transcript           │ FTS5 search
      │  observe/recall/ │ parsing &            │ keyword extraction
      │  reflect/        │ observation          │ (no LLM, ~50ms)
      │  get_session_info│ extraction           │
      │  switch_context  │                      │
      v                  v                      v
┌──────────────────────────────────────────────────────┐
│                     Agents Layer                     │
│  ┌───────────┐  ┌───────────┐  ┌─────────────┐       │
│  │ Observer  │  │ Reflector │  │ Categorizer │       │
│  │ extract   │  │ compress  │  │ classify    │       │
│  └─────┬─────┘  └─────┬─────┘  └──────┬──────┘       │
└────────┼──────────────┼───────────────┼──────────────┘
         │              │               │
         v              v               v
┌──────────────┐  ┌────────────────────────────┐
│ LLM Provider │  │     SQLite Database        │
│ Groq (free)  │  │  sessions / observations   │
│ Google       │  │  reflections / tasks       │
│ OpenAI       │  │  FTS5 full-text search     │
└──────────────┘  └────────────────────────────┘

How It Works

The Core Loop

Capture → Categorize → Store → Search → Compress

Three modes of operation:

  1. Hook mode (--hook) — Fires automatically after every AI response (Stop event) and before context compaction (PreCompact). Reads the last 20 messages from the conversation transcript, sends them to a cheap LLM, and extracts structured observations. No manual tool calls needed.

  2. Recall mode (--recall) — Fires on UserPromptSubmit (every user prompt) and SessionStart (session start/resume/compact). Searches stored observations via FTS5 keyword extraction and injects relevant context. Pure database queries, no LLM — adds ~50ms latency.

  3. MCP server mode — Runs as an MCP server with 5 tools that the AI assistant can call directly for manual observation, recall, and compression.
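Since one process serves all three modes, startup typically dispatches on the CLI flags. A minimal sketch (the `--hook` and `--recall` flags are from the docs above; the function name and return values are illustrative):

```python
def pick_mode(argv: list[str]) -> str:
    """Choose the run mode from CLI flags, mirroring the three modes above.

    --hook   -> process a transcript turn and store observations
    --recall -> search stored observations and print context to inject
    default  -> run as a long-lived MCP server over stdio
    """
    if "--hook" in argv:
        return "hook"
    if "--recall" in argv:
        return "recall"
    return "mcp-server"
```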

Observation Flow

Hook fires on Stop/PreCompact
  → Read last 20 transcript messages (JSONL)
  → Skip if < 50 tokens (trivial turn)
  → Skip if already processed (hash dedup)
  → Send to Observer LLM agent
  → Extract observations with priority + category
  → Store each in SQLite (FTS5 auto-indexed)
  → Update session stats
  → Check auto-reflect threshold (default 40K tokens)
     → If over: run Reflector with escalating compression
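The skip checks and the auto-reflect threshold in this flow can be sketched as follows. This is a minimal illustration, assuming a rough 4-characters-per-token estimate and an in-memory set standing in for the SQLite dedup table; all names here are hypothetical:

```python
import hashlib

TOKEN_ESTIMATE_DIVISOR = 4       # rough chars-per-token heuristic (assumption)
MIN_TOKENS = 50                  # skip trivial turns, as in the flow above
AUTO_REFLECT_THRESHOLD = 40_000  # default auto-reflect threshold (tokens)

_seen_hashes: set[str] = set()   # stands in for a SQLite dedup table

def should_observe(messages: list[str]) -> bool:
    """Return True if this batch of transcript messages is worth sending
    to the Observer LLM: non-trivial and not already processed."""
    text = "\n".join(messages)
    if len(text) // TOKEN_ESTIMATE_DIVISOR < MIN_TOKENS:
        return False  # trivial turn (< 50 tokens)
    digest = hashlib.sha256(text.encode()).hexdigest()
    if digest in _seen_hashes:
        return False  # already processed (hash dedup)
    _seen_hashes.add(digest)
    return True

def needs_reflection(session_tokens: int) -> bool:
    """Check the auto-reflect threshold after updating session stats."""
    return session_tokens > AUTO_REFLECT_THRESHOLD
```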

Recall Flow (Automatic)

User types a prompt
  → UserPromptSubmit hook fires
  → Extract keywords from prompt (stop-word filtering, file paths, quoted phrases)
  → FTS5 search with OR query for broad recall
  → Fallback: individual keyword search → category heuristic
  → Inject matching observations as context (max ~1K tokens)
  → Claude sees relevant memories before processing the prompt
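The keyword-extraction step above can be sketched roughly like this (stop-word list abbreviated, file-path regex simplified; the exact rules in codewatch-memory may differ):

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "to", "of", "in", "and", "or",
              "for", "on", "it", "this", "that", "we", "what", "was",
              "how", "do", "i"}
PATH_RE = r"[\w./-]+\.\w{1,4}\b"  # file paths like src/auth.ts (simplified)

def extract_keywords(prompt: str) -> list[str]:
    """Pull search terms from a user prompt: quoted phrases and file
    paths are kept whole, the rest is stop-word-filtered words."""
    keywords = re.findall(r'"([^"]+)"', prompt)        # quoted phrases
    rest = re.sub(r'"[^"]+"', " ", prompt)
    keywords += re.findall(PATH_RE, rest)              # file paths
    words = re.findall(r"[A-Za-z_]\w*", re.sub(PATH_RE, " ", rest))
    keywords += [w for w in words
                 if w.lower() not in STOP_WORDS and len(w) > 2]
    return keywords

def fts_or_query(keywords: list[str]) -> str:
    """Join keywords into a broad-recall FTS5 OR query, quoting each
    term so punctuation in paths doesn't break FTS5 syntax."""
    return " OR ".join('"{}"'.format(k.replace('"', "")) for k in keywords)
```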

Recall Flow (Manual)

AI calls recall(query="authentication")
  → FTS5 full-text search on observations
  → Filter by category / priority / files / branch
  → Group by date with priority emojis
  → Include compressed reflections if requested
  → Include current task context
  → Return formatted observation log

Three Agents

Observer Agent

Extracts facts and decisions from AI-developer conversations. Runs frequently (every Stop event via hooks).

  • Temperature: 0.3 (low enough to stay factual, with some flexibility in phrasing)
  • Input: Last 20 conversation messages
  • Output: Structured observations with priority and category
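End to end, the Observer's job is "messages in, structured observations out". A sketch with the LLM call stubbed out; the JSON shape is an assumption based on the priority and category fields mentioned in the observation flow above:

```python
import json

def call_llm(prompt: str) -> str:
    """Stub for the cheap Observer LLM (Groq / Gemini Flash). A real
    implementation would call the provider chosen via LLM_PROVIDER."""
    return json.dumps([{"content": "Chose SQLite with FTS5 for storage",
                        "category": "decision", "priority": "high"}])

def run_observer(messages: list[str]) -> list[dict]:
    """Send recent transcript messages to the Observer and parse the
    structured observations it returns, dropping malformed entries."""
    prompt = "Extract observations from:\n" + "\n".join(messages)
    raw = call_llm(prompt)
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return []  # an unparseable LLM response yields no observations
    return [o for o in parsed
            if isinstance(o, dict)
            and {"content", "category", "priority"} <= o.keys()]
```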

Tools (5)

observe: Manually record an observation into the memory store.
recall: Search and retrieve stored observations based on a query.
reflect: Trigger the reflection agent to compress memory logs.
get_session_info: Retrieve metadata about the current coding session.
switch_context: Change the active project context or branch.

Environment Variables

LLM_PROVIDER (required): The LLM provider to use for the observer/reflector agents (e.g., groq, google, openai).
LLM_API_KEY (required): API key for the selected LLM provider.

Configuration

claude_desktop_config.json
{
  "mcpServers": {
    "codewatch": {
      "command": "npx",
      "args": ["-y", "codewatch-memory"],
      "env": {
        "LLM_PROVIDER": "groq",
        "LLM_API_KEY": "your-api-key"
      }
    }
  }
}

Try it

Recall any previous decisions we made regarding the authentication flow.
Observe that we decided to use Tailwind CSS for the new dashboard layout.
Reflect on our recent progress to compress the memory logs.
What was the last context we discussed for the user profile module?

Frequently Asked Questions

What are the key features of Codewatch Memory?

  • Automatic capture of project decisions via hook-based observation
  • Intelligent context compression using Observer and Reflector agents
  • FTS5-powered full-text search for rapid memory recall
  • Branch-scoped memory storage in SQLite
  • Automatic context injection on session start and user prompts

What can I use Codewatch Memory for?

  • Maintaining architectural awareness across long-running coding sessions
  • Preventing context loss during AI conversation compaction
  • Quickly retrieving specific technical decisions made in previous git branches
  • Automating the documentation of project-specific implementation details

How do I install Codewatch Memory?

Install Codewatch Memory by running: npx -y codewatch-memory

What MCP clients work with Codewatch Memory?

Codewatch Memory works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
