# Massive Context MCP

Handle massive contexts (10M+ tokens) with chunking, sub-queries, and free local inference via Ollama.
```mermaid
flowchart TD
    A[Claude Code] --> B[RLM MCP Server]
    B --> C{rlm_ollama_status}
    C -->|cached 60s| D{provider = auto}
    D -->|Ollama running| E["🦙 Ollama<br/>gemma3:12b"]
    D -->|Ollama unavailable| F["☁️ Claude SDK<br/>claude-haiku-4-5"]
    E --> G[["💰 $0<br/>Free local inference"]]
    F --> H[["💰 ~$0.80/1M<br/>Cloud inference"]]
    style A fill:#ff922b,color:#fff
    style B fill:#339af0,color:#fff
    style E fill:#51cf66,color:#fff
    style F fill:#748ffc,color:#fff
    style G fill:#51cf66,color:#fff
    style H fill:#748ffc,color:#fff
```
Based on the Recursive Language Model pattern. Inspired by richardwhiteii/rlm.
## Core Idea
Instead of feeding massive contexts directly into the LLM:
- Load context as external variable (stays out of prompt)
- Inspect structure programmatically
- Chunk strategically (lines, chars, or paragraphs)
- Sub-query recursively on chunks
- Aggregate results for final synthesis
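The steps above can be sketched roughly in Python. This is an illustrative sketch of the pattern, not the server's actual API; the function names and chunk size here are made up for the example, and `sub_query` stands in for any LLM call:

```python
# Illustrative sketch of the recursive-chunking pattern (not the server's real API).

def chunk_by_lines(text: str, lines_per_chunk: int = 200) -> list[str]:
    """Split a large context into fixed-size line chunks."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]

def analyze(context: str, question: str, sub_query) -> str:
    """Chunk the context, sub-query each chunk, then synthesize."""
    # 1. The context stays in a variable -- it is never pasted into one giant prompt.
    chunks = chunk_by_lines(context)
    # 2. Sub-query each chunk independently (sub_query is any LLM call).
    partials = [sub_query(f"{question}\n\n{chunk}") for chunk in chunks]
    # 3. Aggregate the partial answers for a final synthesis pass.
    return sub_query(question + "\n\n" + "\n".join(partials))
```

Because each sub-query sees only one chunk plus the question, the per-call prompt stays small no matter how large the original context is.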
## Quick Start

### Installation

#### Option 1: PyPI (Recommended)

```bash
uvx massive-context-mcp
# or
pip install massive-context-mcp
```
With optional extras (quoted so the brackets survive shells like zsh):

```bash
# With Code Firewall integration (security filter for rlm_exec)
pip install "massive-context-mcp[firewall]"

# With Claude Agent SDK (for programmatic Claude API access)
pip install "massive-context-mcp[claude]"

# With all extras
pip install "massive-context-mcp[firewall,claude]"
```
#### Option 2: Claude Desktop One-Click

Download the `.mcpb` from Releases and double-click to install.
#### Option 3: From Source

```bash
git clone https://github.com/egoughnour/massive-context-mcp.git
cd massive-context-mcp
uv sync
```
### Wire to Claude Code / Claude Desktop

Add to `~/.claude/.mcp.json` (Claude Code) or `claude_desktop_config.json` (Claude Desktop):

```json
{
  "mcpServers": {
    "massive-context": {
      "command": "uvx",
      "args": ["massive-context-mcp"],
      "env": {
        "RLM_DATA_DIR": "~/.rlm-data",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
```
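To confirm that Ollama is reachable at the configured `OLLAMA_URL` before relying on free local inference, you can probe its HTTP endpoint directly; a running Ollama server answers on its root path. A minimal standalone sketch (this helper is not part of the MCP server):

```python
import urllib.request
import urllib.error

def ollama_available(url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at `url`."""
    try:
        # A running Ollama instance responds 200 ("Ollama is running") at its root.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: treat as unavailable.
        return False
```

If this returns `False`, the server falls back to cloud inference (the `provider = auto` branch in the diagram above).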
## Tools

### Setup & Status Tools

| Tool | Purpose |
|---|---|
| `rlm_system_check` | Check system requirements — verify macOS, Apple Silicon, 16GB+ RAM, Homebrew |
| `rlm_setup_ollama` | Install via Homebrew — managed service, auto-updates, requires Homebrew |
| `rlm_setup_ollama_direct` | Install via direct download — no sudo, fully headless, works on locked-down machines |
| `rlm_ollama_status` | Check Ollama availability — detect if free local inference is available |
Analysis Tools
| Tool | Purpose |
|---|---|
rlm_auto_analyze |
One-step analysis — auto-detects type, chunks, and q |
## Environment Variables

| Variable | Description |
|---|---|
| `RLM_DATA_DIR` | Directory path for storing RLM data. |
| `OLLAMA_URL` | URL for the Ollama inference service. |
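A server reading these variables would typically fall back to defaults and expand `~` itself, since MCP clients pass env values verbatim. A rough sketch (the defaults below simply mirror the config example above; the server's actual fallbacks may differ):

```python
import os

def load_settings(env=None) -> dict:
    """Resolve RLM settings from the environment, with illustrative defaults."""
    env = os.environ if env is None else env
    return {
        # expanduser turns "~/.rlm-data" into an absolute home-relative path.
        "data_dir": os.path.expanduser(env.get("RLM_DATA_DIR", "~/.rlm-data")),
        "ollama_url": env.get("OLLAMA_URL", "http://localhost:11434"),
    }
```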