MCP-RLM
Recursive Language Model Agent
Infinite Context Reasoning for Large Language Models
Features • Installation • Configuration • Usage • Architecture
📋 Overview
MCP-RLM is an open-source implementation of the Recursive Language Models (RLMs) architecture introduced by researchers at MIT CSAIL (Zhang et al., 2025). It enables LLMs to process documents far beyond their context window limits through programmatic decomposition and recursive querying.
The Challenge
| Traditional LLM Approach | MCP-RLM Approach |
|---|---|
| ❌ Limited to 4K-128K token context windows | ✅ Handles 10M+ tokens seamlessly |
| ❌ Context degradation ("lost in the middle") | ✅ Maintains accuracy through chunked analysis |
| ❌ Expensive for long documents ($15/1M tokens) | ✅ Cost-effective ($3/1M tokens, 80% savings) |
| ❌ Single-pass processing bottleneck | ✅ Parallel recursive decomposition |
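The cost figures in the table follow from simple per-token arithmetic. A minimal sketch (the prices match the table above; the function name is illustrative, not part of the project's API):

```python
def chunked_cost(total_tokens: int, price_per_m: float) -> float:
    """Dollars to process total_tokens at price_per_m dollars per 1M tokens."""
    return total_tokens / 1_000_000 * price_per_m

doc_tokens = 10_000_000                         # a 10M-token document
single_pass = chunked_cost(doc_tokens, 15.0)    # one frontier-model pass
sub_agents = chunked_cost(doc_tokens, 3.0)      # routed to cheap worker models

savings = 1 - sub_agents / single_pass
print(f"single pass: ${single_pass:.0f}, sub-agents: ${sub_agents:.0f}, "
      f"savings: {savings:.0%}")
# → single pass: $150, sub-agents: $30, savings: 80%
```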
✨ Features
Core Capabilities

- Processes documents of 10M+ tokens, far beyond native context window limits
- Two-tier agent system: a strategic root agent plus lightweight sub-agent workers
- Parallel, chunk-level decomposition that avoids "lost in the middle" degradation
- Roughly 80% cost savings by routing chunk extraction to inexpensive worker models

Technical Highlights

- Runs as an MCP server over stdio, usable from any MCP-compatible client
- Provider-agnostic: OpenRouter, OpenAI, Anthropic, or local models via Ollama
- Python 3.10+, configured through a simple `.env` file
🏗 Architecture
MCP-RLM employs a two-tier agent system that separates strategic planning from execution:
```mermaid
graph TB
    subgraph Input
        A[User Query]
        B[Large Document<br/>10M+ tokens]
    end
    subgraph "Root Agent (Planner)"
        C[Analyze Metadata]
        D[Generate Strategy]
        E[Write Python Code]
    end
    subgraph "Execution Layer"
        F[Python REPL]
        G[Chunk Manager]
    end
    subgraph "Sub Agents (Workers)"
        H1[Worker 1]
        H2[Worker 2]
        H3[Worker N]
    end
    subgraph Output
        I[Aggregated Results]
        J[Final Answer]
    end
    A --> C
    B --> C
    C --> D
    D --> E
    E --> F
    F --> G
    G --> H1
    G --> H2
    G --> H3
    H1 --> I
    H2 --> I
    H3 --> I
    I --> J
    style A fill:#e3f2fd
    style B fill:#e3f2fd
    style C fill:#fff9c4
    style D fill:#fff9c4
    style E fill:#fff9c4
    style F fill:#f3e5f5
    style G fill:#f3e5f5
    style H1 fill:#e8f5e9
    style H2 fill:#e8f5e9
    style H3 fill:#e8f5e9
    style I fill:#fce4ec
    style J fill:#fce4ec
```
Agent Roles
| Agent | Responsibility | Characteristics | Model Recommendations |
|---|---|---|---|
| Root Agent | Strategic planning and code generation | • Views metadata only<br>• Generates Python strategies<br>• Called 5-10 times per query | • Claude 3.5 Sonnet<br>• GPT-4o<br>• Mistral Large |
| Sub Agent | Chunk-level data extraction | • Reads small segments<br>• Extracts specific info<br>• Called 100-1000+ times | • GPT-4o-mini<br>• Claude Haiku<br>• Qwen 2.5 (free) |
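The division of labor in the table above can be sketched as a simple planner/worker loop. Everything below is a hypothetical illustration: `call_root` and `call_worker` stand in for real LLM API calls, and fixed-size slicing is the simplest possible chunking strategy, not necessarily the one the server uses.

```python
from typing import Callable

def run_rlm(query: str, document: str,
            call_root: Callable[[str], str],
            call_worker: Callable[[str], str],
            chunk_size: int = 100_000) -> str:
    """Sketch of the two-tier RLM loop: root plans, workers extract, root aggregates."""
    # 1. The root agent sees only metadata, never the full document.
    metadata = f"doc length: {len(document)} chars, query: {query}"
    plan = call_root(f"Plan an extraction strategy. {metadata}")

    # 2. Each sub-agent reads one small chunk and extracts what the plan asks for.
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    findings = [call_worker(f"{plan}\n\nChunk:\n{chunk}") for chunk in chunks]

    # 3. The root agent aggregates worker findings into the final answer.
    return call_root("Aggregate these findings:\n" + "\n".join(findings))
```

Note the call asymmetry the table describes: the root runs twice here (plan, aggregate) while a worker runs once per chunk, which is why cheap models are recommended for the worker role.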
🚀 Installation
Prerequisites
```
# Required
- Python 3.10 or higher
- pip package manager

# API Keys (choose at least one)
- OpenRouter API key (recommended for free tier)
- OpenAI API key
- Anthropic API key
- Ollama (for local deployment)
```
Quick Start
```bash
# Clone the repository
git clone https://github.com/MuhammadIndar/MCP-RLM.git
cd MCP-RLM

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Linux/macOS:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.EXAMPLE .env
# Edit .env with your API keys

# Start the server
python server.py
```
Expected Output:

```
MCP RLM Server Started...
Listening on stdio...
```
⚙ Configuration
1. Environment Setup
Copy the example environment file:
```bash
cp .env.EXAMPLE .env
```
Edit .env with your credentials:
```bash
# OpenRouter (Recommended - includes free tier)
OPENROUTER_API_KEY=sk-or-v1-xxxxx

# OpenAI Official
OPENAI_API_KEY=sk-xxxxx

# Anthropic Official
ANTHROPIC_API_KEY=sk-ant-xxxxx
```
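Only one key is required. As a hedged sketch of how a server might choose among them (the `pick_provider` helper and the OpenRouter-first precedence are illustrative assumptions, not this project's documented behavior):

```python
import os

def pick_provider() -> str:
    """Return the first provider whose API key is set in the environment."""
    for env_var, provider in [("OPENROUTER_API_KEY", "openrouter"),
                              ("OPENAI_API_KEY", "openai"),
                              ("ANTHROPIC_API_KEY", "anthropic")]:
        if os.getenv(env_var):
            return provider
    raise RuntimeError("No API key found; copy .env.EXAMPLE to .env and set one")
```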
Environment Variables
| Variable | Description |
|---|---|
| `OPENROUTER_API_KEY` | API key for OpenRouter (recommended for free tier) |
| `OPENAI_API_KEY` | Official OpenAI API key |
| `ANTHROPIC_API_KEY` | Official Anthropic API key |

2. MCP Client Configuration
```json
{
  "mcpServers": {
    "mcp-rlm": {
      "command": "python",
      "args": ["/path/to/MCP-RLM/server.py"],
      "env": {
        "OPENROUTER_API_KEY": "your-key-here"
      }
    }
  }
}
```