Hybrid RAG Project MCP Server

Local setup required. This server has to be cloned and prepared on your machine before you register it in Claude Code.

1. Set the server up locally

Run this once to clone and prepare the server before adding it to Claude Code.

Run in terminal
git clone <your-repo-url>
cd hybrid-rag-project
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2. Register it in Claude Code

After the local setup is done, run this command to point Claude Code at the server script. Run it with the virtual environment from step 1 active so that python resolves to the interpreter with the installed dependencies.

Run in terminal
claude mcp add hybrid-rag-project -- python "<FULL_PATH_TO_HYBRID_RAG_PROJECT>/scripts/mcp_server_claude.py"

Replace <FULL_PATH_TO_HYBRID_RAG_PROJECT> with the absolute path of the folder you prepared in step 1.

README.md

A generalized RAG system with hybrid search capabilities for any documents.

Hybrid RAG Project

A generalized Retrieval-Augmented Generation (RAG) system with hybrid search capabilities that works with any documents you provide. It combines semantic (dense vector) search with keyword (sparse BM25) search to improve document retrieval, and exposes an MCP server API for easy integration.

🎯 Key Features: Multi-format support • Local LLM • Claude Desktop integration • Structured data queries • Document-type-aware retrieval

🚀 Quick Start (No MCP Required!)

You don't need Claude Desktop or MCP to use this project! Just run:

# 1. Make sure Ollama is running
ollama serve

# 2. Activate virtual environment
source .venv/bin/activate

# 3. Start conversational demo (recommended)
python scripts/demos/conversational.py

# Or use the shortcut
./scripts/bin/ask.sh

That's it! Ask questions about the 43,835 document chunks in the sample dataset.

📖 See Quick Start Guide for complete usage instructions.

📚 Browse all documentation in the docs/ folder or start with docs/README.md.


Overview

This project implements a hybrid RAG system that combines:

  • Semantic Search: Dense vector embeddings for understanding meaning and context
  • Keyword Search: BM25 sparse retrieval for exact keyword matching
  • Hybrid Fusion: Reciprocal Rank Fusion (RRF) to combine results from both methods
  • MCP Server: Both REST API and Model Context Protocol server for Claude integration
  • Multi-format Support: Automatically loads documents from various file formats

The hybrid approach aims to improve retrieval accuracy by combining the strengths of both methods: semantic search catches paraphrases and related concepts, while BM25 catches exact terms.
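To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not the project's implementation: the function name, the sample document IDs, and the constant k=60 (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a project setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers, best match first.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists (doc_b, doc_a) rise to the top, while documents found by only one retriever still appear further down.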

Features

  • Vector-based semantic search using Chroma and Ollama embeddings
  • BM25 keyword search for exact term matching
  • Ensemble retriever with Reciprocal Rank Fusion (RRF)
  • Integration with local Ollama LLM for answer generation
  • Support for multiple document formats (TXT, PDF, MD, DOCX, CSV)
  • Automated document loading from data directory
  • RESTful API server with /ingest and /query endpoints
  • Model Context Protocol (MCP) server for Claude Desktop/API integration
  • Configuration-driven architecture (no hardcoded values)
  • Persistent vector store for faster subsequent queries
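As a rough illustration of automated loading from the data directory, the sketch below collects files whose extensions match the formats listed above. The function name and return shape are hypothetical and do not reflect the actual API of document_loader.py; parsing each format is out of scope here.

```python
from pathlib import Path

# Extensions taken from the feature list; everything else is an assumption.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".md", ".docx", ".csv"}


def find_documents(data_dir):
    """Return supported files under data_dir, sorted for stable ordering."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for path in find_documents("data"):
        print(path)
```

Matching on the lowercased suffix means files such as REPORT.PDF are picked up as well.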

Architecture

User Documents → data/ directory
                      ↓
            Document Loader
                      ↓
Query → Hybrid Retriever → [Vector Retriever + BM25 Retriever]
                         → RRF Fusion
                         → Retrieved Context
                         → LLM (Ollama)
                         → Final Answer

Prerequisites

  1. Python 3.9+
  2. Ollama installed and running locally
  3. Required Ollama models:
    • llama3.1:latest (or another LLM model)
    • nomic-embed-text (or another embedding model)

Installing Ollama

Visit ollama.ai to download and install Ollama for your platform.

After installation, pull the required models:

ollama pull llama3.1:latest
ollama pull nomic-embed-text

Verify Ollama is running:

curl http://localhost:11434/api/tags
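The same check can be scripted. The sketch below calls the /api/tags endpoint and reports which of the required models are missing. It assumes the response is a JSON object with a "models" list whose entries carry a "name" field such as "nomic-embed-text:latest"; the helper names are hypothetical.

```python
import json
import urllib.request

REQUIRED_MODELS = ["llama3.1:latest", "nomic-embed-text"]


def missing_models(tags_payload, required=REQUIRED_MODELS):
    """Return the required models that are absent from an /api/tags payload."""
    installed = {m.get("name", "") for m in tags_payload.get("models", [])}

    def is_installed(name):
        # A requirement with an explicit tag must match exactly; one without
        # a tag matches any installed tag of that model.
        if ":" in name:
            return name in installed
        return any(item.split(":", 1)[0] == name for item in installed)

    return [name for name in required if not is_installed(name)]


def fetch_tags(base_url="http://localhost:11434"):
    """Fetch the list of locally installed models from a running Ollama."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        missing = missing_models(fetch_tags())
    except OSError as exc:  # connection refused, timeout, and similar errors
        print(f"Ollama is not reachable: {exc}")
    else:
        print(f"Missing models: {missing}" if missing else "All required models are installed.")
```

If a model is reported missing, pull it with the ollama pull commands shown above.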

Installation

  1. Clone the repository:
git clone <your-repo-url>
cd hybrid-rag-project
  2. Create a virtual environment:
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

Project Structure

hybrid-rag-project/
├── src/
│   └── hybrid_rag/            # Core application package
│       ├── __init__.py        # Package initialization
│       ├── document_loader.py # Document loading utility
│       ├── structured_query.py # CSV query engine
│       └── utils.py           # Logging and utility functions
├── scripts/
│   ├── run_demo.py            # Main demonstration script
│   ├── mcp_server.py          # REST API server
│   └── mcp_server_claude.py   # MCP server for Claude integration
├── config/
│   ├── config.yaml            # Configuration file
│   └── claude_desktop_config.json # Sample Claude Desktop MCP config
├── docs/
│   ├── INSTALLATION.md        # Detailed installation guide
│   ├── STRUCTURED_QUERIES.md  # CSV query documentation
│   ├── ASYNC_INGESTION.md     # Async ingestion guide
│   └── SHUTDOWN.md            # Shutdown handling guide
├── data/                      # Sample data files (13 files included)
│   ├── *.csv                  # 7 CSV files (structured data)
│   ├── *.md                   # 5 Markdown files (unstructured)
│   └── *.txt                  # 1 Text file (technical specs)
├── chroma_db/

Tools (2)

  • query: Performs a hybrid search across the document store using both semantic and keyword matching.
  • ingest: Ingests new documents from the data directory into the vector store.

Configuration

claude_desktop_config.json
{
  "mcpServers": {
    "hybrid-rag": {
      "command": "python",
      "args": ["/path/to/hybrid-rag-project/scripts/mcp_server_claude.py"]
    }
  }
}
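Because the args entry needs an absolute path, it can help to generate the configuration rather than type it by hand. The sketch below builds the same structure as the configuration above for a given project folder. The helper name is hypothetical, and you may need to pass the path to the virtual environment's interpreter instead of a bare python.

```python
import json
from pathlib import Path


def build_mcp_config(project_dir, python_command="python"):
    """Build the mcpServers entry for a given project folder."""
    # resolve() turns a relative folder such as "." into an absolute path.
    script = Path(project_dir).resolve() / "scripts" / "mcp_server_claude.py"
    return {
        "mcpServers": {
            "hybrid-rag": {
                "command": python_command,
                "args": [str(script)],
            }
        }
    }


if __name__ == "__main__":
    # Run from the repository root and paste the output into
    # claude_desktop_config.json.
    print(json.dumps(build_mcp_config("."), indent=2))
```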

Try it

→ Search the local documents for information regarding the project's architecture.
→ Find relevant documents about the installation process using keyword matching.
→ Query the document store for details on how the hybrid fusion method works.

Frequently Asked Questions

What are the key features of Hybrid RAG Project?

It combines semantic vector search with BM25 keyword matching and merges the two result lists with Reciprocal Rank Fusion (RRF). It supports TXT, PDF, MD, DOCX, and CSV files, integrates with a local Ollama LLM for private document querying, and keeps a persistent Chroma vector store so that subsequent queries run faster.

What can I use Hybrid RAG Project for?

Typical uses include querying large collections of technical documentation for specific implementation details, running hybrid searches across structured CSV data and unstructured Markdown files, building a private, local-only RAG pipeline for sensitive document analysis, and giving Claude domain-specific context from local files.

How do I install Hybrid RAG Project?

Install Hybrid RAG Project by running: git clone <your-repo-url> && cd hybrid-rag-project && python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt

What MCP clients work with Hybrid RAG Project?

Hybrid RAG Project works with any MCP-compatible client, including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
