Gnosis MCP

Turn your docs into a searchable knowledge base for AI agents. `pip install`, ingest, serve.
Quick Start · Git History · Web Crawl · Backends · Editors · Tools · Embeddings · Full Reference
Ingest docs → Search with highlights → Stats overview → Serve to AI agents
Without a docs server
- LLMs hallucinate API signatures that don't exist
- Entire files dumped into context — 3,000 to 8,000+ tokens each
- Architecture decisions buried across dozens of files
With Gnosis MCP
- `search_docs` returns ranked, highlighted excerpts (~600 tokens)
- Real answers grounded in your actual documentation
- Works across hundreds of docs instantly
Features
- Zero config — SQLite by default, `pip install` and go
- Hybrid search — keyword (BM25) + semantic (local ONNX embeddings, no API key)
- Git history — ingest commit messages as searchable context (`ingest-git`)
- Web crawl — ingest documentation from any website via sitemap or link crawl
- Multi-format — `.md` `.txt` `.ipynb` `.toml` `.csv` `.json` + optional `.rst` `.pdf`
- Auto-linking — `relates_to` frontmatter creates a navigable document graph
- Watch mode — auto-re-ingest on file changes
- PostgreSQL ready — pgvector + tsvector when you need scale
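Auto-linking reads the `relates_to` field from document frontmatter. A minimal sketch of what that might look like (the file contents and any fields besides `relates_to` are illustrative assumptions, not a documented schema):

```markdown
---
title: Authentication
relates_to:
  - architecture/overview.md
  - guides/sessions.md
---

# Authentication

How sessions and tokens fit together...
```

Each listed path becomes an edge in the document graph, so an agent that finds this page can hop to the related docs.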
Quick Start
pip install gnosis-mcp
gnosis-mcp ingest ./docs/ # loads docs into SQLite (auto-created)
gnosis-mcp serve # starts MCP server
That's it. Your AI agent can now search your docs.
Want semantic search? Add local embeddings — no API key needed:
pip install gnosis-mcp[embeddings]
gnosis-mcp ingest ./docs/ --embed # ingest + embed in one step
gnosis-mcp serve # hybrid search auto-activated
Test it before connecting to an editor:
gnosis-mcp search "getting started" # keyword search
gnosis-mcp search "how does auth work" --embed # hybrid semantic+keyword
gnosis-mcp stats # see what was indexed
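Hybrid search generally means blending a keyword relevance score with an embedding similarity score. A minimal sketch of that idea (this is not Gnosis MCP's actual scoring code; the `alpha` weight and normalization are assumptions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(bm25, max_bm25, similarity, alpha=0.5):
    """Blend BM25 (normalized to [0, 1]) with semantic similarity."""
    keyword = bm25 / max_bm25 if max_bm25 else 0.0
    return alpha * keyword + (1 - alpha) * similarity
```

With `alpha=0.5` a document scoring half the best BM25 score and 0.8 cosine similarity lands at 0.65, ahead of a pure-keyword match with no semantic overlap.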
Try without installing (uvx)
uvx gnosis-mcp ingest ./docs/
uvx gnosis-mcp serve
Web Crawl
Dry-run discovery → Crawl & ingest → Search crawled docs → SSRF protection
Ingest docs from any website — no local files needed:
pip install gnosis-mcp[web]
# Crawl via sitemap (best for large doc sites)
gnosis-mcp crawl https://docs.stripe.com/ --sitemap
# Depth-limited link crawl with URL filter
gnosis-mcp crawl https://fastapi.tiangolo.com/ --depth 2 --include "/tutorial/*"
# Preview what would be crawled
gnosis-mcp crawl https://docs.python.org/ --dry-run
# Force re-crawl + embed for semantic search
gnosis-mcp crawl https://docs.sveltekit.dev/ --sitemap --force --embed
Respects robots.txt, caches with ETag/Last-Modified for incremental re-crawl, and rate-limits requests (5 concurrent, 0.2s delay). Crawled pages use the URL as the document path and hostname as the category — searchable like any other doc.
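The ETag/Last-Modified caching above boils down to sending conditional request headers and skipping pages the server reports as unchanged. A sketch of that logic (the cache-record shape and function names are assumptions, not Gnosis MCP internals):

```python
def conditional_headers(cached):
    """Build If-None-Match / If-Modified-Since headers from a cached page record."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers

def needs_reingest(status_code):
    """A 304 Not Modified response means the cached copy is still current."""
    return status_code != 304
```

On a re-crawl, a page answered with `304` is skipped entirely, which is what makes incremental crawls of large doc sites cheap.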
Git History
Turn commit messages into searchable context — your agent learns why things were built, not just what exists:
gnosis-mcp ingest-git . # current repo, all files
gnosis-mcp ingest-git /path/to/repo --since 6m # last 6 months only
gnosis-mcp ingest-git . --include "src/*" --max-commits 5 # filtered + limited
gnosis-mcp ingest-git . --dry-run
Tools (5)
- `search_docs` — Performs keyword or semantic search across indexed documentation to retrieve ranked, highlighted excerpts.
- `ingest` — Loads documentation files from a local directory into the searchable database.
- `ingest-git` — Ingests git commit messages as searchable context.
- `crawl` — Ingests documentation from a website via sitemap or link crawl.
- `stats` — Provides an overview of indexed documentation and database status.

Configuration
{
  "mcpServers": {
    "gnosis": {
      "command": "gnosis-mcp",
      "args": ["serve"]
    }
  }
}