VAHAN Data
Scrape India's national vehicle registration database (VAHAN Dashboard) and serve it as a Model Context Protocol (MCP) server — queryable by Claude and any MCP-compatible client.
Data coverage
- 36 states / union territories (including All India)
- Available for combinations of X and Y axes (year- and state-level data)
- Available axes:
  - X-axis: Fuel, Maker, Vehicle Class, Vehicle Category Group, Norms, Month Wise
  - Y-axis: Fuel, Maker, Vehicle Class, Vehicle Category Group, Norms
1. Setup
Requirements: Python 3.11+, Node.js (for mcp-remote / Claude Desktop)
# Clone / enter the project directory
cd vahandata
# Create virtual environment
python3 -m venv .venv
# Install Python dependencies
# (version specifiers are quoted so the shell does not treat ">" as a redirect)
.venv/bin/pip install "playwright>=1.44.0" "pandas>=2.2.0" openpyxl mcp uvicorn starlette
# Install Playwright browsers (needed for scraping only)
.venv/bin/playwright install chromium
2. Scraping
`vahan_scraper.py` — Unified Scraper
The project uses a unified Playwright-based scraper to collect data from the VAHAN Dashboard. It automates selection, extraction, and consolidation of multi-dimensional registration data.
Scraper Usage
--year is the only mandatory parameter. By default, the scraper iterates through multiple X/Y axis combinations and all Indian states.
# Full extraction (Broad defaults: Multiple X/Y combinations, all states)
.venv/bin/python scraping/vahan_scraper.py --year 2025
# Targeted extraction: Maker x Fuel breakdown for specific states
.venv/bin/python scraping/vahan_scraper.py \
--year 2025 \
--state "DELHI" "HARYANA" \
--xaxis "Fuel" \
--yaxis "Maker"
Naming Scheme: Generated files follow the pattern data/[xaxis]_[yaxis]_[year].csv (e.g., data/Fuel_Maker_2025.csv). Each file contains consolidated registration data across all states.
| Flag | Default | Description |
|---|---|---|
| --year | (Required) | Calendar year to scrape. |
| --state | ALL | List of states to scrape. Fetches all ~36 states if omitted. |
| --xaxis | ["Month Wise", "Fuel", "Norms"] | One or more X-Axis variables to scrape. |
| --yaxis | ["Vehicle Class", "Maker", "Fuel"] | One or more Y-Axis variables to scrape. |
| --out | data | Output directory where CSVs are saved. |
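To make the naming scheme and the default axes concrete, here is a minimal sketch that lists the files a default run would be expected to produce. It is not taken from `vahan_scraper.py`: the helper names `output_path` and `planned_outputs` are hypothetical, and it assumes axis names are written into the filename verbatim (spaces included) and that every X/Y pair is scraped, including pairs such as Fuel x Fuel that the real scraper may skip.

```python
from itertools import product
from pathlib import Path

# Defaults copied from the flag table above.
DEFAULT_XAXIS = ["Month Wise", "Fuel", "Norms"]
DEFAULT_YAXIS = ["Vehicle Class", "Maker", "Fuel"]


def output_path(xaxis: str, yaxis: str, year: int, out_dir: str = "data") -> Path:
    """Build a path following the data/[xaxis]_[yaxis]_[year].csv pattern.

    Assumption: axis names are used verbatim, spaces included.
    """
    return Path(out_dir) / f"{xaxis}_{yaxis}_{year}.csv"


def planned_outputs(year, xaxes=DEFAULT_XAXIS, yaxes=DEFAULT_YAXIS, out_dir="data"):
    """Return one output path per X/Y combination (hypothetical helper)."""
    return [output_path(x, y, year, out_dir) for x, y in product(xaxes, yaxes)]


if __name__ == "__main__":
    # Print the expected output files for a default --year 2025 run.
    for path in planned_outputs(2025):
        print(path)
```

With three X-axis and three Y-axis defaults, this enumerates nine candidate files; passing a single `--xaxis` and `--yaxis` corresponds to one file.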
3. MCP Server
mcp_server.py reads the scraped CSVs, builds a SQLite database (db/vahan.db) on first run, and exposes the data as MCP tools and resources.
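The CSV-to-SQLite step can be pictured with a short standard-library sketch. This is an illustration, not the code in mcp_server.py: the function name `build_db`, the one-table-per-CSV layout, and storing every column as TEXT are all assumptions made for the example.

```python
import csv
import sqlite3
from contextlib import closing
from pathlib import Path


def build_db(csv_dir: str = "data", db_path: str = "db/vahan.db") -> list[str]:
    """Load every CSV in csv_dir into its own SQLite table; return table names."""
    db_file = Path(db_path)
    db_file.parent.mkdir(parents=True, exist_ok=True)
    tables = []
    with closing(sqlite3.connect(db_file)) as conn:
        for csv_file in sorted(Path(csv_dir).glob("*.csv")):
            with csv_file.open(newline="", encoding="utf-8") as fh:
                rows = list(csv.reader(fh))
            if not rows:
                continue  # skip empty files
            header, body = rows[0], rows[1:]
            table = csv_file.stem  # assumption: table name = file name without .csv
            # Quote identifiers so names containing spaces remain valid SQL.
            cols = ", ".join(f'"{c}" TEXT' for c in header)
            marks = ", ".join("?" for _ in header)
            conn.execute(f'DROP TABLE IF EXISTS "{table}"')
            conn.execute(f'CREATE TABLE "{table}" ({cols})')
            conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', body)
            tables.append(table)
        conn.commit()
    return tables


if __name__ == "__main__":
    # Mirror the "build on first run" behaviour: only build when the file is missing.
    if not Path("db/vahan.db").exists():
        print(build_db())
```

Checking for the database file before building is one simple way to get the "first run only" behaviour described above; deleting db/vahan.db would then force a rebuild after a fresh scrape.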
.venv/bin/python3 mcp_server.py [--transport {stdio|http}] [--host HOST] [--port PORT]
| Flag | Default | Description |
|---|---|---|
| --transport | stdio | Transport to use. stdio for local Claude Desktop use; http for web hosting. |
| --host | 0.0.0.0 | Host to bind to (HTTP transport only). Use 127.0.0.1 when behind a reverse proxy. |
| --port | 8000 | Port to listen on (HTTP transport only). |
stdio transport (local)
Default mode — launched by Claude Desktop directly over stdin/stdout. No network port opened.
.venv/bin/python3 mcp_server.py
# or explicitly:
.venv/bin/python3 mcp_server.py --transport stdio
HTTP transport (web)
Runs a Streamable HTTP server. MCP endpoint: http://<host>:<port>/mcp
# Bind to all interfaces (public VPS)
.venv/bin/python3 mcp_server.py --transport http
# Bind to localhost only (behind a reverse proxy)
.venv/bin/python3 mcp_server.py --transport http --host 127.0.0.1
# Custom port
.venv/bin/python3 mcp_server.py --transport http --host 127.0.0.1 --port 9000
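The flags and defaults documented in this section map naturally onto an `argparse` parser. The sketch below is a reconstruction from the flag table only, not the actual contents of mcp_server.py; `build_parser` and `endpoint_url` are hypothetical names, and the URL shape assumes the endpoint is served at http://<host>:<port>/mcp.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (reconstructed, not the real source)."""
    parser = argparse.ArgumentParser(prog="mcp_server.py")
    parser.add_argument("--transport", choices=["stdio", "http"], default="stdio",
                        help="stdio for local Claude Desktop use; http for web hosting")
    parser.add_argument("--host", default="0.0.0.0",
                        help="host to bind to (HTTP transport only)")
    parser.add_argument("--port", type=int, default=8000,
                        help="port to listen on (HTTP transport only)")
    return parser


def endpoint_url(host: str, port: int) -> str:
    """Return the MCP endpoint URL for the HTTP transport (assumed /mcp path)."""
    return f"http://{host}:{port}/mcp"


def describe(argv: list[str]) -> str:
    """Summarize what a given command line would start."""
    args = build_parser().parse_args(argv)
    if args.transport == "http":
        return f"HTTP transport at {endpoint_url(args.host, args.port)}"
    return "stdio transport (no network port opened)"
```

For example, `describe(["--transport", "http", "--host", "127.0.0.1", "--port", "9000"])` corresponds to the last command above, and `describe([])` corresponds to the default stdio mode.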
4. Hosting
The server is configured to run behind an Nginx reverse proxy on a custom domain (vahanmcp.shubhamgrg.com).
- Reverse Proxy: Nginx listens on port 80/443 and forwards traffic to 127.0.0.1:8000.
- SSL: Automated via Certbot (Let's Encrypt).
This project includes vahan-mcp.nginx and an automated deploy.sh script to handle the configuration.
5. Connecting Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json.
Tools
- query_vahan_data: Queries the consolidated VAHAN vehicle registration database for specific metrics.