Claude MCP Data Engineer MCP Server

Local setup required. This server has to be cloned and prepared on your machine before you register it in Claude Code.

1. Set the server up locally

Clone the repository, then run this once from the cloned folder to install the server's Python dependencies before adding it to Claude Code.

Run in terminal
pip install -r requirements.txt

2. Register it in Claude Code

After the local setup is done, run this command to point Claude Code at the server.

Run in terminal
claude mcp add data-engineer -- python -m src.server

python -m src.server resolves the src package relative to the working directory, so launch Claude Code from the folder you prepared in step 1.

README.md

Powerful tools for SQL, DBT, Snowflake, CSV analysis, and ETL lineage.

Claude MCP Data Engineer Server

A Model Context Protocol (MCP) server built for data engineers. It gives Claude tools for SQL, dbt, Snowflake, CSV analysis, pipeline validation, and ETL lineage.

Tools Available

| Tool | Description |
| --- | --- |
| format_sql | Format and prettify SQL queries with proper indentation |
| json_to_ddl | Generate CREATE TABLE DDL from a JSON sample record |
| analyze_csv | Analyze CSV data: row count, column types, null counts |
| generate_dbt_model | Generate a dbt model SQL file with source + deduplication CTE |
| validate_pipeline_config | Validate a data pipeline JSON config for required fields |
| generate_snowflake_table | Generate a Snowflake CREATE TABLE with optional CLUSTER BY |
| summarize_etl_lineage | Summarize ETL lineage JSON into a human-readable pipeline flow |
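
To make the last-but-one row concrete, here is a minimal sketch of the kind of statement a tool like generate_snowflake_table could emit. The function name build_snowflake_table, its parameters, and the output layout are assumptions for illustration, not the server's actual implementation.

```python
from typing import Optional, Sequence


def build_snowflake_table(
    table: str,
    columns: dict[str, str],
    cluster_by: Optional[Sequence[str]] = None,
) -> str:
    """Return a CREATE TABLE statement; CLUSTER BY is added only when keys are given."""
    # Hypothetical helper: `columns` maps column names to Snowflake type names.
    col_lines = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns.items())
    ddl = f"CREATE TABLE IF NOT EXISTS {table} (\n{col_lines}\n)"
    if cluster_by:
        ddl += f"\nCLUSTER BY ({', '.join(cluster_by)})"
    return ddl + ";"


ddl = build_snowflake_table(
    "analytics.orders",
    {"id": "NUMBER", "customer_id": "NUMBER", "amount": "FLOAT", "created_at": "TIMESTAMP_NTZ"},
    cluster_by=["created_at"],
)
print(ddl)
```

Leaving cluster_by unset yields a plain CREATE TABLE, which matches the "optional" wording in the table.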

Setup

1. Install dependencies

pip install -r requirements.txt

2. Run the MCP server

python -m src.server

3. Connect to Claude Desktop

Add this to your claude_desktop_config.json (%APPDATA%\Claude\ on Windows, ~/Library/Application Support/Claude/ on macOS), replacing the cwd value with the path to your own clone:

{
  "mcpServers": {
    "data-engineer": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "C:/Users/Nikhil/claude-mcp-data-engineer"
    }
  }
}
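
If claude_desktop_config.json already lists other servers, the new entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that merge in memory; the helper name add_server_entry and the "other-server" entry are made up for illustration, and reading or writing the real file is left out on purpose.

```python
import json


def add_server_entry(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of the config with one server added under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry  # replaces any existing entry with the same name
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
entry = {
    "command": "python",
    "args": ["-m", "src.server"],
    "cwd": "C:/Users/Nikhil/claude-mcp-data-engineer",
}
updated = add_server_entry(existing, "data-engineer", entry)
print(json.dumps(updated, indent=2))
```

The function copies the dictionaries it changes, so the original config object is left untouched if you want to compare before and after.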

Example Usage

Format SQL

Ask Claude:

"Format this SQL: select id,name from users where status='active' group by id"

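For a sense of what formatting this prompt involves, here is a deliberately small sketch that uppercases major clause keywords and starts each clause on its own line. It is an assumption about the general approach, not the server's format_sql code, and it does not protect string literals or identifiers that happen to contain a keyword.

```python
import re

# Clauses that start a new line in this sketch (an illustrative subset).
CLAUSES = ["select", "from", "where", "group by", "order by", "having", "limit"]


def format_sql(query: str) -> str:
    """Uppercase major clause keywords and put each clause on its own line."""
    sql = " ".join(query.split())        # collapse runs of whitespace
    sql = re.sub(r"\s*,\s*", ", ", sql)  # exactly one space after each comma
    for clause in CLAUSES:
        # Swallow the whitespace before the keyword so lines have no trailing spaces.
        pattern = r"\s*\b" + clause.replace(" ", r"\s+") + r"\b"
        sql = re.sub(pattern, "\n" + clause.upper(), sql, flags=re.IGNORECASE)
    return sql.strip()


formatted = format_sql("select id,name from users where status='active' group by id")
print(formatted)
```

A production formatter would tokenize the query first; the regex pass is only meant to show the shape of the output.
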
Generate DDL from JSON

Ask Claude:

"Generate a CREATE TABLE DDL for this JSON: {"id": 1, "name": "Nikhil", "salary": 95000.0, "joined": "2024-01-15"}"

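The idea behind this tool can be sketched as a mapping from JSON value types to SQL column types. The type mapping, the ISO-date check, and the function names below are assumptions chosen for illustration; the server's real json_to_ddl may infer types differently.

```python
import json
from datetime import date


def sql_type(value) -> str:
    """Map one JSON value to a generic SQL type (hypothetical mapping)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    if isinstance(value, str):
        try:
            date.fromisoformat(value)  # strings such as "2024-01-15" become DATE
            return "DATE"
        except ValueError:
            return "VARCHAR"
    return "VARCHAR"  # nulls and nested values fall back to text in this sketch


def json_to_ddl(sample: str, table: str = "my_table") -> str:
    """Build a CREATE TABLE statement from one JSON sample record."""
    record = json.loads(sample)
    cols = ",\n".join(f"    {key} {sql_type(val)}" for key, val in record.items())
    return f"CREATE TABLE {table} (\n{cols}\n);"


ddl = json_to_ddl(
    '{"id": 1, "name": "Nikhil", "salary": 95000.0, "joined": "2024-01-15"}',
    table="employees",
)
print(ddl)
```

A single sample record cannot reveal nullability or string lengths, so any DDL produced this way is a starting point to review rather than a final schema.
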
Analyze CSV

Ask Claude:

"Analyze this CSV data and tell me column types and null counts"

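The three metrics named in the prompt (row count, column types, null counts) can be computed with the standard library alone. The sketch below is one possible approach under simple assumptions: empty cells count as nulls, and a column is typed integer, float, or string based on whether all of its non-empty values parse. The function names and the sample data are illustrative only.

```python
import csv
import io


def infer_type(values: list[str]) -> str:
    """Return 'integer', 'float', or 'string' for a column's non-empty values."""
    if not values:
        return "unknown"
    for caster, label in ((int, "integer"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return label
        except ValueError:
            continue
    return "string"


def analyze_csv(text: str) -> dict:
    """Profile CSV text: total rows plus type and null count per column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    report = {"row_count": len(rows), "columns": {}}
    for col in reader.fieldnames or []:
        present = [row[col] for row in rows if row[col] not in (None, "")]
        report["columns"][col] = {
            "type": infer_type(present),
            "null_count": len(rows) - len(present),
        }
    return report


sample = "id,name,amount\n1,Ann,10.5\n2,,7\n3,Cy,\n"
report = analyze_csv(sample)
print(report)
```

Loading every row into memory keeps the sketch short; large files would call for a streaming pass instead.
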
Generate DBT Model

Ask Claude:

"Generate a dbt model for source 'raw', table 'orders', columns: id, customer_id, amount, created_at"

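A dbt model with a source reference and a deduplication CTE usually follows a recognizable pattern, sketched below as a small template function. The unique_key and order_by parameters, their defaults, and the CTE names are assumptions made for this example; the server's generate_dbt_model may choose different keys or layout.

```python
def generate_dbt_model(
    source: str,
    table: str,
    columns: list[str],
    unique_key: str = "id",          # assumed deduplication key
    order_by: str = "created_at",    # assumed recency column
) -> str:
    """Render dbt model SQL that keeps the latest row per unique_key."""
    source_ref = "{{ source('%s', '%s') }}" % (source, table)
    inner_cols = ",\n        ".join(columns)
    outer_cols = ",\n    ".join(columns)
    return (
        "with source as (\n"
        f"    select * from {source_ref}\n"
        "),\n\n"
        "deduplicated as (\n"
        "    select\n"
        f"        {inner_cols},\n"
        "        row_number() over (\n"
        f"            partition by {unique_key} order by {order_by} desc\n"
        "        ) as row_num\n"
        "    from source\n"
        ")\n\n"
        "select\n"
        f"    {outer_cols}\n"
        "from deduplicated\n"
        "where row_num = 1\n"
    )


model = generate_dbt_model("raw", "orders", ["id", "customer_id", "amount", "created_at"])
print(model)
```

The row_number() filter keeps one row per key, so the choice of order_by decides which duplicate survives.
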
Validate Pipeline Config

Ask Claude:

"Validate this pipeline config: {"source": {"type": "snowflake"}, "destination": {"type": "s3"}, "schedule": "0 2 * * *"}"

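Validation of this kind can be sketched as a list of checks that each append a message on failure. The required fields (source, destination, schedule), the "type" key check, and the five-field cron rule are inferred from the example prompt and are assumptions, not a description of the rules validate_pipeline_config actually enforces.

```python
import json

# Assumed required top-level fields, taken from the example config above.
REQUIRED_FIELDS = ("source", "destination", "schedule")


def validate_pipeline_config(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the config passed."""
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(config, dict):
        return ["config must be a JSON object"]

    errors = [f"missing required field: {name}" for name in REQUIRED_FIELDS if name not in config]
    for section in ("source", "destination"):
        value = config.get(section)
        if section in config and not (isinstance(value, dict) and "type" in value):
            errors.append(f"{section} must be an object with a 'type' key")
    schedule = config.get("schedule")
    if "schedule" in config and not (isinstance(schedule, str) and len(schedule.split()) == 5):
        errors.append("schedule must be a 5-field cron expression")
    return errors


good = '{"source": {"type": "snowflake"}, "destination": {"type": "s3"}, "schedule": "0 2 * * *"}'
print(validate_pipeline_config(good))
print(validate_pipeline_config('{"source": {"type": "snowflake"}}'))
```

Collecting every problem before returning lets a caller fix a config in one pass instead of one error at a time.
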
Tech Stack

  • Python 3.10+
  • MCP SDK (mcp[cli])
  • FastMCP for server definition

Author

Nikhil E, Sr. Data Engineer | BI Architect
GitHub: itsnikhile

Frequently Asked Questions

What are the key features of Claude MCP Data Engineer?

  • SQL query formatting and prettification
  • Automated dbt model generation with deduplication CTEs
  • Snowflake DDL generation with clustering support
  • CSV data profiling, including row counts and null analysis
  • ETL pipeline configuration validation and lineage summarization

What can I use Claude MCP Data Engineer for?

  • Generating boilerplate dbt models from raw source definitions
  • Validating JSON pipeline configurations before deployment
  • Standardizing SQL query formatting across team documentation
  • Profiling CSV files to identify data quality issues such as nulls
  • Documenting ETL lineage flows from raw JSON metadata

How do I install Claude MCP Data Engineer?

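Clone the repository, then install its dependencies from the cloned folder by running: pip install -r requirements.txt. After that, start the server with python -m src.server or register it with your MCP client.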

What MCP clients work with Claude MCP Data Engineer?

Claude MCP Data Engineer works with any MCP-compatible client, including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
