Blacksmith MCP
An MCP server that connects Claude to your Blacksmith CI data. Query workflow runs, analyze test failures, detect flaky tests, and monitor usage—all through natural conversation.
Why?
Debugging CI failures usually means clicking through dashboards, copying run IDs, and piecing together information across multiple pages. With this MCP, you can just ask:
- "Why did the last CI run fail?"
- "Which tests are flaky this week?"
- "Compare test failures between main and my PR"
- "What's using the most cache storage?"
Claude handles the API calls and gives you actionable insights.
Quick Start
Zero-config if you're logged into Blacksmith in Chrome:
# Add to Claude Code
claude mcp add blacksmith -- npx blacksmith-mcp
# Set your org (add this line to your shell profile to persist it across sessions)
export BLACKSMITH_ORG="your-org-name"
The MCP automatically extracts your session from Chrome cookies. No manual token copying needed.
Installation
Option 1: Claude Code CLI
claude mcp add blacksmith -- npx blacksmith-mcp
Option 2: Project Configuration
Add to your .mcp.json:
{
"mcpServers": {
"blacksmith": {
"type": "stdio",
"command": "npx",
"args": ["blacksmith-mcp"],
"env": {
"BLACKSMITH_ORG": "your-org-name"
}
}
}
}
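If a project already has an .mcp.json that lists other servers, the blacksmith entry needs to be merged in rather than pasted over the file. The following is a minimal Python sketch of that merge, assuming the file layout shown above. The helper name add_blacksmith_server is hypothetical and is not part of blacksmith-mcp.

```python
import json
from pathlib import Path


def add_blacksmith_server(config_path, org):
    """Merge a blacksmith entry into an .mcp.json file, keeping other servers.

    Hypothetical helper for illustration; not part of blacksmith-mcp.
    """
    path = Path(config_path)
    # Start from the existing config if there is one, else an empty document.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the JSON above, with the org filled in.
    servers["blacksmith"] = {
        "type": "stdio",
        "command": "npx",
        "args": ["blacksmith-mcp"],
        "env": {"BLACKSMITH_ORG": org},
    }
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling add_blacksmith_server(".mcp.json", "your-org-name") from the project root would create the file if it is missing and leave any other configured servers untouched.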
Option 3: Global Install
npm install -g blacksmith-mcp
Configuration
Authentication
Automatic (recommended): Log in to app.blacksmith.sh in Chrome. The MCP extracts your session cookie automatically.
Manual: Set the BLACKSMITH_SESSION_COOKIE environment variable to your session cookie value.
Environment Variables
| Variable | Required | Description |
|---|---|---|
| BLACKSMITH_ORG | Yes | Your Blacksmith organization name |
| BLACKSMITH_SESSION_COOKIE | No | Session cookie (auto-extracted from Chrome if not set) |
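The two variables play different roles: the org is always needed, while the cookie is only a manual override. A Python sketch of that resolution order follows, assuming an unset cookie means "fall back to Chrome extraction". The name resolve_settings is hypothetical, and the actual blacksmith-mcp internals may differ.

```python
import os


def resolve_settings(env=None):
    """Return (org, cookie) read from environment variables.

    cookie is None when BLACKSMITH_SESSION_COOKIE is unset or empty, which
    is the case where the server would try Chrome extraction instead.
    Hypothetical helper; not the actual blacksmith-mcp code.
    """
    env = os.environ if env is None else env
    org = env.get("BLACKSMITH_ORG", "").strip()
    if not org:
        raise ValueError("BLACKSMITH_ORG is required")
    # Treat an empty string the same as an unset variable.
    cookie = env.get("BLACKSMITH_SESSION_COOKIE") or None
    return org, cookie
```

Passing a plain dict instead of reading os.environ directly keeps the function easy to check in isolation.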
Available Tools
Workflow Runs
| Tool | Description |
|---|---|
| list_runs | List workflow runs with filters (status, branch, workflow, actor, PR) |
| get_run | Get run details including all jobs |
| list_jobs | List jobs for a workflow run |
| get_job | Get job details (steps, timing, runner info) |
| get_job_logs | Get raw log output for a job |
Test Analytics
| Tool | Description |
|---|---|
| get_job_tests | Get all test results for a job |
| get_failed_tests | Get failed tests with full error messages |
| get_failures_by_pattern | Group failures by error pattern (e.g., "Cannot read properties") |
| compare_test_runs | Compare failures between two runs (find regressions) |
| get_flaky_tests | Detect tests that fail intermittently |
| get_slow_tests | Find tests exceeding duration threshold |
| get_test_history | Track a specific test's pass/fail history |
| get_trends | Analyze trends: duration, failure rate, test count |
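To illustrate what grouping failures by error pattern means, here is a small Python sketch. It assumes each failure is a (test name, error message) pair and treats a pattern as a plain substring. The real get_failures_by_pattern tool may match differently, and the function name and sample data below are made up for illustration.

```python
from collections import defaultdict


def group_failures_by_pattern(failures, patterns):
    """Bucket (test_name, error_message) pairs under the first matching pattern.

    Failures matching no pattern are collected under "other".
    Illustrative only; not the server's implementation.
    """
    groups = defaultdict(list)
    for test_name, message in failures:
        for pattern in patterns:
            if pattern in message:
                groups[pattern].append(test_name)
                break
        else:
            # No pattern matched this failure.
            groups["other"].append(test_name)
    return dict(groups)


# Made-up sample data.
failures = [
    ("loads profile", "TypeError: Cannot read properties of undefined (reading 'theme')"),
    ("saves settings", "TypeError: Cannot read properties of undefined (reading 'id')"),
    ("fetches users", "Error: connect ETIMEDOUT"),
]
grouped = group_failures_by_pattern(failures, ["Cannot read properties", "ETIMEDOUT"])
print(grouped)
```

Grouping this way turns a long list of individual failures into a few root causes, which is usually the faster route to a fix.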
Usage & Billing
| Tool | Description |
|---|---|
| get_current_usage | Current core usage (active vs max) |
| get_invoice_amount | Current billing period amount |
| get_usage_summary | Billable minutes vs free tier |
| get_cache_stats | Cache size, entries by repository |
| get_cache_entries | Detailed cache entries for a repo |
Organization
| Tool | Description |
|---|---|
| list_orgs | List accessible organizations |
| get_org_status | Org status (personal, onboarded, region) |
| search_logs | Search logs across all jobs |
Example Conversations
Debug a Failed Run
You: Why did the last CI run fail?
Claude: Uses list_runs(status="failure", limit=1) then get_failed_tests

The last run failed with 34 test failures. The main issues are:
- Connection timeout (12 tests) - Tests in api/ are timing out connecting to the test database
- Undefined property (8 tests) - user.settings.theme is undefined in the new dark mode code
- Assertion failures (14 tests) - Snapshot mismatches in UI components
Find Flaky Tests
You: Which tests are flaky?
Claude: Uses get_flaky_tests(days=7, min_runs=5)

Found 3 flaky tests:

| Test | Flake Rate | Pattern |
|---|---|---|
| should handle concurrent requests | 40% (4/10 runs) | Fails on CI, passes locally |
| renders loading state | 20% (2/10 runs) | Race condition with setTimeout |
| validates auth token | 30% (3/10 runs) | Intermittent network timeout |
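The flake rate reported in this example is failures divided by total runs. A Python sketch of that calculation follows, assuming a history of pass/fail booleans per test. The function name and the "mixed results" criterion are assumptions for illustration, not the documented behavior of get_flaky_tests.

```python
def find_flaky_tests(history, min_runs=5):
    """Return {test_name: flake_rate} for tests with mixed results.

    history maps a test name to a list of booleans (True = passed).
    A test counts as flaky here if it has at least min_runs runs and
    both passed and failed at least once. Illustrative assumption only.
    """
    flaky = {}
    for name, results in history.items():
        runs = len(results)
        failures = results.count(False)
        if runs >= min_runs and 0 < failures < runs:
            flaky[name] = failures / runs
    return flaky


# Made-up history: 4 failures in 10 runs gives a 40% flake rate.
history = {
    "should handle concurrent requests": [False] * 4 + [True] * 6,
    "always passes": [True] * 10,
    "always fails": [False] * 10,
}
flaky = find_flaky_tests(history)
```

Tests that always fail are excluded on purpose: they are broken rather than flaky, and belong in a failed-tests report instead.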
Compare Branches
You: Did my PR introduce new test failures?
Claude: Uses compare_test_runs with the PR run vs. the base branch
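Finding regressions between two runs comes down to set differences over the failing test names. A Python sketch follows, assuming each run has been reduced to the list of its failing tests. The function name and return shape are hypothetical rather than the output format of compare_test_runs.

```python
def compare_failures(base_failed, pr_failed):
    """Split failures into new, fixed, and shared between two runs."""
    base, pr = set(base_failed), set(pr_failed)
    return {
        "new_failures": sorted(pr - base),    # fail on the PR only: likely regressions
        "fixed": sorted(base - pr),           # fail on the base branch only
        "still_failing": sorted(base & pr),   # fail on both: pre-existing
    }


# Made-up test names.
result = compare_failures(
    base_failed=["validates auth token", "renders loading state"],
    pr_failed=["renders loading state", "applies dark mode theme"],
)
```

Only the new_failures bucket answers the question "did my PR introduce new test failures?"; the other two buckets separate pre-existing breakage from fixes.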