Blacksmith CI MCP Server

Connect Claude to your Blacksmith CI data to query workflow runs and metrics.

Blacksmith MCP

An MCP server that connects Claude to your Blacksmith CI data. Query workflow runs, analyze test failures, detect flaky tests, and monitor usage—all through natural conversation.

Why?

Debugging CI failures usually means clicking through dashboards, copying run IDs, and piecing together information across multiple pages. With this MCP, you can just ask:

  • "Why did the last CI run fail?"
  • "Which tests are flaky this week?"
  • "Compare test failures between main and my PR"
  • "What's using the most cache storage?"

Claude handles the API calls and gives you actionable insights.

Quick Start

Zero-config if you're logged into Blacksmith in Chrome:

# Add to Claude Code
claude mcp add blacksmith -- npx blacksmith-mcp

# Set your org (add this line to your shell profile to persist it across sessions)
export BLACKSMITH_ORG="your-org-name"

The MCP automatically extracts your session from Chrome cookies. No manual token copying needed.

Installation

Option 1: Claude Code CLI

claude mcp add blacksmith -- npx blacksmith-mcp

Option 2: Project Configuration

Add to your .mcp.json:

{
  "mcpServers": {
    "blacksmith": {
      "type": "stdio",
      "command": "npx",
      "args": ["blacksmith-mcp"],
      "env": {
        "BLACKSMITH_ORG": "your-org-name"
      }
    }
  }
}

Option 3: Global Install

npm install -g blacksmith-mcp

Configuration

Authentication

Automatic (recommended): Log into app.blacksmith.sh in Chrome. The MCP extracts your session cookie automatically.

Manual: Set the BLACKSMITH_SESSION_COOKIE environment variable to your session cookie value.

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| BLACKSMITH_ORG | Yes | Your Blacksmith organization name |
| BLACKSMITH_SESSION_COOKIE | No | Session cookie (auto-extracted from Chrome if not set) |
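
The precedence implied by the two variables (an explicit BLACKSMITH_SESSION_COOKIE is used when set, otherwise the cookie comes from Chrome) could be sketched roughly as below. This is an illustration under stated assumptions, not the server's actual code: `resolveConfig`, `BlacksmithConfig`, and the `extractFromChrome` callback are hypothetical names, and the Chrome extraction step itself is left as a stub supplied by the caller.

```typescript
// Hypothetical sketch: none of these names come from blacksmith-mcp itself.
interface BlacksmithConfig {
  org: string;
  sessionCookie: string;
  cookieSource: "env" | "chrome";
}

type Env = Record<string, string | undefined>;

function resolveConfig(
  env: Env,
  extractFromChrome: () => string | undefined, // stand-in for the real cookie lookup
): BlacksmithConfig {
  // BLACKSMITH_ORG is the only required variable.
  const org = env["BLACKSMITH_ORG"]?.trim();
  if (!org) {
    throw new Error("BLACKSMITH_ORG is required");
  }

  // An explicitly provided cookie takes precedence over auto-extraction.
  const fromEnv = env["BLACKSMITH_SESSION_COOKIE"]?.trim();
  if (fromEnv) {
    return { org, sessionCookie: fromEnv, cookieSource: "env" };
  }

  // Otherwise fall back to the browser session.
  const fromChrome = extractFromChrome();
  if (!fromChrome) {
    throw new Error(
      "No session cookie found: log into app.blacksmith.sh in Chrome or set BLACKSMITH_SESSION_COOKIE",
    );
  }
  return { org, sessionCookie: fromChrome, cookieSource: "chrome" };
}
```

Passing the environment and the extractor in as arguments keeps the sketch easy to test without touching a real browser profile.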

Available Tools

Workflow Runs

| Tool | Description |
| --- | --- |
| list_runs | List workflow runs with filters (status, branch, workflow, actor, PR) |
| get_run | Get run details including all jobs |
| list_jobs | List jobs for a workflow run |
| get_job | Get job details (steps, timing, runner info) |
| get_job_logs | Get raw log output for a job |

Test Analytics

| Tool | Description |
| --- | --- |
| get_job_tests | Get all test results for a job |
| get_failed_tests | Get failed tests with full error messages |
| get_failures_by_pattern | Group failures by error pattern (e.g., "Cannot read properties") |
| compare_test_runs | Compare failures between two runs (find regressions) |
| get_flaky_tests | Detect tests that fail intermittently |
| get_slow_tests | Find tests exceeding duration threshold |
| get_test_history | Track a specific test's pass/fail history |
| get_trends | Analyze trends: duration, failure rate, test count |

Usage & Billing

| Tool | Description |
| --- | --- |
| get_current_usage | Current core usage (active vs max) |
| get_invoice_amount | Current billing period amount |
| get_usage_summary | Billable minutes vs free tier |
| get_cache_stats | Cache size, entries by repository |
| get_cache_entries | Detailed cache entries for a repo |

Organization

| Tool | Description |
| --- | --- |
| list_orgs | List accessible organizations |
| get_org_status | Org status (personal, onboarded, region) |
| search_logs | Search logs across all jobs |

Example Conversations

Debug a Failed Run

You: Why did the last CI run fail?

Claude: Uses list_runs(status="failure", limit=1) then get_failed_tests

The last run failed with 34 test failures. The main issues are:

  • Connection timeout (12 tests) - Tests in api/ are timing out connecting to the test database
  • Undefined property (8 tests) - user.settings.theme is undefined in the new dark mode code
  • Assertion failures (14 tests) - Snapshot mismatches in UI components
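
The grouping in the answer above resembles what get_failures_by_pattern is described as doing. One plausible way to bucket failures by error pattern is sketched here. The `FailedTest` shape, the `PATTERNS` list, and the regular expressions are all assumptions made for illustration; the tool's real matching rules are not documented in this README.

```typescript
// Hypothetical shape of a failed test; the real tool output may differ.
interface FailedTest {
  name: string;
  error: string;
}

// Illustrative pattern list (label, matcher). The first match wins.
const PATTERNS: Array<[string, RegExp]> = [
  ["Connection timeout", /timed? ?out|ETIMEDOUT/i],
  ["Undefined property", /Cannot read properties of undefined/i],
  ["Snapshot mismatch", /snapshot/i],
];

function groupFailuresByPattern(failures: FailedTest[]): Map<string, FailedTest[]> {
  const groups = new Map<string, FailedTest[]>();
  for (const failure of failures) {
    // Find the first pattern whose regex matches the error message.
    const match = PATTERNS.find(([, re]) => re.test(failure.error));
    const label = match ? match[0] : "Other";
    const bucket = groups.get(label) ?? [];
    bucket.push(failure);
    groups.set(label, bucket);
  }
  return groups;
}
```

Failures that match no pattern land in an "Other" bucket, so nothing is silently dropped from the summary.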

Find Flaky Tests

You: Which tests are flaky?

Claude: Uses get_flaky_tests(days=7, min_runs=5)

Found 3 flaky tests:

| Test | Flake Rate | Pattern |
| --- | --- | --- |
| should handle concurrent requests | 40% (4/10 runs) | Fails on CI, passes locally |
| renders loading state | 20% (2/10 runs) | Race condition with setTimeout |
| validates auth token | 30% (3/10 runs) | Intermittent network timeout |
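
The flake rates above are failures divided by runs. A minimal sketch of that arithmetic follows, assuming a test counts as flaky when it has both passes and failures across at least `minRuns` runs. That definition, along with the `TestResult` and `FlakyTest` shapes and the `findFlakyTests` name, is an assumption; the README does not specify how get_flaky_tests decides.

```typescript
// Hypothetical input: one record per test execution in the time window.
interface TestResult {
  test: string;
  passed: boolean;
}

interface FlakyTest {
  test: string;
  failures: number;
  runs: number;
  flakeRate: number; // failures / runs, between 0 and 1 exclusive
}

function findFlakyTests(results: TestResult[], minRuns: number): FlakyTest[] {
  // Tally runs and failures per test name.
  const counts = new Map<string, { failures: number; runs: number }>();
  for (const r of results) {
    const c = counts.get(r.test) ?? { failures: 0, runs: 0 };
    c.runs += 1;
    if (!r.passed) c.failures += 1;
    counts.set(r.test, c);
  }

  const flaky: FlakyTest[] = [];
  counts.forEach((c, test) => {
    // Intermittent means it failed sometimes, but not every time.
    const intermittent = c.failures > 0 && c.failures < c.runs;
    if (c.runs >= minRuns && intermittent) {
      flaky.push({ test, failures: c.failures, runs: c.runs, flakeRate: c.failures / c.runs });
    }
  });

  // Highest flake rate first.
  return flaky.sort((a, b) => b.flakeRate - a.flakeRate);
}
```

Under this definition a test that fails on every run is treated as broken rather than flaky, and tests with too few runs are ignored to avoid noisy rates.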

Compare Branches

You: Did my PR introduce new test failures?

Claude: Uses compare_test_runs with PR run vs base branch
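
Conceptually, finding regressions comes down to a set difference between the failed tests of the two runs. The sketch below shows that idea only; `compareFailures` and `RunComparison` are hypothetical names, and the actual output of compare_test_runs may be structured differently.

```typescript
// Hypothetical result shape for comparing a PR run against its base branch run.
interface RunComparison {
  newFailures: string[]; // fail in the PR run but not in the base run (regressions)
  fixed: string[]; // fail in the base run but not in the PR run
  stillFailing: string[]; // fail in both runs
}

function compareFailures(baseFailed: string[], prFailed: string[]): RunComparison {
  const base = new Set(baseFailed);
  const pr = new Set(prFailed);
  return {
    newFailures: prFailed.filter((t) => !base.has(t)),
    fixed: baseFailed.filter((t) => !pr.has(t)),
    stillFailing: prFailed.filter((t) => base.has(t)),
  };
}
```

Separating new failures from tests that were already failing on the base branch is what lets the question "Did my PR introduce new test failures?" be answered directly.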

Tools (21)

list_runs: List workflow runs with filters like status, branch, workflow, actor, or PR.
get_run: Get detailed information about a specific workflow run including all jobs.
list_jobs: List all jobs associated with a specific workflow run.
get_job: Get job details including steps, timing, and runner information.
get_job_logs: Retrieve the raw log output for a specific job.
get_job_tests: Get all test results associated with a specific job.
get_failed_tests: Get failed tests with full error messages for a job.
get_failures_by_pattern: Group test failures by error pattern to identify common issues.
compare_test_runs: Compare test failures between two different runs to find regressions.
get_flaky_tests: Detect tests that fail intermittently across multiple runs.
get_slow_tests: Find tests that exceed a specific duration threshold.
get_test_history: Track the pass/fail history of a specific test over time.
get_trends: Analyze trends for duration, failure rate, and test count.
get_current_usage: Get current core usage including active vs maximum cores.
get_invoice_amount: Retrieve the billing amount for the current period.
get_usage_summary: View billable minutes versus free tier usage.
get_cache_stats: Get cache size and entry counts by repository.
get_cache_entries: Get detailed cache entries for a specific repository.
list_orgs: List all Blacksmith organizations accessible to the user.
get_org_status: Get organization status including onboarding and region info.
search_logs: Search for specific strings across all job logs.

Environment Variables

BLACKSMITH_ORG (required): Your Blacksmith organization name
BLACKSMITH_SESSION_COOKIE (optional): Session cookie (auto-extracted from Chrome if not set)

Configuration

claude_desktop_config.json

{
  "mcpServers": {
    "blacksmith": {
      "command": "npx",
      "args": ["blacksmith-mcp"],
      "env": {
        "BLACKSMITH_ORG": "your-org-name"
      }
    }
  }
}

Try it

Why did the last CI run fail?
Which tests are flaky this week?
Compare test failures between main and my PR
What's using the most cache storage?
Did my PR introduce new test failures?

Frequently Asked Questions

What are the key features of Blacksmith CI?

  • Automatic session extraction from Chrome cookies for zero-config authentication.
  • Deep CI/CD analytics including job logs, step timing, and runner info.
  • Advanced test failure analysis with pattern grouping and regression detection.
  • Usage and billing monitoring for core usage, cache stats, and invoice amounts.
  • Cross-job log searching to find specific errors across the entire organization.

What can I use Blacksmith CI for?

  • Debugging CI failures without manually clicking through dashboard pages.
  • Identifying and tracking flaky tests that intermittently break the build.
  • Comparing PR test results against the base branch to identify regressions.
  • Monitoring infrastructure costs and cache efficiency across repositories.
  • Analyzing test suite performance to find and optimize slow-running tests.

How do I install Blacksmith CI?

Install Blacksmith CI by running: claude mcp add blacksmith -- npx blacksmith-mcp

What MCP clients work with Blacksmith CI?

Blacksmith CI works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
