LLMKit MCP Server


Add it to Claude Code

Run this in a terminal:

claude mcp add -e "LLMKIT_API_KEY=${LLMKIT_API_KEY}" llmkit -- npx -y @f3d1/llmkit-mcp

Required: LLMKIT_API_KEY

LLMKit

Know exactly what your AI agents cost.


Open-source API gateway that sits between your app and AI providers. Every request is logged with token counts and dollar costs. Budget limits reject over-budget requests before they reach the provider, not after the money is spent.

Why LLMKit

Most cost tracking tools give you "soft limits" that agents blow past in the first hour. LLMKit runs cost estimation before every request: if the request would exceed the budget, it is rejected before reaching the provider. Budgets can be scoped per key or per session.
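The pre-request check boils down to simple arithmetic: estimate the worst-case cost from the input tokens and the request's output-token cap, and reject if it would push spend past the budget. A minimal sketch of that idea (illustrative only, not LLMKit's code; the per-token rates below are OpenAI's published gpt-4o prices):

```python
# Sketch of pre-request budget enforcement (illustrative, not LLMKit's code).
# Rates are per token: gpt-4o is $2.50/M input, $10.00/M output.
PRICING = {"gpt-4o": {"input": 2.50 / 1_000_000, "output": 10.00 / 1_000_000}}

def would_exceed_budget(model: str, input_tokens: int, max_output_tokens: int,
                        spent_usd: float, budget_usd: float) -> bool:
    """Estimate the worst-case cost of a request before sending it."""
    p = PRICING[model]
    estimate = input_tokens * p["input"] + max_output_tokens * p["output"]
    return spent_usd + estimate > budget_usd

# A request that could push spend past the budget is rejected up front.
print(would_exceed_budget("gpt-4o", 1_000, 4_096, spent_usd=0.99, budget_usd=1.00))  # True
```

The key design point is using the output-token cap rather than the (unknowable) actual output length, so the check can run before the provider is called.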

Tag requests with a session ID or end-user ID to track costs per agent, per conversation, per user. The dashboard and MCP server surface this data in real time. Cost anomaly detection alerts when a single request costs 3x the recent median.
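The "3x the recent median" rule described above can be sketched in a few lines; this is an illustration of the heuristic, not LLMKit's actual detector:

```python
# Sketch of median-based cost anomaly detection (illustrative only):
# flag a request when it costs more than `factor` x the median of prior requests.
from statistics import median

def flag_anomalies(costs: list[float], factor: float = 3.0) -> list[bool]:
    """Return a flag per request; the first request can never be anomalous."""
    flags = []
    for i, cost in enumerate(costs):
        prior = costs[:i]
        flags.append(bool(prior) and cost > factor * median(prior))
    return flags

print(flag_anomalies([0.01, 0.012, 0.009, 0.05]))  # [False, False, False, True]
```

Comparing against the median (not the mean) keeps one expensive outlier from masking the next one.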

11 providers through one interface: Anthropic, OpenAI, Google Gemini, Groq, Together, Fireworks, DeepSeek, Mistral, xAI, Ollama, OpenRouter. Fallback chains with one header (x-llmkit-fallback: anthropic,openai,gemini).
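Since the fallback chain is just a header, it can ride along on any HTTP request to the proxy. A sketch of assembling such a request (the endpoint path and OpenAI-style payload shape are assumptions; only the x-llmkit-fallback header is documented above):

```python
# Sketch: attaching a fallback chain via the x-llmkit-fallback header.
# The /v1/chat/completions path and payload shape are assumed OpenAI-compatible.
import json

def build_request(model: str, prompt: str, fallback: list[str]) -> dict:
    return {
        "url": "https://llmkit-proxy.smigolsmigol.workers.dev/v1/chat/completions",
        "headers": {
            "Authorization": "Bearer llmk_your_key_here",
            "x-llmkit-fallback": ",".join(fallback),  # tried in order on failure
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

req = build_request("claude-sonnet-4-20250514", "hello",
                    ["anthropic", "openai", "gemini"])
print(req["headers"]["x-llmkit-fallback"])  # anthropic,openai,gemini
```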

Runs on Cloudflare Workers at the edge. Cache-aware pricing for Anthropic, DeepSeek, and Fireworks prompt caching. 45+ models priced. Open source, MIT licensed.
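"Cache-aware" pricing means cached prompt tokens are billed at different rates than fresh input. A sketch of the arithmetic using Anthropic's published multipliers (cache writes bill at 1.25x the base input rate, cache reads at 0.10x); this illustrates the math, not LLMKit's implementation:

```python
# Illustrative cache-aware cost math for Anthropic prompt caching.
# Published Anthropic multipliers: cache writes 1.25x base input rate,
# cache reads 0.10x; output tokens are billed normally.
def cached_request_cost(input_rate: float, output_rate: float,
                        input_toks: int, cache_write_toks: int,
                        cache_read_toks: int, output_toks: int) -> float:
    return (input_toks * input_rate
            + cache_write_toks * input_rate * 1.25
            + cache_read_toks * input_rate * 0.10
            + output_toks * output_rate)

# 10k cached prompt tokens read at $3/M input, $15/M output rates:
print(round(cached_request_cost(3e-6, 15e-6, 500, 0, 10_000, 300), 6))  # 0.009
```

Ignoring the cache discount here would overstate the prompt cost roughly 10x, which is why a gateway that logs costs has to price cache reads separately.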

How it works

flowchart TD
    A["Your app"] --> B["LLMKit Proxy"]
    B --> C["AI Provider"]
    C --> B
    B --> D["Supabase"]
    D --> E["Dashboard"]
    D --> F["MCP Server"]

Each request is authenticated, checked against the budget, routed to a provider (with fallback), then logged with tokens and costs; the budget is updated and an alert fires when spend crosses 80% of it.
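The 80% alert in the last step only needs to fire once, at the moment cumulative spend crosses the threshold. A minimal sketch of that edge-trigger check (illustrative, not LLMKit's code):

```python
# Sketch: fire the alert only when this request pushes cumulative spend
# across the threshold (80% of budget by default), not on every request after.
def alert_threshold_crossed(spent_before: float, request_cost: float,
                            budget: float, threshold: float = 0.80) -> bool:
    return spent_before / budget < threshold <= (spent_before + request_cost) / budget

print(alert_threshold_crossed(0.79, 0.02, 1.00))  # True: crossed 80% just now
print(alert_threshold_crossed(0.85, 0.02, 1.00))  # False: already past it
```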

Get started

  1. Create an account at llmkit-dashboard.vercel.app (free while in beta)
  2. Create an API key in the Keys tab
  3. Use it: pick any method below

CLI

Wrap any command. The CLI intercepts OpenAI and Anthropic API calls, forwards them through the proxy, and prints a cost summary when the process exits. No code changes.

npx @f3d1/llmkit-cli -- python my_agent.py
LLMKit Cost Summary
---
Total: $0.0215 (3 requests, 4.2s)

By model:
  claude-sonnet-4-20250514  1 req   $0.0156
  gpt-4o                    2 reqs  $0.0059

Works with Python, Ruby, Go, Rust, anything that calls the OpenAI or Anthropic API. Use -v for per-request costs as they happen, --json for machine-readable output.

Python

pip install llmkit-sdk

Two ways to track costs:

With the proxy (budget enforcement, logging, dashboard):

from openai import OpenAI

client = OpenAI(
    base_url="https://llmkit-proxy.smigolsmigol.workers.dev/v1",
    api_key="llmk_your_key_here",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)

Without the proxy (local cost estimation, zero setup):

from llmkit import tracked
from openai import OpenAI

client = OpenAI(http_client=tracked())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
# costs estimated locally from bundled pricing table

tracked() wraps your HTTP client and estimates costs from token usage. No proxy needed. Works with any SDK that accepts http_client. See the SDK docs for all options.

TypeScript

npm install @f3d1/llmkit-sdk
import { LLMKit } from '@f3d1/llmkit-sdk'

const kit = new LLMKit({ apiKey: process.env.LLMKIT_KEY })
const agent = kit.session()

const res = await agent.chat({
  provider: 'anthropic',
  model: 'claude-sonnet-4-20250514',
  messages: [{ role: 'user', content: 'summarize this document' }],
})

console.log(res.content)
console.log(res.cost)   // { inputCost: 0.003, outputCost: 0.015, totalCost: 0.018, currency: 'USD' }

Streaming, CostTracker (local cost tracking without the proxy), and a Vercel AI SDK provider are also available. See the package README for details.

Tools (2)

get_usage_stats: Retrieve usage statistics and cost data for AI requests.
get_budget_status: Check current budget limits and remaining balance.

Environment Variables

LLMKIT_API_KEY (required): API key generated from the LLMKit dashboard.

Configuration

claude_desktop_config.json
{
  "mcpServers": {
    "llmkit": {
      "command": "npx",
      "args": ["-y", "@f3d1/llmkit-mcp"],
      "env": {
        "LLMKIT_API_KEY": "your_key_here"
      }
    }
  }
}

Try it

What is my current AI spending for this month?
Check if I have exceeded my budget for the current session.
Show me the cost breakdown for my recent AI agent requests.
Are there any cost anomalies in my recent AI usage?

Frequently Asked Questions

What are the key features of LLMKit?

Real-time cost tracking across 11 AI providers. Pre-request budget enforcement to prevent overspending. Cost anomaly detection for individual requests. Session-based and user-based cost attribution. Support for Anthropic, OpenAI, Google Gemini, and more.

What can I use LLMKit for?

Preventing runaway costs during long-running AI agent tasks. Monitoring budget consumption for team-shared API keys. Analyzing cost-per-conversation for customer support bots. Detecting unexpected spikes in token usage or costs.

How do I install LLMKit?

Install LLMKit by running: npx -y @f3d1/llmkit-mcp

What MCP clients work with LLMKit?

LLMKit works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
