Volt HQ MCP Server

The compute price oracle for AI agents.

What it does

  • Compares pricing across 8 providers (OpenAI, Anthropic, Groq, Together AI, DeepInfra, Fireworks AI, Hyperbolic, Akash) — 106+ offerings with live API pricing
  • Recommends optimal routing — tells your agent where to get the same quality for less, with savings estimates
  • Tracks spend and budgets — spending summaries by provider/model, savings reports, and threshold alerts
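As a rough illustration of the routing idea, the sketch below picks the cheapest offering whose quality score is at least as high as the one currently in use, and estimates the savings. It is not Volt HQ's actual algorithm: the `Offering` type and the `recommend_route` function are hypothetical names, and the prices and quality scores are borrowed from the Example section of this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    provider: str
    avg_price: float  # USD per million tokens
    quality: int      # quality score in percent


def recommend_route(offerings, current):
    """Return the cheapest offering whose quality is at least the current
    one's, plus the estimated savings as a fraction of the current price."""
    # Hypothetical rule: never trade quality away for price.
    candidates = [o for o in offerings if o.quality >= current.quality]
    best = min(candidates + [current], key=lambda o: o.avg_price)
    savings = 1 - best.avg_price / current.avg_price
    return best, savings


# Figures taken from the Example section of this README.
offerings = [
    Offering("DeepInfra", 0.24, 88),
    Offering("Hyperbolic (FP8)", 0.40, 85),
    Offering("Groq", 0.69, 88),
    Offering("Together AI", 0.88, 88),
]
current = offerings[3]  # pretend the agent currently calls Together AI
best, savings = recommend_route(offerings, current)
print(f"Route to {best.provider}: about {savings:.0%} cheaper")
```

With these inputs the sketch routes to DeepInfra. The lower-quality Hyperbolic FP8 offering is filtered out before prices are compared, which is one plausible reading of "same quality for less".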

Install

Auto-configure Cursor and Claude Desktop in one command:

npx volthq-mcp-server --setup

The command detects installed clients and merges the Volt HQ entry into each client's config without overwriting your existing MCP servers.
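To make "merges without overwriting" concrete, here is a minimal sketch of a non-destructive merge. It is not the code behind `--setup`. The `merge_mcp_config` and `update_config_file` helpers and the `other-server` entry are hypothetical, and only the `volthq` entry itself comes from the manual setup snippets in this README.

```python
import json
from pathlib import Path

# The server entry shown in the manual setup snippets.
VOLT_ENTRY = {"command": "npx", "args": ["-y", "volthq-mcp-server"]}


def merge_mcp_config(existing: dict) -> dict:
    """Return a copy of an MCP client config with the volthq entry added,
    leaving every other server entry untouched."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers["volthq"] = VOLT_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at `path` (if present), merge, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_mcp_config(existing), indent=2) + "\n")


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
merged = merge_mcp_config(existing)
print(json.dumps(merged, indent=2))
```

Copying the `mcpServers` mapping before adding the new key keeps the input untouched, so a failed write cannot leave a half-edited config in memory.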

Manual setup

Cursor — add to .cursor/mcp.json:

{
  "mcpServers": {
    "volthq": {
      "command": "npx",
      "args": ["-y", "volthq-mcp-server"]
    }
  }
}

Claude Desktop — add to claude_desktop_config.json:

{
  "mcpServers": {
    "volthq": {
      "command": "npx",
      "args": ["-y", "volthq-mcp-server"]
    }
  }
}

Tools

Tool                     Description
volt_check_price         Compare pricing across providers for a model
volt_recommend_route     Get optimal provider recommendation with savings estimate
volt_get_spend           Spending summary by provider and model (today/7d/30d)
volt_get_savings         Actual spend vs optimized spend comparison
volt_set_budget_alert    Set daily/weekly/monthly budget threshold alerts
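MCP clients normally issue tool calls for you, but it can help to see what goes over the wire. The sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP defines for invoking a tool. The `build_tool_call` helper is a hypothetical name, and the `model` argument is the only parameter this README documents; argument names for the other tools are not shown here, so none are assumed.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Same call as the example in this README.
request = build_tool_call(1, "volt_check_price", {"model": "llama-70b"})
print(request)
```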

Example

> volt_check_price { "model": "llama-70b" }

Price comparison for "llama-70b" — 8 offerings found
────────────────────────────────────────────────────────────
1. DeepInfra — Llama-70B
   Input: $0.20/M tokens | Output: $0.27/M tokens | Avg: $0.24/M
   Quality: 88% | Region: global

2. Hyperbolic — Llama-70B (FP8) on H100-SXM
   Input: $0.40/M tokens | Output: $0.40/M tokens | Avg: $0.40/M
   Quality: 85% | Region: global

3. Hyperbolic — Llama-70B (BF16) on H100-SXM
   Input: $0.55/M tokens | Output: $0.55/M tokens | Avg: $0.55/M
   Quality: 88% | Region: global

4. Groq — Llama-70B
   Input: $0.59/M tokens | Output: $0.79/M tokens | Avg: $0.69/M
   Quality: 88% | Region: global

5. Fireworks AI — Llama-70B
   Input: $0.70/M tokens | Output: $0.70/M tokens | Avg: $0.70/M
   Quality: 88% | Region: global

6. Together AI — Llama-70B
   Input: $0.88/M tokens | Output: $0.88/M tokens | Avg: $0.88/M
   Quality: 88% | Region: global

7. Akash — Llama-70B (FP8) on H100-SXM
   Input: $3.49/M tokens | Output: $8.72/M tokens | Avg: $6.11/M
   Quality: 85% | Region: global

8. Akash — Llama-70B (FP8) on A100-80GB
   Input: $5.24/M tokens | Output: $13.11/M tokens | Avg: $9.18/M
   Quality: 85% | Region: global

Cheapest is 97% less than most expensive option.

DeepInfra at $0.24/M, Hyperbolic at $0.40/M, Groq at $0.69/M, and Fireworks AI at $0.70/M all come in far below GPT-4o at $6.25/M.
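The figures in this output can be checked by hand. The "Avg" column is consistent with a simple mean of the input and output prices, and the 97% figure with comparing the cheapest and most expensive averages. That reading is inferred from the numbers, not documented, and the half-up rounding below is likewise an assumption. A short sketch:

```python
from decimal import Decimal, ROUND_HALF_UP


def avg_price(input_price: str, output_price: str) -> Decimal:
    """Mean of input and output price per million tokens, rounded to cents."""
    mean = (Decimal(input_price) + Decimal(output_price)) / 2
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def percent_cheaper(cheapest: Decimal, most_expensive: Decimal) -> int:
    """How much less the cheapest option costs, as a whole percentage."""
    ratio = 1 - cheapest / most_expensive
    return int((ratio * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


deepinfra = avg_price("0.20", "0.27")    # 0.235 rounds to 0.24
akash_a100 = avg_price("5.24", "13.11")  # 9.175 rounds to 9.18
print(deepinfra, akash_a100, percent_cheaper(deepinfra, akash_a100))
# prints: 0.24 9.18 97
```

`Decimal` is used instead of `float` so that values such as 0.235 round to 0.24 as a person would expect, rather than depending on binary floating-point representation.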

Supported providers

  • OpenAI — GPT-4o, GPT-4o-mini
  • Anthropic — Claude Sonnet 4.6, Claude Haiku 4.5
  • Groq — Llama-70B, Llama-8B, Mixtral-8x7B
  • Together AI — Llama-70B, Llama-8B, DeepSeek-V3
  • DeepInfra — 75+ models with live API pricing (Llama, DeepSeek, Qwen, Mistral, Gemma, and more)
  • Fireworks AI — Llama-70B, Llama-8B, DeepSeek-R1
  • Hyperbolic — DeepSeek-V3, DeepSeek-R1, Llama-70B, Llama-8B
  • Akash — Llama-70B, Llama-8B on H100 and A100 (live GPU pricing)

License

MIT

Try it

  • Compare the current pricing for llama-70b across all supported providers.
  • Which provider offers the best price for running Claude Sonnet 4.6 right now?
  • Show me a summary of my AI compute spending over the last 30 days.
  • How much money could I have saved on my recent model usage by using optimal routing?
  • Set a budget alert for my daily AI compute spend at $5.00.
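The spending and budget prompts above boil down to two operations: grouping cost by provider and model, and comparing a running total against a threshold. The sketch below shows both under stated assumptions. The record layout, the `summarize_spend` and `budget_exceeded` helpers, and the sample costs are all hypothetical; only the $5.00 daily threshold comes from the last prompt. How Volt HQ actually stores usage is not described in this README.

```python
from collections import defaultdict


def summarize_spend(records):
    """Total USD spend per (provider, model) pair."""
    totals = defaultdict(float)
    for provider, model, cost in records:
        totals[(provider, model)] += cost
    return dict(totals)


def budget_exceeded(records, threshold):
    """True once the combined spend reaches the threshold."""
    return sum(cost for _, _, cost in records) >= threshold


# Hypothetical usage records for one day: (provider, model, cost in USD).
records = [
    ("Groq", "llama-70b", 1.50),
    ("DeepInfra", "llama-70b", 0.75),
    ("Groq", "llama-70b", 3.00),
]
summary = summarize_spend(records)
print(summary)
print("alert:", budget_exceeded(records, threshold=5.00))
```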

Frequently Asked Questions

What are the key features of Volt HQ?

  • Compares live API pricing across 8 providers including OpenAI, Anthropic, and Hyperbolic
  • Provides optimal routing recommendations to reduce compute costs by up to 80%
  • Tracks spending summaries by provider and model
  • Generates savings reports comparing actual spend vs optimized spend
  • Configures threshold alerts for daily, weekly, or monthly budgets

What can I use Volt HQ for?

Typical uses include:

  • Minimizing API costs for high-volume LLM applications
  • Letting AI agents dynamically select the cheapest provider for a specific model
  • Tracking and auditing a team's monthly AI infrastructure expenditure
  • Evaluating the cost-efficiency of different GPU providers such as Akash or Hyperbolic

How do I install Volt HQ?

Install Volt HQ by running: npx volthq-mcp-server --setup

What MCP clients work with Volt HQ?

Volt HQ works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
