# graph-tool-call

Graph-based tool retrieval for LLM agents to optimize workflow execution.
LLM agents can't fit thousands of tool definitions into context. Vector search finds similar tools, but misses the workflow they belong to. graph-tool-call builds a tool graph and retrieves the right chain — not just one match.
| | Baseline (all tools) | graph-tool-call |
|---|---|---|
| 248 tools (K8s API) | 12% accuracy | 82% accuracy |
| 1,068 tools (full GitHub API) | impossible (context overflow) | 78% Recall@5 |
| Token usage | 8,192 tokens | 1,699 tokens (79% reduction) |
| Latency (no embedding) | — | 2.7 ms avg |
*Measured with qwen3:4b (4-bit); full benchmark below.*
## The Problem
LLM agents need tools. But as tool count grows, two things break:
- Context overflow — 248 Kubernetes API endpoints = 8,192 tokens of tool definitions. The LLM chokes and accuracy drops to 12%.
- Vector search misses workflows — searching "cancel my order" finds `cancelOrder`, but the actual flow is `listOrders → getOrder → cancelOrder → processRefund`. Vector search returns one tool; you need the chain.
graph-tool-call solves both. It models tool relationships as a graph, retrieves multi-step workflows via hybrid search (BM25 + graph traversal + embedding + MCP annotations), and cuts token usage by 64–91% while maintaining or improving accuracy.
## At a Glance
| What you get | How |
|---|---|
| Workflow-aware retrieval | Graph edges encode PRECEDES, REQUIRES, COMPLEMENTARY relations |
| Hybrid search | BM25 + graph traversal + embedding + MCP annotations, fused via wRRF |
| Zero dependencies | Core runs on Python stdlib only — add extras as needed |
| Any tool source | Auto-ingest from OpenAPI / Swagger / MCP / Python functions |
| History-aware | Previously called tools are demoted; next-step tools are boosted |
| MCP Proxy | 172 tools across servers → 3 meta-tools, saving ~1,200 tokens/turn |
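The history-aware row above can be sketched as a simple rerank pass over retrieval scores. This is an illustration only, not the library's actual API — the function name `rerank_with_history` and the multiplier values are made up for the example:

```python
def rerank_with_history(scored, called, precedes, demote=0.5, boost=1.5):
    """Demote tools that were already called; boost tools that a
    called tool PRECEDES (likely next steps in the workflow).

    scored:   dict mapping tool name -> retrieval score
    called:   list of tool names already invoked this session
    precedes: dict mapping tool name -> list of tools it PRECEDES
    (All names and factors here are illustrative.)
    """
    next_steps = {nxt for t in called for nxt in precedes.get(t, [])}
    adjusted = {}
    for tool, score in scored.items():
        if tool in called:
            score *= demote      # already used: push it down
        if tool in next_steps:
            score *= boost       # plausible next step: pull it up
        adjusted[tool] = score
    return sorted(adjusted, key=adjusted.get, reverse=True)
```

After `listOrders` has been called, `cancelOrder` (a tool that `listOrders` PRECEDES) outranks `listOrders` even if its raw retrieval score was lower.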
## Why Not Just Vector Search?
| Scenario | Vector-only | graph-tool-call |
|---|---|---|
| "cancel my order" | Returns `cancelOrder` | `listOrders → getOrder → cancelOrder → processRefund` |
| "read and save file" | Returns `read_file` | `read_file` + `write_file` (COMPLEMENTARY relation) |
| "delete old records" | Returns any tool matching "delete" | Destructive tools ranked first via MCP annotations |
| "now cancel it" (after listing orders) | No context from history | Demotes used tools, boosts next-step tools |
| Multiple Swagger specs with overlapping tools | Duplicate tools in results | Cross-source auto-deduplication |
| 1,200 API endpoints | Slow, noisy results | Categorized + graph traversal for precise retrieval |
## How It Works

```
OpenAPI / MCP / Python functions → Ingest → Build tool graph → Hybrid retrieve → Agent
```
Example: User says "cancel my order and process a refund"
Vector search finds `cancelOrder`. But the actual workflow is:

```
              ┌──────────┐
     PRECEDES │listOrders│ PRECEDES
    ┌─────────┤          ├──────────┐
    ▼         └──────────┘          ▼
┌──────────┐                 ┌───────────┐
│ getOrder │                 │cancelOrder│
└──────────┘                 └─────┬─────┘
                                   │ COMPLEMENTARY
                                   ▼
                           ┌──────────────┐
                           │processRefund │
                           └──────────────┘
```
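In spirit, the graph expansion is a short breadth-first walk over relation edges from the seed match. A toy sketch using the tools from the diagram — the real graph model is richer, and the edge table and `expand` helper below are invented for illustration:

```python
from collections import deque

# Toy tool graph: (source, relation) -> target tools.
# Relations mirror the diagram; names come from the order example.
EDGES = {
    ("listOrders", "PRECEDES"): ["getOrder", "cancelOrder"],
    ("cancelOrder", "COMPLEMENTARY"): ["processRefund"],
}

def expand(seed, max_hops=2):
    """Collect tools reachable from `seed` within `max_hops`,
    walking edges in both directions so predecessors of the
    seed (e.g. listOrders before cancelOrder) are found too."""
    adj = {}  # undirected adjacency view for traversal
    for (src, _rel), dsts in EDGES.items():
        for dst in dsts:
            adj.setdefault(src, set()).add(dst)
            adj.setdefault(dst, set()).add(src)
    seen, queue = {seed}, deque([(seed, 0)])
    while queue:
        tool, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nxt in adj.get(tool, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen
```

Seeding from the lone vector-search hit recovers the whole chain: `expand("cancelOrder")` reaches `listOrders`, `getOrder`, and `processRefund` as well.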
graph-tool-call returns the entire chain, not just one tool. Retrieval combines four signals via weighted Reciprocal Rank Fusion (wRRF):
- BM25 — keyword matching
- Graph traversal — relation-based expansion (PRECEDES, REQUIRES, COMPLEMENTARY)
- Embedding similarity — semantic search (optional, any provider)
- MCP annotations — read-only / destructive / idempotent hints
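The fusion step itself is standard weighted Reciprocal Rank Fusion: each signal contributes `w / (k + rank)` for every tool it ranked, and the sums decide the final order. A minimal sketch — the weights and the `k = 60` constant are illustrative defaults, not the library's actual values:

```python
def wrrf(rankings, weights, k=60):
    """Weighted Reciprocal Rank Fusion.

    rankings: dict mapping signal name -> ordered list of tool names
    weights:  dict mapping signal name -> weight (default 1.0)
    Each signal adds w / (k + rank) to every tool it ranked.
    """
    scores = {}
    for signal, ranked in rankings.items():
        w = weights.get(signal, 1.0)
        for rank, tool in enumerate(ranked, start=1):
            scores[tool] = scores.get(tool, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Two signals disagree; the weighted fusion settles the order.
fused = wrrf(
    {"bm25": ["cancelOrder", "listOrders"],
     "graph": ["listOrders", "getOrder", "cancelOrder"]},
    weights={"bm25": 1.0, "graph": 2.0},
)
```

Because rank positions (not raw scores) feed the formula, wRRF can fuse signals with incomparable score scales — BM25 scores, cosine similarities, and graph hop counts never have to be normalized against each other.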
## Installation
The core package has zero dependencies — just Python standard library. Install only what you need:
```shell
pip install graph-tool-call  # core (BM25 + graph) — no dependencies
```
## Tools (2)

- `retrieve_tools` — Retrieves a chain of tools based on user intent using hybrid search and graph traversal.
- `proxy_tools` — Aggregates multiple MCP servers into a unified interface using meta-tools.

## Configuration
```json
{
  "mcpServers": {
    "graph-tool-call": {
      "command": "python",
      "args": ["-m", "graph_tool_call.mcp"]
    }
  }
}
```