OpenAI Token Manager MCP Server
A Model Context Protocol (MCP) server that provides intelligent OpenAI API token usage management with automatic model switching capabilities.
Features
- Automatic Model Switching: Automatically switches between model tiers (gpt-4o → gpt-4o-mini) when token limits are reached
- Daily Token Tracking: Tracks token usage per model with daily reset functionality
- Token Estimation: Estimate token usage before making API calls
- Progress Tracking: Resume processing from where you left off
- Configurable Limits: Customizable token limits and model tiers
- Comprehensive Logging: Detailed logging for debugging and monitoring
Installation
- Clone or download this repository
- Install the package:
pip install -e .
Quick Start
Using with Claude Desktop
Add this server to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"openai-token-manager": {
"command": "python",
"args": ["-m", "openai_token_manager_mcp.server"],
"env": {
"OPENAI_API_KEY": "your-openai-api-key-here"
}
}
}
}
Using Programmatically
# Run the MCP server
python -m openai_token_manager_mcp.server
Available Tools
`initialize_token_manager`
Initialize the token manager with a specific project directory.
Parameters:
project_dir(string): Path to store token usage data
`get_token_status`
Get current token usage status and model information.
Returns: JSON with current model, usage statistics, and available models.
`estimate_tokens`
Estimate token usage for given prompts before making API calls.
Parameters:
system_prompt(string): The system promptuser_prompt(string): The user promptmodel(string, optional): Model to estimate for
`call_openai_with_management`
Call OpenAI API with automatic token management and model switching.
Parameters:
system_prompt(string): The system promptuser_prompt(string): The user promptresponse_format(string, optional): "json" for JSON response formattimeout(integer, optional): Request timeout in seconds (default: 45)force_model(string, optional): Force specific model (bypasses automatic switching)dry_run(boolean, optional): Simulate without making actual API call
`switch_model`
Manually switch to the next available model tier.
`reset_daily_usage`
Reset daily token usage counters.
Configuration
Model Tiers
The default configuration includes:
MODEL_TIERS = [
{"name": "gpt-4o", "max_tokens": 250_000, "stop_at": 240_000},
{"name": "gpt-4o-mini", "max_tokens": 2_500_000, "stop_at": 2_450_000}
]
You can modify these in the server.py file to match your needs.
Environment Variables
OPENAI_API_KEY: Your OpenAI API key (required)
Example Usage
Basic Token Management
# Initialize for a specific project
await call_tool("initialize_token_manager", {"project_dir": "/path/to/project"})
# Check current status
status = await call_tool("get_token_status", {})
# Estimate tokens before calling
estimate = await call_tool("estimate_tokens", {
"system_prompt": "You are a helpful assistant.",
"user_prompt": "What is the weather like?"
})
# Make managed API call
response = await call_tool("call_openai_with_management", {
"system_prompt": "You are a helpful assistant.",
"user_prompt": "Explain quantum computing in simple terms.",
"response_format": "json"
})
Advanced Usage
# Force a specific model
response = await call_tool("call_openai_with_management", {
"system_prompt": "You are a helpful assistant.",
"user_prompt": "Write a short story.",
"force_model": "gpt-4o",
"timeout": 60
})
# Dry run to test without API calls
dry_response = await call_tool("call_openai_with_management", {
"system_prompt": "You are a helpful assistant.",
"user_prompt": "Analyze this data.",
"dry_run": True
})
# Manually switch models
await call_tool("switch_model", {})
# Reset usage for new day
await call_tool("reset_daily_usage", {})
File Structure
When initialized, the token manager creates the following structure:
project_directory/
├── project_state/
│ └── token_usage.json # Token usage tracking
├── project_logs/
│ └── token_manager.log # Detailed logs
└── project_output/ # For any output files
Error Handling
The server includes comprehensive error handling:
- Rate Limiting: Automatic retry with exponential backoff
- Model Exhaustion: Graceful handling when all model tiers are exhausted
- API Errors: Detailed logging and error messages
- File Operations: Safe file handling with proper error reporting
Roadmap & Future Updates
Planned Features
- **Multi-Provid
Tools 6
initialize_token_managerInitialize the token manager with a specific project directory.get_token_statusGet current token usage status and model information.estimate_tokensEstimate token usage for given prompts before making API calls.call_openai_with_managementCall OpenAI API with automatic token management and model switching.switch_modelManually switch to the next available model tier.reset_daily_usageReset daily token usage counters.Environment Variables
OPENAI_API_KEYrequiredYour OpenAI API key