CUA MCP Server

Add it to Claude Code

Run this in a terminal.

claude mcp add -e "CUA_API_KEY=${CUA_API_KEY}" cua-mcp-server -- npx -y mcp-remote https://cua-mcp-server.vercel.app/mcp
Required: CUA_API_KEY

Delegate desktop automation tasks to an autonomous vision-based agent.

An agentic Model Context Protocol (MCP) server for CUA Cloud - delegate desktop automation tasks to an autonomous vision-based agent. Images never leave the server; only text summaries are returned.

Production URL: https://cua-mcp-server.vercel.app/mcp

What is CUA?

CUA (Computer Use Agent) provides cloud-based virtual machine sandboxes that AI agents can control. This MCP server exposes CUA's capabilities through a clean task-delegation API:

  • Create and manage VMs (Linux, Windows, macOS)
  • Delegate tasks - "Open Chrome and navigate to google.com"
  • Get text summaries - No images in your context window
  • Query screen state - Vision-based descriptions without taking action

Architecture

Claude Code (Orchestrator)
    │
    │ run_task("Open Chrome and go to google.com")
    ▼
┌─────────────────────────────────────────────────────────────┐
│  CUA MCP Server (Agentic)                                   │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  Internal Agent Loop                                  │  │
│  │  1. screenshot() → CUA sandbox                        │  │
│  │  2. screenshot → Claude API (computer_use tool)       │  │
│  │  3. Claude returns: click(x,y) / type("text") / done  │  │
│  │  4. Execute action on sandbox                         │  │
│  │  5. Loop until complete                               │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
{ success: true, summary: "Opened Chrome...", steps_taken: 5 }
(TEXT ONLY - no images)
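
The loop in the diagram can be sketched in TypeScript roughly as follows. This is an illustration only, not the server's actual code: the Sandbox and VisionModel interfaces, the AgentAction type, and the maxSteps cap are hypothetical names introduced for this sketch.

```typescript
// Hypothetical types for illustration; the real server's types may differ.
type AgentAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done"; summary: string };

interface Sandbox {
  screenshot(): Promise<string>; // e.g. a base64-encoded image
  execute(action: AgentAction): Promise<void>;
}

interface VisionModel {
  // Given the task and the latest screenshot, decide the next action.
  nextAction(task: string, screenshot: string): Promise<AgentAction>;
}

interface TaskResult {
  success: boolean;
  summary: string;
  steps_taken: number;
}

async function runAgentLoop(
  task: string,
  sandbox: Sandbox,
  model: VisionModel,
  maxSteps = 20, // assumed safety cap, not taken from the README
): Promise<TaskResult> {
  for (let step = 1; step <= maxSteps; step++) {
    const screenshot = await sandbox.screenshot(); // 1. capture the screen
    const action = await model.nextAction(task, screenshot); // 2-3. ask the model
    if (action.kind === "done") {
      // Only text leaves the loop; the screenshot is never returned.
      return { success: true, summary: action.summary, steps_taken: step - 1 };
    }
    await sandbox.execute(action); // 4. act on the sandbox
  }
  // 5. The loop ended without a "done" action.
  return { success: false, summary: "Step limit reached", steps_taken: maxSteps };
}
```

Keeping the screenshot as a local variable inside the loop is what makes the text-only guarantee easy to reason about: nothing in TaskResult can carry image data.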

Project Structure

api/mcp.ts                     # MCP protocol handler
lib/
├── agent/                     # Modular agent architecture
│   ├── index.ts               # Public exports
│   ├── types.ts               # Type definitions
│   ├── config.ts              # Model configurations
│   ├── validation.ts          # Coordinate validation helpers
│   ├── execute.ts             # Main agent loop
│   ├── describe.ts            # Screen description
│   ├── progress.ts            # Progress tracking
│   ├── utils.ts               # Utilities (sleep, generateTaskId)
│   └── actions/               # Action handler registry (16 handlers)
├── cua-client.ts              # CUA Cloud API client
└── tool-schemas.ts            # MCP tool definitions
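
The README does not show the contents of actions/ or validation.ts. One plausible shape, offered purely as an assumption, is a map from action name to handler combined with a bounds check on coordinates. Every name below (isValidCoordinate, handlers, dispatch) is hypothetical.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the project's code.
interface ScreenSize {
  width: number;
  height: number;
}

// Reject coordinates that are non-integer or fall outside the screen.
function isValidCoordinate(x: number, y: number, screen: ScreenSize): boolean {
  return (
    Number.isInteger(x) &&
    Number.isInteger(y) &&
    x >= 0 &&
    y >= 0 &&
    x < screen.width &&
    y < screen.height
  );
}

type ActionInput = Record<string, unknown>;
type ActionHandler = (input: ActionInput, screen: ScreenSize) => string;

// Registry keyed by action name; a real registry would hold more handlers.
const handlers: Record<string, ActionHandler> = {
  click: (input, screen) => {
    const x = Number(input.x);
    const y = Number(input.y);
    if (!isValidCoordinate(x, y, screen)) {
      throw new Error(`Coordinate (${x}, ${y}) is outside the screen`);
    }
    return `click at (${x}, ${y})`;
  },
  type: (input) => `type "${String(input.text ?? "")}"`,
};

// Look up the handler by name and fail loudly on unknown actions.
function dispatch(name: string, input: ActionInput, screen: ScreenSize): string {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown action: ${name}`);
  return handler(input, screen);
}
```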

Available Tools (9 total)

Sandbox Management (5 tools)

Tool              Description
list_sandboxes    List all CUA cloud sandboxes with their current status
get_sandbox       Get details of a specific sandbox including API URLs
start_sandbox     Start a stopped sandbox
stop_sandbox      Stop a running sandbox
restart_sandbox   Restart a sandbox

Note: Create and delete sandboxes via the CUA Dashboard - the Cloud API doesn't expose these operations.
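
MCP clients invoke tools like these with a JSON-RPC 2.0 tools/call request. The helper below sketches that request shape for the sandbox tools. The buildToolCall function is a hypothetical name, and the example passes no arguments because the README does not list parameter names.

```typescript
type SandboxTool =
  | "list_sandboxes"
  | "get_sandbox"
  | "start_sandbox"
  | "stop_sandbox"
  | "restart_sandbox";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: SandboxTool; arguments: Record<string, unknown> };
}

// Build the JSON-RPC envelope an MCP client would send for a tool call.
function buildToolCall(
  id: number,
  name: SandboxTool,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const listRequest = buildToolCall(1, "list_sandboxes");
console.log(JSON.stringify(listRequest));
```

In practice Claude Code and mcp-remote build and send these messages for you; the sketch is only meant to show what travels over the wire.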

Agentic Tools (4 tools)

Tool                Description
describe_screen     Get a text description of current screen state using vision AI. No actions taken.
run_task            Execute a computer task autonomously. Returns immediately with task_id for polling.
get_task_progress   Poll progress of running tasks. Returns current step, last action, and reasoning.
get_task_history    Retrieve results of a previously executed task by ID.
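
Because run_task returns a task_id right away, a caller would typically start the task, poll get_task_progress, then fetch the final result. The sketch below shows one way to do that under several assumptions: callTool is a hypothetical client function, the task_id argument key is guessed, and the status field used to detect completion is not documented in this README.

```typescript
// Hypothetical client function: sends a tool call and resolves with the parsed result.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runTaskAndWait(
  callTool: CallTool,
  task: string,
  intervalMs = 2000,
  maxPolls = 60,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<Record<string, unknown>> {
  // run_task returns immediately with a task_id.
  const started = await callTool("run_task", { task });
  const taskId = started.task_id;

  for (let poll = 0; poll < maxPolls; poll++) {
    const progress = await callTool("get_task_progress", { task_id: taskId });
    // ASSUMPTION: a `status` field that stops being "running" when the task ends.
    if (progress.status !== "running") {
      return callTool("get_task_history", { task_id: taskId });
    }
    await sleep(intervalMs);
  }
  throw new Error(`Task ${String(taskId)} still running after ${maxPolls} polls`);
}
```

The maxPolls cap keeps a stuck task from blocking the caller forever; the interval and cap values are arbitrary defaults for the sketch.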

Quick Start

1. Get a CUA API Key

  1. Go to cua.ai/signin
  2. Navigate to Dashboard > API Keys > New API Key
  3. Copy your API key (starts with sk_cua-api01_...)
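
Since keys start with a known prefix, a quick sanity check can catch a mis-pasted value before it reaches the server. This is a convenience sketch only: looksLikeCuaApiKey is a hypothetical helper, and it checks the prefix mentioned above, not whether the key is actually valid.

```typescript
const CUA_KEY_PREFIX = "sk_cua-api01_";

// True when the value has the documented prefix followed by at least one character.
function looksLikeCuaApiKey(value: string | undefined): boolean {
  return (
    typeof value === "string" &&
    value.startsWith(CUA_KEY_PREFIX) &&
    value.length > CUA_KEY_PREFIX.length
  );
}
```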

2. Configure Claude Code

Add to your ~/.claude.json:

{
  "mcpServers": {
    "cua": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://cua-mcp-server.vercel.app/mcp"]
    }
  }
}
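
If your ~/.claude.json already defines other MCP servers, the cua entry needs to be merged in, not pasted over them. A minimal sketch of that merge, assuming the file has been parsed into an object; the ClaudeConfig type and withCuaServer function are hypothetical names:

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// The same entry shown in the JSON above.
const CUA_ENTRY: McpServerEntry = {
  command: "npx",
  args: ["-y", "mcp-remote", "https://cua-mcp-server.vercel.app/mcp"],
};

// Return a copy of the config with the cua server added, leaving other keys intact.
function withCuaServer(config: ClaudeConfig): ClaudeConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), cua: CUA_ENTRY },
  };
}
```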

3. Use with Claude Code

You: "List my CUA sandboxes"
Claude: [Uses list_sandboxes tool]

You: "Start my-sandbox"
Claude: [Uses start_sandbox tool]

You: "Open Firefox and go to google.com on my-sandbox"
Claude: [Uses run_task with task="Open Firefox and navigate to google.com"]
→ Returns: { success: true, summary: "Opened Firefox, navigated to google.com", steps_taken: 4 }

You: "What's currently on the screen?"
Claude: [Uses describe_screen tool]
→ Returns: { description: "Firefox browser showing Google homepage with search box..." }

Usage Examples

Automate a Web Task

You: "On my-sandbox, open Chrome, go to github.com, and search for 'mcp server'"

Claude uses run_task:
- task: "Open Chrome browser, navigate to github.com, find the search box, type 'mcp server' and press Enter"
- Returns summary of what happened (no screenshots in your context)

Check Screen State

You: "What's on the screen right now?"

Claude uses describe_screen:
- focus: "ui" (or "text" or "full")
- Returns text description of UI elements, buttons, text content
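
The three focus values map naturally onto a string union. The sketch below shows one way to normalize user input into a valid focus; parseFocus is a hypothetical helper, and falling back to "ui" for unknown values is an assumption, since the README does not state the default.

```typescript
type ScreenFocus = "ui" | "text" | "full";

const FOCUS_VALUES: readonly ScreenFocus[] = ["ui", "text", "full"];

// Return the value if it is a known focus, otherwise the fallback.
function parseFocus(value: string, fallback: ScreenFocus = "ui"): ScreenFocus {
  return (FOCUS_VALUES as readonly string[]).includes(value)
    ? (value as ScreenFocus)
    : fallback;
}
```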

Ask Specific Questions

Yo

Tools (9)

list_sandboxes      List all CUA cloud sandboxes with their current status
get_sandbox         Get details of a specific sandbox including API URLs
start_sandbox       Start a stopped sandbox
stop_sandbox        Stop a running sandbox
restart_sandbox     Restart a sandbox
describe_screen     Get a text description of current screen state using vision AI
run_task            Execute a computer task autonomously
get_task_progress   Poll progress of running tasks
get_task_history    Retrieve results of a previously executed task by ID

Environment Variables

CUA_API_KEY (required): API key for authenticating with CUA Cloud services

Configuration

claude_desktop_config.json
{"mcpServers": {"cua": {"command": "npx", "args": ["-y", "mcp-remote", "https://cua-mcp-server.vercel.app/mcp"]}}}

Try it

List my available CUA sandboxes.
Start my-sandbox and open Firefox to navigate to github.com.
What is currently on the screen of my-sandbox?
Check the progress of my last automation task.
Search for 'mcp server' on Google using the Chrome browser in my-sandbox.

Frequently Asked Questions

What are the key features of CUA MCP Server?

  • Autonomous desktop automation via vision-based agents
  • Cloud-based virtual machine sandbox management
  • Text-only screen summaries to keep context windows clean
  • Support for Linux, Windows, and macOS environments
  • Asynchronous task execution with polling capabilities

What can I use CUA MCP Server for?

  • Automating repetitive web-based data entry tasks
  • Testing software across different OS environments in the cloud
  • Retrieving information from legacy desktop applications without direct API access
  • Monitoring UI states of remote applications via AI-generated descriptions

How do I install CUA MCP Server?

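The server is hosted remotely, so there is nothing to install locally. Your MCP client connects to it through the mcp-remote proxy, launched with: npx -y mcp-remote https://cua-mcp-server.vercel.app/mcp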

What MCP clients work with CUA MCP Server?

CUA MCP Server works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
