PyAutoGUI Multinode MCP Server

$ git clone https://github.com/stonehill-2345/mcp-autogui-multinode.git && cd mcp-autogui-multinode && uv sync --group gui

An MCP and HTTP server wrapper for PyAutoGUI for mouse and keyboard control.

About

An MCP and HTTP server wrapper for PyAutoGUI, enabling LLMs to control your mouse and keyboard.

Architecture

The service supports two deployment architectures:

Architecture 1: LLM -> MCP -> TOOL (Remote Tool Service)

This architecture separates the MCP server from the tool service, allowing the MCP server to connect to a remote tool service via HTTP.

graph LR
    LLM[LLM Client] -->|MCP Protocol| MCP[MCP Server<br/>main.py<br/>Client-based Tools]
    MCP -->|HTTP API<br/>with API Key| TOOLA[Tool Service<br/>tool.py<br/>HTTP API Server]
    MCP -->|HTTP API<br/>with API Key| TOOLB[Tool Service<br/>tool.py<br/>HTTP API Server]

    TOOLA -->|PyAutoGUI| COMPUTERA[Computer Control]
    TOOLB -->|PyAutoGUI| COMPUTERB[Computer Control]
    style LLM fill:#e1f5ff
    style MCP fill:#fff4e1
    style TOOLA fill:#ffe1f5
    style COMPUTERA fill:#e1ffe1
    style COMPUTERB fill:#e1ffe1

Characteristics:

  • MCP server uses client-based tools (register_computer_tools_with_client)
  • MCP server forwards requests to remote tool service via HTTP
  • Tool service performs actual computer control operations
  • Suitable for distributed deployments where MCP server and tool service run on different machines
  • Requires endpoint parameter in MCP tool calls
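
The forwarding step in this architecture can be sketched as below. This is a minimal illustration under stated assumptions, not the project's actual client code: the `X-API-Key` header name, the JSON body shape, the host names, port 8000, and the helper name `build_action_request` are all hypothetical. Only the `/api/computer/{action}` path comes from this README.

```python
import json
import urllib.request


def build_action_request(endpoint, action, params, api_key=None):
    """Build (but do not send) the HTTP request for one tool-service node.

    `endpoint` is the base URL of the target node, supplied per tool call.
    """
    url = f"{endpoint.rstrip('/')}/api/computer/{action}"
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumed header name; check the tool service's configuration.
        headers["X-API-Key"] = api_key
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# One MCP server, two nodes: the endpoint decides which machine is controlled.
req_a = build_action_request("http://node-a:8000", "click", {"x": 10, "y": 20}, "secret")
req_b = build_action_request("http://node-b:8000/", "click", {"x": 10, "y": 20})
```

Passing either request to `urllib.request.urlopen` would send it. Keeping the build step separate from the send step makes the routing logic easy to inspect without a running tool service.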

Architecture 2: LLM -> MCP (Direct Tools)

This architecture uses direct tools where the MCP server directly performs computer control operations.

graph LR
    LLM[LLM Client] -->|MCP Protocol<br/>stdio/http| MCP[MCP Server<br/>mcp_local.py<br/>Direct Tools]
    MCP -->|PyAutoGUI| COMPUTER[Computer Control]
    
    style LLM fill:#e1f5ff
    style MCP fill:#fff4e1
    style COMPUTER fill:#e1ffe1

Characteristics:

  • MCP server uses direct tools (register_computer_tools)
  • MCP server directly executes computer control operations
  • No separate tool service required
  • Suitable for local deployments where everything runs on the same machine
  • No endpoint parameter needed in MCP tool calls
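
A direct tool can be pictured as a thin dispatcher over PyAutoGUI. The sketch below is an assumption about the general shape, not the project's `register_computer_tools` implementation: the action names on the left of the mapping are hypothetical, while the PyAutoGUI function names on the right (`moveTo`, `click`, `write`, `scroll`) are real. A recording stand-in replaces the `pyautogui` module so the example runs without a display.

```python
# Hypothetical action names mapped to real PyAutoGUI function names.
ACTIONS = {
    "move": "moveTo",    # pyautogui.moveTo(x, y)
    "click": "click",    # pyautogui.click(x, y)
    "type": "write",     # pyautogui.write(message)
    "scroll": "scroll",  # pyautogui.scroll(clicks)
}


def run_action(gui, action, **params):
    """Dispatch one action to a PyAutoGUI-like object passed in as `gui`."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return getattr(gui, ACTIONS[action])(**params)


class RecordingGui:
    """Stand-in for the pyautogui module that records calls instead of acting."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the GUI functions.
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


gui = RecordingGui()
run_action(gui, "move", x=100, y=200)
run_action(gui, "type", message="Hello World")
```

On a machine with a display, passing the real module (`import pyautogui; run_action(pyautogui, "click", x=10, y=20)`) would perform the action locally, with no `endpoint` involved.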

Features

  • 🚀 Dual Protocol Support: HTTP REST API and MCP (Model Context Protocol)
  • 🔐 API Key Authentication: Optional API key authentication for service-to-service communication
  • 🌐 Multiple MCP Transports: Support for both HTTP and stdio (Standard Input/Output) transport modes
  • 🖱️ Mouse Control: Move, click, drag, scroll operations
  • ⌨️ Keyboard Control: Press keys, type text, key combinations
  • 📸 Screenshot: Capture screen and get base64-encoded images
  • 📊 Screen Info: Get cursor position and screen resolution
  • ⚙️ Configuration Management: Pydantic Settings with environment variable support
  • 📝 Auto Documentation: Swagger UI for HTTP API
  • 🔧 Flexible Deployment: Run HTTP server or MCP server independently
  • 📋 Request Tracing: Request ID middleware for request tracking
  • 📝 Structured Logging: Loguru-based logging with request ID integration
  • 🔌 Remote MCP Support: Optional HTTP client for remote tool server integration
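
The "optional" part of API key authentication can be illustrated with a small check. This is a sketch of one reasonable approach, not the project's actual middleware: the function name `is_authorized` is hypothetical, and the constant-time comparison via `hmac.compare_digest` is a choice made here rather than something the README states.

```python
import hmac


def is_authorized(provided_key, configured_key):
    """Return True when the request may proceed.

    Authentication is optional: with no configured key, every request passes.
    """
    if not configured_key:
        return True
    if provided_key is None:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(provided_key.encode(), configured_key.encode())
```

With `API_KEY` unset the service would accept all callers, so setting a key is advisable whenever the tool service is reachable from other machines.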

Quick Start

Prerequisites

  • Python >= 3.12
  • uv package manager (recommended)

Installation

  1. Clone the repository:
git clone https://github.com/stonehill-2345/mcp-autogui-multinode.git
cd mcp-autogui-multinode
  2. Install dependencies based on your deployment scenario:

Local Full Development

For local development with all features (GUI control + testing):

uv sync --group gui --group dev

Deploy MCP Server Only

For deploying MCP server that connects to remote tool service (no GUI dependencies needed):

uv sync --no-group gui

Deploy Tool Service Only

For deploying HTTP tool service that performs actual computer control (requires GUI):

uv sync --group gui

Running the Service

The service supports two independent servers:

1. Run Tool Service (HTTP API)

Starts the HTTP API server for computer control:

uv run python tool.py

2. Run MCP Server

Starts the MCP server that can connect to remote tool services. The server supports two transport modes:

HTTP Transport Mode:

uv run python mcp_local.py http

stdio Transport Mode (default):

uv run python mcp_local.py stdio
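
The two commands above differ only in a positional transport argument, with stdio as the default. A minimal sketch of how such an argument could be parsed is shown here. It assumes nothing about the real `mcp_local.py` beyond the two mode names and the default, and the helper name `parse_transport` is hypothetical.

```python
import argparse


def parse_transport(argv):
    """Parse the transport mode from a list of command-line arguments."""
    parser = argparse.ArgumentParser(prog="mcp_local.py")
    parser.add_argument(
        "transport",
        nargs="?",                  # the positional argument may be omitted
        choices=["stdio", "http"],
        default="stdio",            # stdio is the documented default
    )
    return parser.parse_args(argv).transport


print(parse_transport([]))
print(parse_transport(["http"]))
```

Any value other than `stdio` or `http` makes `argparse` exit with a usage error, which catches typos before the server starts.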

After starting, you can access:

API Endpoints

Base Endpoints

  • GET / - Root path, returns API information
  • GET /health - Health check endpoint

Computer Control Endpoints

All computer control actions are available at:

  • POST /api/computer/{action} - Execute a computer control action

Tools (3)

  • computer_control: Execute a computer control action like click, type, or scroll.
  • screenshot: Capture screen and get base64-encoded images.
  • screen_info: Get cursor position and screen resolution.

Environment Variables

  • API_KEY: Optional API key authentication for service-to-service communication

Configuration

claude_desktop_config.json
{
  "mcpServers": {
    "autogui-multinode": {
      "command": "uv",
      "args": [
        "run",
        "python",
        "mcp_local.py",
        "stdio"
      ],
      "env": {
        "API_KEY": "your_optional_key"
      }
    }
  }
}
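
If `claude_desktop_config.json` already lists other servers, the entry above needs to be merged in rather than pasted over the file. The sketch below shows one way to do that on the parsed JSON. The helper name `add_server` and the `other-server` entry are hypothetical. The `command`, `args`, and `API_KEY` values mirror the configuration shown above.

```python
import copy
import json


def add_server(config, name, api_key=None):
    """Return a copy of `config` with one MCP server entry added or replaced."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    entry = {"command": "uv", "args": ["run", "python", "mcp_local.py", "stdio"]}
    if api_key:
        entry["env"] = {"API_KEY": api_key}
    merged.setdefault("mcpServers", {})[name] = entry
    return merged


# Hypothetical existing config containing an unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_server(existing, "autogui-multinode", api_key="your_optional_key")
print(json.dumps(updated, indent=2))
```

Because `uv run` resolves the project from the working directory, the client most likely has to launch the command from the repository checkout. That requirement is an inference, not something the README states.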

Try it

→ Take a screenshot of my current screen.
→ Move the mouse to the center of the screen and click.
→ Type 'Hello World' into the active window.
→ What is my current screen resolution and cursor position?
→ Scroll down the page by 500 units.

Frequently Asked Questions

How do I install PyAutoGUI Multinode?

Install PyAutoGUI Multinode by running: git clone https://github.com/stonehill-2345/mcp-autogui-multinode.git && cd mcp-autogui-multinode && uv sync --group gui

What MCP clients work with PyAutoGUI Multinode?

PyAutoGUI Multinode works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
