# ScreenHand
Let AI control your desktop — click buttons, fill forms, automate workflows in ~50ms with zero extra AI calls.
An open-source MCP server for macOS and Windows. Works with Claude, Cursor, Codex CLI, and any MCP-compatible client.
Quick Start | What It Does | Example | All 111 Tools | Architecture | Website
## The Problem
AI assistants can write code but can't use your computer. Every click requires a screenshot → LLM interpretation → coordinate guess — 3-5 seconds and an API call per action.
ScreenHand gives AI direct access to native OS APIs. No screenshots needed for clicks. No AI calls for button presses.
|  | Without ScreenHand | With ScreenHand |
|---|---|---|
| Click a button | Screenshot → LLM → coordinate click (~3-5s) | Native Accessibility API (~50ms) |
| Cost per action | 1 LLM API call | 0 LLM calls |
| Accuracy | Coordinate guessing — misses on layout shift | Exact element targeting by role/name |
| Browser control | Needs focus, screenshot per action | CDP in background (~10ms), no focus needed |
| Works across apps | One app at a time | Cross-app workflows, multi-agent coordination |
## Quick Start

### 1. Add to your AI client (one step)

**Claude Code** (recommended)

```sh
claude mcp add screenhand -- npx -y screenhand
```

Done. That's it.
**Claude Desktop**

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "screenhand": {
      "command": "npx",
      "args": ["-y", "screenhand"]
    }
  }
}
```
**Cursor**

Add to `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "screenhand": {
      "command": "npx",
      "args": ["-y", "screenhand"]
    }
  }
}
```
**OpenAI Codex CLI**

Add to `~/.codex/config.toml`:

```toml
[mcp.screenhand]
command = "npx"
args = ["-y", "screenhand"]
transport = "stdio"
```
**Any MCP Client**

ScreenHand is a standard MCP server over stdio. Run it with `npx -y screenhand`.
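Because the transport is plain stdio, any client can drive the server with newline-delimited JSON-RPC. A minimal sketch of the `initialize` request a client writes to the server's stdin, assuming the 2024-11-05 MCP protocol revision (the client name and version are placeholders):

```typescript
// Sketch of the first JSON-RPC message an MCP client sends over stdio.
// protocolVersion follows the 2024-11-05 MCP spec revision; clientInfo
// values here are illustrative.
interface InitializeRequest {
  jsonrpc: "2.0";
  id: number;
  method: "initialize";
  params: {
    protocolVersion: string;
    capabilities: Record<string, unknown>;
    clientInfo: { name: string; version: string };
  };
}

function buildInitialize(id: number): string {
  const req: InitializeRequest = {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2024-11-05",
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.1.0" },
    },
  };
  // MCP stdio transport frames messages as newline-delimited JSON.
  return JSON.stringify(req) + "\n";
}
```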
### 2. Grant permissions

**macOS:** System Settings > Privacy & Security > Accessibility > enable your terminal app.

**Windows:** No special permissions needed.

### 3. Browser control (optional)

Launch Chrome with remote debugging to enable the browser tools:

```sh
open -a "Google Chrome" --args --remote-debugging-port=9222
```

(On Windows, start `chrome.exe` with the same `--remote-debugging-port=9222` flag.)
That's it. Your AI client now has 111 tools for desktop automation.
## Building from source (contributors only)

```sh
git clone https://github.com/manushi4/screenhand.git
cd screenhand && npm install && npm run build:native
```

On Windows, use `npm run build:native:windows` instead.
## What It Does
ScreenHand gives AI agents seven capabilities:
### Desktop Control — 19 tools
Click buttons, type text, read UI trees, navigate menus, drag, scroll — all via native Accessibility APIs in ~50ms. Works with any app: Finder, Notes, VS Code, Xcode, System Settings, etc.
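To see why role/name targeting beats coordinate guessing, here is an illustrative sketch (not ScreenHand's actual API): a UI tree is searched for an exact role + name match, so the element is found even after a layout shift. The `AXButton`/`AXWindow` roles mirror macOS Accessibility conventions; the dialog tree is hypothetical.

```typescript
// Illustrative sketch of element targeting by role/name over a UI tree.
interface UINode {
  role: string;          // e.g. "AXButton" on macOS
  name: string;          // accessible label
  children: UINode[];
}

// Depth-first search for an exact role + name match.
function findElement(root: UINode, role: string, name: string): UINode | null {
  if (root.role === role && root.name === name) return root;
  for (const child of root.children) {
    const hit = findElement(child, role, name);
    if (hit) return hit;
  }
  return null;
}

// Hypothetical tree for a simple dialog:
const dialog: UINode = {
  role: "AXWindow",
  name: "Save Changes",
  children: [
    { role: "AXButton", name: "Cancel", children: [] },
    { role: "AXButton", name: "Save", children: [] },
  ],
};
```

Moving the buttons around changes their coordinates but not their role/name pair, so the lookup stays stable.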
### Browser Automation — 15 tools
Full Chrome control via DevTools Protocol. Navigate, click, type, run JavaScript, fill forms — all in the background at ~10ms. Built-in anti-detection (browser_stealth, browser_human_click) for sites with bot protection.
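Under the hood, DevTools Protocol traffic is just JSON commands over a WebSocket. A sketch of the message behind a navigate action — `Page.navigate` is a real CDP method; the helper around it is illustrative, and the WebSocket plumbing is omitted:

```typescript
// Sketch of a raw Chrome DevTools Protocol command.
// CDP commands are JSON objects with an id, a method, and params,
// sent over the browser's debugging WebSocket.
let nextId = 0;

function cdpCommand(method: string, params: Record<string, unknown>): string {
  return JSON.stringify({ id: ++nextId, method, params });
}

// What a background navigation reduces to on the wire:
const navigate = cdpCommand("Page.navigate", { url: "https://example.com" });
```

Because the command travels over a WebSocket rather than synthesized input events, the browser window never needs focus.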
### Smart Fallbacks — 8 tools

`click_with_fallback`, `type_with_fallback`, and the other fallback tools automatically try Accessibility → CDP → OCR → coordinates. You don't have to pick the right method — ScreenHand figures it out.
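The fallback pattern these tools describe can be sketched as a chain that stops at the first method that succeeds. The backends below are stubs standing in for the real ones; the names mirror the README but the implementation is illustrative:

```typescript
// Minimal sketch of the fallback chain: try each click method in order
// and return the name of the first one that succeeds.
type ClickMethod = (target: string) => boolean;

function clickWithFallback(
  target: string,
  methods: [string, ClickMethod][],
): string {
  for (const [name, method] of methods) {
    if (method(target)) return name; // first success wins
  }
  throw new Error(`all methods failed for ${target}`);
}

// Stubs standing in for the real backends:
const accessibility: ClickMethod = () => false; // element not exposed
const cdp: ClickMethod = () => true;            // found via DevTools
const ocr: ClickMethod = () => true;
const coordinates: ClickMethod = () => true;

const used = clickWithFallback("Submit", [
  ["accessibility", accessibility],
  ["cdp", cdp],
  ["ocr", ocr],
  ["coordinates", coordinates],
]);
// used === "cdp": the chain stopped at the first method that worked
```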
### Memory & Learning — 14 tools
Gets smarter every session. Logs tool calls, saves winning strategies, tracks error patterns with fixes. Zero config, zero latency overhead (in-memory cache, async disk writes). Ships with 12 seed strategies for common macOS workflows. 6 learning policies: locator stability, sensor effectiveness, recovery ranking, pattern recognition, adaptive timing, and topology (navigation edge reliability).
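A simplified sketch of the "saves winning strategies" idea: an in-memory tracker that records outcomes per strategy and ranks them by success rate. Persistence (the async disk writes) is omitted, and these class and method names are illustrative, not ScreenHand's internals:

```typescript
// Sketch of strategy memory: record win/loss per strategy in memory,
// then pick the strategy with the best observed success rate.
class StrategyMemory {
  private stats = new Map<string, { wins: number; tries: number }>();

  record(strategy: string, success: boolean): void {
    const s = this.stats.get(strategy) ?? { wins: 0, tries: 0 };
    s.tries++;
    if (success) s.wins++;
    this.stats.set(strategy, s);
  }

  best(): string | null {
    let top: string | null = null;
    let topRate = -1;
    this.stats.forEach((s, name) => {
      const rate = s.wins / s.tries;
      if (rate > topRate) {
        top = name;
        topRate = rate;
      }
    });
    return top;
  }
}

const mem = new StrategyMemory();
mem.record("accessibility", false);
mem.record("accessibility", true); // 1/2 success rate
mem.record("cdp", true);           // 1/1 success rate
// mem.best() === "cdp"
```

The real system layers recovery ranking, adaptive timing, and the other learning policies on top of this kind of bookkeeping.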
### App Mastery Map — automatic per-app spatial understanding
Builds a persistent reverse-engineered blueprint of every app from normal tool usage. 8 features record automatically: page zones, navigation g