Atlas Browser MCP Server

Local setup required. This server has to be installed on your machine before you register it in Claude Code.
1. Set the server up locally

Run this once to install the server and its Chromium browser before adding it to Claude Code.

Run in terminal
pip install atlas-browser-mcp
playwright install chromium
2. Register it in Claude Code

After the local setup is done, run this command to register the installed server with Claude Code.

Run in terminal
claude mcp add atlas-browser -- atlas-browser-mcp

This assumes the atlas-browser-mcp command installed in step 1 is on your PATH; it is the same command the README uses for Claude Desktop.

README.md

Visual web browsing for AI agents via Model Context Protocol (MCP).

🌐 atlas-browser-mcp

✨ Features

  • 📸 Visual-First: Navigate the web through screenshots, not DOM parsing
  • 🏷️ Set-of-Mark: Interactive elements labeled with clickable [0], [1], [2]... markers
  • 🎭 Humanized: Bezier curve mouse movements, natural typing rhythms
  • 🧩 CAPTCHA-Ready: Multi-click support for image selection challenges
  • 🛡️ Anti-Detection: Built-in measures to avoid bot detection
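
To illustrate what "Bezier curve mouse movements" can mean in practice, here is a minimal sketch that samples points along a cubic Bezier curve between two screen positions. It is not taken from the atlas-browser-mcp source: the function name bezier_path, the random control-point offsets, and the step count are all assumptions made for illustration.

```python
import random


def bezier_path(start, end, steps=20, spread=100.0, seed=None):
    """Sample points on a cubic Bezier curve from start to end.

    Hypothetical helper: the two control points are offset randomly
    from the straight line so the path curves slightly.
    """
    rng = random.Random(seed)
    (x0, y0), (x3, y3) = start, end
    # Control points at 1/3 and 2/3 of the way, offset by up to `spread` px.
    x1 = x0 + (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y1 = y0 + (y3 - y0) / 3 + rng.uniform(-spread, spread)
    x2 = x0 + 2 * (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y2 = y0 + 2 * (y3 - y0) / 3 + rng.uniform(-spread, spread)

    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        # Standard cubic Bezier formula.
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        points.append((x, y))
    return points


path = bezier_path((0, 0), (300, 200), steps=10, seed=1)
```

The path always starts and ends exactly on the requested positions; only the points in between vary with the seed.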

🚀 Quick Start

Installation

pip install atlas-browser-mcp
playwright install chromium

Use with Claude Desktop

Add to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "browser": {
      "command": "atlas-browser-mcp"
    }
  }
}
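
If the config file already lists other servers, the entry has to be merged into the existing "mcpServers" object rather than replacing it. The following is a rough sketch of that merge using only the standard library. The helper name add_server is hypothetical, and the config path is passed in by the caller because its location differs between operating systems.

```python
import json
import tempfile
from pathlib import Path


def add_server(config_path, name="browser", command="atlas-browser-mcp"):
    """Add or update one entry under "mcpServers", keeping the others."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a temporary file that already lists another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}')
    merged = add_server(demo_path)

print(sorted(merged["mcpServers"]))  # ['browser', 'other']
```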

Then ask Claude:

"Navigate to https://news.ycombinator.com and tell me the top 3 stories"

🛠️ Available Tools

Tool         Description
navigate     Go to URL, returns labeled screenshot
screenshot   Capture current page with labels
click        Click element by label ID [N]
multi_click  Click multiple elements (for CAPTCHA)
type         Type text, optionally press Enter
scroll       Scroll page up or down
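
MCP clients invoke tools like these by sending JSON-RPC 2.0 "tools/call" requests. As a sketch of what such a request looks like on the wire, the snippet below builds one for navigate. The helper tool_call is hypothetical, the request id is arbitrary, and the argument name url is taken from the usage examples in this README rather than from the server's schema.

```python
import json


def tool_call(request_id, name, **arguments):
    """Build a JSON-RPC 2.0 "tools/call" request as used by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(1, "navigate", url="https://news.ycombinator.com")
print(json.dumps(request, indent=2))
```

In normal use the MCP client builds these messages for you; the sketch is only meant to show how a tool name and its arguments travel together.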

📖 Usage Examples

Basic Navigation

User: Go to google.com
AI: [calls navigate(url="https://google.com")]
AI: I see the Google homepage. The search box is labeled [3].

User: Search for "MCP protocol"
AI: [calls click(label_id=3)]
AI: [calls type(text="MCP protocol", submit=true)]
AI: Here are the search results...

CAPTCHA Handling

User: Select all images with traffic lights
AI: [Looking at the CAPTCHA grid]
AI: I can see traffic lights in images [2], [5], and [8].
AI: [calls multi_click(label_ids=[2, 5, 8])]

🔧 Configuration

Headless Mode

For servers without a display:

from atlas_browser_mcp.browser import VisualBrowser

browser = VisualBrowser(
    headless=True,   # No visible browser window
    humanize=False   # Faster, less human-like
)

Custom Viewport

browser = VisualBrowser()
browser.VIEWPORT = {"width": 1920, "height": 1080}

🏗️ How It Works

  1. Navigate: Browser loads the page
  2. Inject SoM: JavaScript labels all interactive elements
  3. Screenshot: Capture the labeled page
  4. AI Sees: The screenshot shows [0], [1], [2]... on buttons, links, inputs
  5. AI Acts: "Click [5]" → Browser clicks the element at that position
  6. Repeat: New screenshot with updated labels
┌─────────────────────────────────────┐
│  [0] Logo    [1] Search   [2] Menu  │
│                                     │
│  [3] Article Title                  │
│  [4] Read More                      │
│                                     │
│  [5] Subscribe    [6] Share         │
└─────────────────────────────────────┘
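
The bookkeeping behind steps 2 and 5 can be modeled in a few lines: give each interactive element a sequential label, then resolve a label back to a point to click. This is only an illustrative Python model. In the server the labeling is done by injected JavaScript, and the names Element, label_elements, and click_point, as well as the choice to click the center of the bounding box, are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Bounding box of one interactive element, in page pixels."""
    name: str
    x: float
    y: float
    width: float
    height: float


def label_elements(elements):
    """Assign sequential label IDs 0, 1, 2, ... in the order given."""
    return dict(enumerate(elements))


def click_point(labels, label_id):
    """Return the center of the labeled element, where a click would land."""
    el = labels[label_id]
    return (el.x + el.width / 2, el.y + el.height / 2)


labels = label_elements([
    Element("Logo", 10, 10, 80, 30),
    Element("Search", 120, 10, 200, 30),
])
print(click_point(labels, 1))  # (220.0, 25.0)
```

Because labels are reassigned on every screenshot, a label ID is only meaningful for the most recent labeled page.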

🤝 Integration

With Cline (VS Code)

{
  "mcpServers": {
    "browser": {
      "command": "atlas-browser-mcp"
    }
  }
}

Programmatic Use

from atlas_browser_mcp.browser import VisualBrowser

browser = VisualBrowser()

# Navigate
result = browser.execute("navigate", url="https://example.com")
print(f"Page title: {result.data['title']}")
print(f"Found {result.data['element_count']} interactive elements")

# Click element [0]
result = browser.execute("click", label_id=0)

# Type in focused field
result = browser.execute("type", text="Hello world", submit=True)

# Cleanup
browser.execute("close")
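
If one of the steps raises an exception, the final close call is skipped and the browser may be left running. One possible way to guard against that is a small context manager around any object that exposes execute(action, **kwargs). The wrapper closing_browser is hypothetical, and FakeBrowser is a stand-in so the sketch runs without launching Chromium; with the real library you would pass a VisualBrowser instance instead.

```python
from contextlib import contextmanager


@contextmanager
def closing_browser(browser):
    """Yield the browser and always send "close" afterwards."""
    try:
        yield browser
    finally:
        browser.execute("close")


class FakeBrowser:
    """Stand-in that records actions instead of driving Chromium."""

    def __init__(self):
        self.actions = []

    def execute(self, action, **kwargs):
        self.actions.append(action)


fake = FakeBrowser()
with closing_browser(fake) as browser:
    browser.execute("navigate", url="https://example.com")
print(fake.actions)  # ['navigate', 'close']
```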

📋 Requirements

  • Python 3.10+
  • Playwright with Chromium

🐛 Troubleshooting

"Playwright not installed"

pip install playwright
playwright install chromium

"Browser closed unexpectedly"

Try running with headless=False to see what's happening:

browser = VisualBrowser(headless=False)

Elements not being detected

Some dynamic pages need more wait time. The browser waits 1.5s after navigation, but complex SPAs may need longer.
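
One possible workaround is to keep re-reading the page until the number of detected elements stops changing. The sketch below polls a caller-supplied function until two consecutive readings match. The helper wait_until_stable is hypothetical; with the library, get_count might wrap a screenshot call and read an element count from its result, but whether that result exposes element_count is an assumption (this README shows it only for navigate).

```python
import time


def wait_until_stable(get_count, interval=0.5, timeout=10.0, sleep=time.sleep):
    """Poll get_count() until two consecutive readings match.

    Returns the last reading; gives up after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    previous = get_count()
    while time.monotonic() < deadline:
        sleep(interval)
        current = get_count()
        if current == previous:
            return current
        previous = current
    return previous


# Simulated page whose element count grows as content loads.
readings = iter([3, 8, 12, 12])
stable = wait_until_stable(lambda: next(readings), sleep=lambda s: None)
print(stable)  # 12
```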

📄 License

MIT License - see LICENSE

🙏 Credits

Built for Atlas, an autonomous AI agent.

Inspired by:

Try it

→ Navigate to https://news.ycombinator.com and tell me the top 3 stories
→ Go to google.com and search for 'MCP protocol'
→ Select all images with traffic lights on this CAPTCHA grid
→ Scroll down the page to find the contact information

Frequently Asked Questions

What are the key features of Atlas Browser?

Atlas Browser navigates visually, using screenshots instead of DOM parsing. It labels interactive elements with Set-of-Mark markers, humanizes mouse movements and typing rhythms, supports multi-click for image-based CAPTCHAs, and includes built-in anti-detection measures.

What can I use Atlas Browser for?

Use it to automate complex web tasks that require visual confirmation, to solve CAPTCHA challenges during autonomous web research, to interact with dynamic websites that are difficult to parse via the DOM, and to browse in a human-like way for data collection.

How do I install Atlas Browser?

Install Atlas Browser by running: pip install atlas-browser-mcp && playwright install chromium

What MCP clients work with Atlas Browser?

Atlas Browser works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
