# atlas-browser-mcp
Visual web browsing for AI agents via Model Context Protocol (MCP).
## Features

- **Visual-First**: Navigate the web through screenshots, not DOM parsing
- **Set-of-Mark**: Interactive elements labeled with clickable `[0]`, `[1]`, `[2]`... markers
- **Humanized**: Bezier-curve mouse movements, natural typing rhythms
- **CAPTCHA-Ready**: Multi-click support for image selection challenges
- **Anti-Detection**: Built-in measures to avoid bot detection
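To illustrate the humanization idea, here is a minimal sketch of a cubic Bezier mouse path with randomly jittered control points. This is an illustrative example, not the package's actual implementation; the function name and parameters are invented for the sketch.

```python
import random

def bezier_path(start, end, steps=20):
    """Interpolate a cubic Bezier curve between two points.

    The two control points are jittered randomly so every path
    bends a little differently, mimicking natural hand movement
    instead of a straight robotic line.
    """
    (x0, y0), (x3, y3) = start, end
    # Control points roughly a third and two-thirds along, with random offsets
    x1 = x0 + (x3 - x0) * 0.3 + random.uniform(-40, 40)
    y1 = y0 + (y3 - y0) * 0.3 + random.uniform(-40, 40)
    x2 = x0 + (x3 - x0) * 0.7 + random.uniform(-40, 40)
    y2 = y0 + (y3 - y0) * 0.7 + random.uniform(-40, 40)

    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        # Standard cubic Bezier formula
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        points.append((x, y))
    return points

# Path always starts exactly at `start` and ends exactly at `end`
path = bezier_path((100, 100), (500, 300))
```

Each intermediate point would then be fed to the browser's mouse-move API with small randomized delays between steps.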
## Quick Start

### Installation

```bash
pip install atlas-browser-mcp
playwright install chromium
```
### Use with Claude Desktop

Add to your Claude Desktop config (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "browser": {
      "command": "atlas-browser-mcp"
    }
  }
}
```
Then ask Claude:

> "Navigate to https://news.ycombinator.com and tell me the top 3 stories"
## Available Tools

| Tool | Description |
|---|---|
| `navigate` | Go to a URL; returns a labeled screenshot |
| `screenshot` | Capture the current page with labels |
| `click` | Click an element by label ID `[N]` |
| `multi_click` | Click multiple elements (for CAPTCHAs) |
| `type` | Type text, optionally pressing Enter |
| `scroll` | Scroll the page up or down |
## Usage Examples

### Basic Navigation

```
User: Go to google.com
AI: [calls navigate(url="https://google.com")]
AI: I see the Google homepage. The search box is labeled [3].

User: Search for "MCP protocol"
AI: [calls click(label_id=3)]
AI: [calls type(text="MCP protocol", submit=true)]
AI: Here are the search results...
```
### CAPTCHA Handling

```
User: Select all images with traffic lights
AI: [Looking at the CAPTCHA grid]
AI: I can see traffic lights in images [2], [5], and [8].
AI: [calls multi_click(label_ids=[2, 5, 8])]
```
## Configuration

### Headless Mode

For servers without a display:

```python
from atlas_browser_mcp.browser import VisualBrowser

browser = VisualBrowser(
    headless=True,   # No visible browser window
    humanize=False,  # Faster, less human-like
)
```
### Custom Viewport

```python
browser = VisualBrowser()
browser.VIEWPORT = {"width": 1920, "height": 1080}
```
## How It Works

1. **Navigate**: The browser loads the page
2. **Inject SoM**: JavaScript labels all interactive elements
3. **Screenshot**: Capture the labeled page
4. **AI Sees**: The screenshot shows `[0]`, `[1]`, `[2]`... on buttons, links, inputs
5. **AI Acts**: "Click `[5]`" → the browser clicks the element at that position
6. **Repeat**: New screenshot with updated labels
```
┌─────────────────────────────────────┐
│ [0] Logo   [1] Search   [2] Menu    │
│                                     │
│ [3] Article Title                   │
│ [4] Read More                       │
│                                     │
│ [5] Subscribe   [6] Share           │
└─────────────────────────────────────┘
```
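At the core of this loop is a mapping from label IDs to on-screen coordinates, so that "click `[5]`" can be turned into a mouse action. A minimal sketch of that idea (the function, data shapes, and names here are illustrative assumptions, not the package's internals):

```python
def build_label_map(elements):
    """Assign sequential [N] labels to interactive elements.

    `elements` is a list of dicts with bounding boxes, as a
    Set-of-Mark injection script might report them back from the page.
    Returns {label_id: (x, y)} where (x, y) is the click target.
    """
    label_map = {}
    for label_id, el in enumerate(elements):
        box = el["box"]  # {"x", "y", "width", "height"} in page pixels
        # Click target: the center of the element's bounding box
        center = (box["x"] + box["width"] / 2, box["y"] + box["height"] / 2)
        label_map[label_id] = center
    return label_map

elements = [
    {"tag": "a", "box": {"x": 10, "y": 20, "width": 80, "height": 30}},
    {"tag": "button", "box": {"x": 200, "y": 20, "width": 60, "height": 30}},
]
labels = build_label_map(elements)
# labels[1] is the center of the button: (230.0, 35.0)
```

Because labels are reassigned on every screenshot, a label ID is only valid until the next navigation or scroll, which is why each action returns a fresh labeled screenshot.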
## Integration

### With Cline (VS Code)

```json
{
  "mcpServers": {
    "browser": {
      "command": "atlas-browser-mcp"
    }
  }
}
```
### Programmatic Use

```python
from atlas_browser_mcp.browser import VisualBrowser

browser = VisualBrowser()

# Navigate
result = browser.execute("navigate", url="https://example.com")
print(f"Page title: {result.data['title']}")
print(f"Found {result.data['element_count']} interactive elements")

# Click element [0]
result = browser.execute("click", label_id=0)

# Type in the focused field
result = browser.execute("type", text="Hello world", submit=True)

# Cleanup
browser.execute("close")
```
## Requirements

- Python 3.10+
- Playwright with Chromium
## Troubleshooting

### "Playwright not installed"

```bash
pip install playwright
playwright install chromium
```

### "Browser closed unexpectedly"

Try running with `headless=False` to see what's happening:

```python
browser = VisualBrowser(headless=False)
```
### Elements not being detected

Some dynamic pages need more wait time. The browser waits 1.5s after navigation, but complex SPAs may need longer.
## License

MIT License - see LICENSE

## Credits

Built for Atlas, an autonomous AI agent.

Inspired by:

- anthropic/mcp - Model Context Protocol
- AskUI - Visual testing approach
- Set-of-Mark prompting - Visual grounding technique