AgentBrowser MCP Server

Add it to Claude Code

Run this in a terminal.

claude mcp add -e "ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}" agent-browser -- npx -y @ashtonvaughan/agentbrowser
Required: ANTHROPIC_API_KEY

A browser runtime built for AI agents. Not humans.

Every existing browser automation tool — Playwright, Puppeteer, Selenium, browser-use — was built for humans first and retrofitted for agents. They speak in DOM operations. Agents think in tasks.

AgentBrowser inverts this. The browser speaks the agent's language.


The Core Difference

Every other tool:

// Agent receives 8,000 tokens of HTML noise
// Agent guesses what "#submit-btn-v2" means
// Agent issues a DOM command and hopes
await page.click('#submit-btn-v2')
await page.fill('input[name="email"]', email)

AgentBrowser:

// Agent receives 50 tokens of structured meaning
// Agent sees exactly what it can do
// Agent calls a semantic action
const state = await browser.navigate('https://example.com/login')
// state.page_type        → 'login'
// state.available_actions → ['authenticate', 'signup', 'forgot_password']
// state.key_data         → { site: 'Example', sso_available: true }

await browser.action('authenticate', { email, password })
// result.state_change.summary → 'Navigated from login to dashboard'
// result.next_available_actions → ['view_profile', 'settings', 'logout']
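
The shapes shown in the comments above can be written down as TypeScript types. The sketch below is an assumption based only on the fields named in this README (`page_type`, `available_actions`, `key_data`, `state_change.summary`, `next_available_actions`); the types actually exported by the library may differ. `summarizeState` and `describeResult` are hypothetical helpers, not part of AgentBrowser.

```typescript
// Assumed shapes, inferred from the fields shown in this README.
// The real AgentBrowser types may contain more fields or different names.
interface PageState {
  page_type: string
  available_actions: string[]
  key_data: Record<string, unknown>
}

interface ActionResult {
  state_change: { summary: string }
  next_available_actions: string[]
}

// Hypothetical helper: render a state as one compact line of text.
function summarizeState(state: PageState): string {
  const actions = state.available_actions.join(', ')
  return `page=${state.page_type}; actions=[${actions}]; data=${JSON.stringify(state.key_data)}`
}

// Hypothetical helper: render an action result the same way.
function describeResult(result: ActionResult): string {
  return `${result.state_change.summary} (next: ${result.next_available_actions.join(', ')})`
}

// Sample values copied from the login example above.
const loginState: PageState = {
  page_type: 'login',
  available_actions: ['authenticate', 'signup', 'forgot_password'],
  key_data: { site: 'Example', sso_available: true },
}

const loginResult: ActionResult = {
  state_change: { summary: 'Navigated from login to dashboard' },
  next_available_actions: ['view_profile', 'settings', 'logout'],
}

console.log(summarizeState(loginState))
console.log(describeResult(loginResult))
```

Typing the state this way lets an agent harness reason about page type and available actions without touching selectors, which is the contrast the two snippets above are drawing.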

Features

Semantic observation — Agents never see HTML. Every page navigation returns a structured model: what type of page it is, what data it contains, what actions are available. Token cost drops ~95%.

Dynamic tool registry — The available tools change as the page changes. On a login page, authenticate() appears. On a checkout page, submit_order() appears. The agent always sees exactly what it can do — nothing more.

Site memory — The browser learns permanently. First visit to a site: full LLM analysis. Second visit: cache hit, 7x faster, zero LLM cost. Tenth visit: the LLM gets injected context of proven selectors and known page flows, producing more accurate results from the start.

Self-healing execution — CAPTCHA detection, stale selector recovery, and post-action state verification are handled silently. Agents see task outcomes, not infrastructure failures.

Bot detection bypass — Ships with playwright-extra + stealth plugin. Bypasses Cloudflare challenges, cookie consent banners (OneTrust, Cookiebot, GDPR), and signup modals automatically on every navigation. No configuration needed.

Session persistence — Sessions are first-class objects. Save, restore, and branch browser state across agent runs. Auth state survives restarts.

MCP server — Ships as a Model Context Protocol server. Any MCP-compatible agent (Claude Code, LangChain, AutoGen, custom) connects without integration work.

Parallel tasks — Declare goals, not tabs. The runtime manages contexts, isolation, and result aggregation.
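
The site memory behavior described above (full analysis on the first visit, a cache hit on later visits) can be illustrated with a small in-memory cache. This is a minimal sketch under stated assumptions, not AgentBrowser's implementation: the README does not say how or where site memory is stored, and `SiteMemory`, `PageModel`, and `Analyzer` are hypothetical names. The `analyze` callback stands in for the LLM-backed page analysis.

```typescript
// Hypothetical sketch of "analyze once, reuse afterwards".
// None of these names come from AgentBrowser.
type PageModel = { page_type: string; available_actions: string[] }
type Analyzer = (url: string) => Promise<PageModel>

class SiteMemory {
  private cache = new Map<string, PageModel>()
  private analyze: Analyzer
  hits = 0
  misses = 0

  constructor(analyze: Analyzer) {
    this.analyze = analyze
  }

  // Returns the cached model when one exists; otherwise runs the
  // (expensive) analyzer once and remembers its result.
  async lookup(url: string): Promise<PageModel> {
    const cached = this.cache.get(url)
    if (cached !== undefined) {
      this.hits++
      return cached
    }
    this.misses++
    const model = await this.analyze(url)
    this.cache.set(url, model)
    return model
  }
}
```

Calling `lookup` twice with the same URL invokes the analyzer only once. Keying by the full URL is a simplification chosen for the sketch; a real store would also need persistence and invalidation when a site changes.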


Tested Sites

Results from parallel 7-site test (14s total):

| Site | Page Type Detected | Notes |
| --- | --- | --- |
| news.ycombinator.com | listing | Actions execute (navigate, read, submit) |
| en.wikipedia.org | article | Cookie banner auto-dismissed |
| github.com | landing | Full action set available |
| stackoverflow.com | listing | Signup modal stripped before analysis |
| www.bbc.com | listing | OneTrust consent auto-dismissed |
| www.theverge.com | listing | Cloudflare challenge bypassed |
| www.reddit.com | blocked | IP-level network block; no client-side bypass |

Installation

git clone https://github.com/AshtonVaughan/agentbrowser
cd agentbrowser
npm install
npx playwright install chromium
npm run build

Set your Anthropic API key:

cp .env.example .env
# Edit .env and set ANTHROPIC_API_KEY

Usage

As a library

import { AgentBrowser } from './src/index.js'

const browser = new AgentBrowser({
  anthropic_api_key: process.env.ANTHROPIC_API_KEY,
  headless: true,
  stealth: true,
})

await browser.launch()

// Navigate — returns semantic model, not HTML
const state = await browser.navigate('https://news.ycombinator.com')
console.log(state.page_type)          // 'listing'
console.log(state.available_actions)  // ['navigate_to_new', 'navigate_to_ask', ...]
console.log(state.key_data)           // { story_count: 30, top_story: '...' }

// Execute actions by name — actually clicks/navigates
await browser.action('navigate_to_new')
// → browser navigates to https://news.ycombinator.com/newest

// Extract structured data
const data = await browser.extract({
  top_story: 'title of the #1 story',
  points: 'upvote count of top story',
  author: 'username of top story submitter',
})

// Save session (auth state, cookies)
await browser.saveSession('hn-logged-in')

// Restore later
await browser.restoreSession('hn-logged-in')

await browser.close()
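
Because the available actions change from page to page, it can help to check an action name against `state.available_actions` before calling `browser.action`. The helper below is a hypothetical sketch, not part of AgentBrowser. It assumes only that `available_actions` is an array of strings, as in the example above; how the library itself reacts to an unknown action name is not stated in this README.

```typescript
// Hypothetical guard; assumes state.available_actions is a string array.
interface HasActions {
  available_actions: string[]
}

// Returns the action name unchanged if the current page offers it,
// otherwise throws with the list of actions that are available.
function requireAction(state: HasActions, name: string): string {
  if (state.available_actions.indexOf(name) === -1) {
    throw new Error(
      `Action '${name}' is not available here. Available: ${state.available_actions.join(', ')}`
    )
  }
  return name
}
```

With the `browser` and `state` objects from the example above, a guarded call would read `await browser.action(requireAction(state, 'navigate_to_new'))`.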

Run multiple sites in parallel

const results = await browser.executor.runParallel([
  { id: 'hn',   goal: 'get top stories', url: 'https://news.ycombinator.com' },
  { id: 'bbc',  goal: 'get headlines',

Tools (3)

navigate: Navigates to a URL and returns a semantic model of the page, including page type and available actions.
action: Executes a semantic action on the current page based on available tools.
extract: Extracts structured data from the current page based on provided schema requirements.
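
MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the `navigate` tool. The tool name comes from the list above, but the argument name `url` is an assumption, since this page does not show the tool's input schema; a client can read the real schema from the server's `tools/list` response. `buildToolCall` is a hypothetical helper.

```typescript
// Builds an MCP `tools/call` request (JSON-RPC 2.0).
// Tool names come from the list above; argument names are assumptions.
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: string; arguments: Record<string, unknown> }
}

let nextId = 1

function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  }
}

// `url` is an assumed argument name for the navigate tool.
const navigateRequest = buildToolCall('navigate', { url: 'https://news.ycombinator.com' })
console.log(JSON.stringify(navigateRequest, null, 2))
```

In practice an MCP client library or host application (such as Claude Code) sends these requests, so hand-building them is mainly useful for debugging or for understanding what travels over the wire.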

Environment Variables

ANTHROPIC_API_KEY (required): API key for Anthropic to power the semantic analysis engine.

Configuration

claude_desktop_config.json
{
  "mcpServers": {
    "agentbrowser": {
      "command": "npx",
      "args": ["-y", "@ashtonvaughan/agentbrowser"],
      "env": {
        "ANTHROPIC_API_KEY": "your-key-here"
      }
    }
  }
}

Try it

Navigate to news.ycombinator.com and extract the top 5 stories with their titles and points.
Go to github.com and identify the available actions for navigating to a repository.
Perform a search on a site and use the dynamic action tool to authenticate if a login is required.
Extract the headlines from the BBC homepage and summarize the key topics.

Frequently Asked Questions

What are the key features of AgentBrowser?

Semantic observation that replaces HTML with structured page models.
Dynamic tool registry that updates available actions based on page context.
Built-in site memory for faster repeat visits and lower LLM costs.
Automated bot detection bypass using playwright-extra with the stealth plugin.
Session persistence that maintains authentication state across agent runs.

What can I use AgentBrowser for?

Automating complex web workflows that require authentication and state management.
Efficiently scraping structured data from dynamic websites without parsing raw HTML.
Building AI agents that can interact with modern web applications as a human would.
Reducing token usage and costs for web-based agent tasks.

How do I install AgentBrowser?

Install AgentBrowser by running: git clone https://github.com/AshtonVaughan/agentbrowser && cd agentbrowser && npm install && npx playwright install chromium && npm run build

What MCP clients work with AgentBrowser?

AgentBrowser works with any MCP-compatible client, including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
