MCP Playwright Browser MCP Server

A production-grade MCP server that gives AI assistants full browser control.

A production-grade Model Context Protocol (MCP) server that gives AI assistants full browser control through Playwright — using a hybrid DOM + Accessibility Tree + Visual approach. Built for real-world agentic automation: job applications, web scraping, form filling, and complex multi-tab workflows.

v2.0 is a complete rewrite. The server grew from 680 lines and 23 tools to nearly 5,000 lines and 71 tools, with a modular architecture, token-optimized capture profiles, hard payload budgets, and a full test suite.


What's New in v2.0

The Problem v1 Had

v1 was a working proof of concept. It could browse pages and extract jobs. But when used with Gemini CLI for real tasks — filling application forms, navigating multi-tab flows, handling downloads — it hit hard limits:

  • Token waste: Every tool response dumped everything it found. One browser.snapshot on a complex page could push 50KB+ into Gemini's context window in a single call, rapidly exhausting the budget.
  • No multi-tab support: If a link opened a new tab (very common in job applications), Gemini was stuck with no way to switch to it.
  • No form intelligence: Filling a form required manual click-by-click instructions. There was no way to ask "what fields are still empty?" or "fill all required fields."
  • Brittle DOM-only navigation: Shadow DOM, iframes, and obfuscated element IDs caused failures with no fallback.
  • No session persistence: Every run started fresh. Logging in again and again wasted time and triggered bot detection.
  • No safety rails: The AI could write files anywhere on disk, run arbitrary JS, or create its own automation scripts — unguarded.
  • Monolithic: One 680-line file with no tests.

What v2.0 Solves

Every one of those problems has a specific solution in v2.0:

| Problem | v2.0 Solution |
| --- | --- |
| Token waste | Capture Profile System (light/balanced/full) + 280KB hard payload ceiling |
| Multi-tab stuck | Page Manager with stable pageIds, browser.list_pages, browser.select_page |
| Dumb form filling | browser.form_audit + browser.fill_form + Google Forms specialist tools |
| Shadow DOM / obfuscated IDs | A11y tree via CDP Accessibility.getFullAXTree with stable ax- UIDs |
| Session loss | Cookie export/import, browser.export_storage_state / browser.import_storage_state |
| No safety | Path allowlist in src/security/paths.js, MCP_ALLOW_EVALUATE guard |
| Monolithic | 10 focused modules in src/browser/ + src/security/ + 18-test suite |
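
The capture-profile and hard-ceiling idea can be sketched roughly as follows. This is an illustration only, not the server's actual code: the profile names (light/balanced/full) and the 280KB ceiling come from the table above, while the per-profile limits, the `applyBudget` name, and the truncation strategy are assumptions made for the example.

```javascript
// Hypothetical sketch: limits and names are illustrative, not from the server source.
const HARD_CEILING_BYTES = 280 * 1024; // the 280KB ceiling named in the README

// Assumed per-profile limits; the real values are not documented here.
const PROFILE_LIMITS = {
  light: 16 * 1024,
  balanced: 64 * 1024,
  full: HARD_CEILING_BYTES,
};

function applyBudget(text, profile = "balanced") {
  // Unknown profiles fall back to "balanced"; nothing may exceed the hard ceiling.
  const profileLimit = PROFILE_LIMITS[profile] ?? PROFILE_LIMITS.balanced;
  const limit = Math.min(profileLimit, HARD_CEILING_BYTES);

  const bytes = Buffer.byteLength(text, "utf8");
  if (bytes <= limit) {
    return { text, truncated: false, bytes };
  }

  // Cut on a byte boundary, then drop any replacement characters left behind
  // by a multi-byte character that was split at the cut point.
  const cut = Buffer.from(text, "utf8")
    .subarray(0, limit)
    .toString("utf8")
    .replace(/\uFFFD+$/, "");
  return { text: cut, truncated: true, bytes: Buffer.byteLength(cut, "utf8") };
}
```

Returning a `truncated` flag alongside the text lets the caller tell the model that the response was cut, so it can request a narrower capture instead of assuming it saw the whole page.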

v1 vs v2 Comparison

| Dimension | v1.0 | v2.0 |
| --- | --- | --- |
| Total MCP tools | 23 | 71 |
| Server size | 680 lines, 1 file | 4,966 lines, 11 modules |
| Token efficiency | Uncontrolled dumps | Capture profiles + 280KB hard ceiling |
| Multi-tab support | Single tab only | Full page manager (list, select, close) |
| Form automation | Manual click-by-click | form_audit + fill_form + Google Forms specialist |
| A11y / Shadow DOM | DOM-only, brittle | CDP Accessibility tree with stable UIDs |
| Scroll handling | Saw first viewport only | Scroll awareness + container scrolling |
| Session persistence | None | Cookie/storage export-import |
| Popup & dialog handling | None | Dialog accept/dismiss, popup pageId capture |
| Download management | None | Wait-for-download, save to path |
| File reading (CV/PDF) | None | files.read_text, files.read_pdf_text |
| Security | No restrictions | Allowlist-enforced read/write paths |
| Observability | None | Console log capture, network request log |
| Test coverage | 2 tests | 18 tests |
| Profiles | 3 | 5 (+ persistent variants) |
| Batch scripts | 5 .bat launchers | 7 .bat launchers |
| Error handling | Raw exceptions to AI | Normalized, structured, budgeted |
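
The Security entry in the comparison mentions allowlist-enforced read/write paths. A minimal sketch of such a check is shown below; the `isPathAllowed` helper and its signature are hypothetical, and the real src/security/paths.js may work differently (for example, it might also resolve symlinks, which this sketch does not).

```javascript
const path = require("node:path");

// Hypothetical helper, not the actual contents of src/security/paths.js.
// Returns true when `candidate` resolves to a location inside one of `allowedRoots`.
function isPathAllowed(candidate, allowedRoots) {
  const resolved = path.resolve(candidate);
  return allowedRoots.some((root) => {
    const rel = path.relative(path.resolve(root), resolved);
    // An empty relative path means the candidate is the root itself.
    if (rel === "") return true;
    // A leading ".." segment means the candidate climbs out of the root.
    if (rel === ".." || rel.startsWith(".." + path.sep)) return false;
    // On Windows, a path on another drive comes back as an absolute path.
    return !path.isAbsolute(rel);
  });
}

console.log(isPathAllowed("downloads/cv.pdf", ["downloads"]));
console.log(isPathAllowed("downloads/../secrets.txt", ["downloads"]));
```

Comparing resolved paths rather than raw strings is what stops `..` segments and look-alike prefixes such as `downloads-other` from slipping past the allowlist.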

What stayed the same

  • Indeed job extractor (production-grade, multi-selector, deduplication)
  • Google search extractor (consent handling, URL deobfuscation)
  • Stealth mode (webdriver hiding, user agent spoofing)
  • CDP connection to real Chrome
  • Visual snapshot + coordinate-based clicking

How It Works

You / Gemini CLI
      │
      │ natural language prompt
      ▼
  Gemini CLI ──── loads MCP conf

Tools

  • browser.list_pages: Lists all currently open browser pages.
  • browser.select_page: Switches focus to a specific browser page by ID.
  • browser.form_audit: Audits a form to identify fields and their requirements.
  • browser.fill_form: Fills out form fields based on provided data.
  • browser.export_storage_state: Exports current cookies and local storage for session persistence.
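
MCP clients invoke tools like these through the protocol's `tools/call` JSON-RPC method. The sketch below builds such requests for the multi-tab tools. The tool names come from the list above; the `pageId` argument name and the `"page-2"` value are assumptions, since the tools' input schemas are not shown here.

```javascript
// Builds an MCP "tools/call" request as a JSON-RPC 2.0 message.
let nextId = 1;

function buildToolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// List the open pages, then switch to one of them.
const listReq = buildToolCall("browser.list_pages");
// Assumed argument name and value; check the tool's real input schema.
const selectReq = buildToolCall("browser.select_page", { pageId: "page-2" });

console.log(JSON.stringify(selectReq, null, 2));
```

In practice the client library produces these messages for you; the sketch only shows what travels over the wire when an assistant switches tabs.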

Environment Variables

  • MCP_ALLOW_EVALUATE: Enables or disables the ability to evaluate arbitrary JavaScript.
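
A guard of this kind could look roughly like the sketch below. The accepted values (`"1"` and `"true"`), the disabled-by-default behavior, and both function names are assumptions for illustration; the server's real parsing rules are not documented here.

```javascript
// Hypothetical guard; accepted values and default are assumptions.
function isEvaluateAllowed(env = process.env) {
  const raw = String(env.MCP_ALLOW_EVALUATE ?? "").trim().toLowerCase();
  return raw === "1" || raw === "true";
}

// Call this before running any caller-supplied JavaScript in the page.
function guardEvaluate(env = process.env) {
  if (!isEvaluateAllowed(env)) {
    throw new Error(
      "JavaScript evaluation is disabled; set MCP_ALLOW_EVALUATE to enable it."
    );
  }
}
```

Treating anything other than an explicit opt-in as "disabled" keeps the risky capability off when the variable is unset or misspelled.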

Try it

Navigate to the job application page, audit the form fields, and fill in my contact information.
Search for recent software engineering roles on Indeed and extract the job titles and links.
Open a new tab, go to the documentation website, and summarize the installation steps.
Export my current session cookies to ensure I stay logged in for the next automation task.

Frequently Asked Questions

What are the key features of MCP Playwright Browser?

  • Full browser control using Playwright with a hybrid DOM and Accessibility Tree approach.
  • Multi-tab support with stable page management and switching capabilities.
  • Advanced form automation including auditing and intelligent filling.
  • Session persistence via cookie and storage state export/import.
  • Token-optimized capture profiles to manage context window usage.

What can I use MCP Playwright Browser for?

  • Automating complex job application workflows across multiple web pages.
  • Performing structured web scraping with stealth mode to bypass bot detection.
  • Filling out long or complex web forms automatically.
  • Managing multi-tab browser sessions for research and data extraction tasks.

How do I install MCP Playwright Browser?

Install MCP Playwright Browser by running: npx -y mcp-playwright-browser

What MCP clients work with MCP Playwright Browser?

MCP Playwright Browser works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
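
Clients such as Claude Desktop register servers through an `mcpServers` map in their JSON configuration. The sketch below prints an entry that wraps the install command from this page. The `"playwright-browser"` key is an arbitrary label chosen for the example, and other clients may expect a different file or layout.

```javascript
// Builds a client configuration entry around the npx command shown above.
// The server label is arbitrary; only `command` and `args` mirror this page.
const config = {
  mcpServers: {
    "playwright-browser": {
      command: "npx",
      args: ["-y", "mcp-playwright-browser"],
    },
  },
};

// Print JSON that can be merged into the client's configuration file.
console.log(JSON.stringify(config, null, 2));
```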
