Automate Browser-based workflows using LLMs and Computer Vision
Skyvern automates browser-based workflows using LLMs and computer vision. It provides a Playwright-compatible SDK that adds AI functionality on top of Playwright, as well as a no-code workflow builder. Together they help both technical and non-technical users automate manual workflows on any website, replacing brittle or unreliable automation solutions.
Traditional approaches to browser automation required writing custom scripts for each website, often relying on DOM parsing and XPath-based interactions that broke whenever the website's layout changed.
Instead of relying only on code-defined XPath interactions, Skyvern uses vision LLMs to understand and interact with websites.
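To make the brittleness described above concrete, here is a minimal sketch using only Python's standard library. The two page snippets and the hard-coded path are hypothetical and are not taken from Skyvern; they only illustrate how a position-based selector stops matching once a wrapper element is added.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same sign-in form; only the layout differs.
PAGE_V1 = """
<html><body>
  <div id="main">
    <form>
      <input name="email"/>
      <button>Sign in</button>
    </form>
  </div>
</body></html>
"""

PAGE_V2 = """
<html><body>
  <div id="main">
    <section class="card">
      <form>
        <input name="email"/>
        <button>Sign in</button>
      </form>
    </section>
  </div>
</body></html>
"""

# A position-based path of the kind a hand-written script might hard-code.
BRITTLE_PATH = "./body/div/form/button"


def find_button(page_source: str, path: str):
    """Return the button's text if the path matches, otherwise None."""
    root = ET.fromstring(page_source.strip())
    node = root.find(path)
    return node.text if node is not None else None


v1_result = find_button(PAGE_V1, BRITTLE_PATH)
v2_result = find_button(PAGE_V2, BRITTLE_PATH)
print(v1_result)  # the path matches the original layout
print(v2_result)  # the same path finds nothing after the redesign
```

The button is still visible to a human in both versions; only the path to it changed, which is the gap a vision-based approach aims to close.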
How it works
Skyvern was inspired by the task-driven autonomous agent design popularized by BabyAGI and AutoGPT, with one major bonus: we give Skyvern the ability to interact with websites using browser automation libraries like Playwright.
Skyvern uses a swarm of agents to comprehend a website, then plan and execute its actions.
This approach has a few advantages:
- Skyvern can operate on websites it's never seen before, as it's able to map visual elements to actions necessary to complete a workflow, without any customized code
- Skyvern is resistant to website layout changes, as there are no pre-determined XPaths or other selectors our system is looking for while trying to navigate
- Skyvern is able to take a single workflow and apply it to a large number of websites, as it's able to reason through the interactions necessary to complete the workflow
A detailed technical report can be found here.
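The task-driven pattern described in this section can be sketched as a toy plan-then-act loop. This is not Skyvern's implementation: `plan`, `decide`, `run`, and `Action` are hypothetical names, and the planner and decision functions are hard-coded stand-ins for what would be LLM and vision-model calls in a real agent.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Action:
    kind: str    # e.g. "navigate", "type", or "click"
    target: str  # human-readable description of the element, not a selector


def plan(goal: str) -> list[str]:
    """Hypothetical planner; a real agent would ask an LLM to break down the goal."""
    return [f"open page for: {goal}", f"fill form for: {goal}", f"submit: {goal}"]


def decide(step: str) -> Action:
    """Hypothetical stand-in for a vision model choosing an action from a screenshot."""
    if step.startswith("fill"):
        return Action("type", "visible form fields")
    if step.startswith("submit"):
        return Action("click", "submit button")
    return Action("navigate", "start page")


def run(goal: str) -> list[Action]:
    """Work through the planned steps in order, recording the chosen actions."""
    queue = deque(plan(goal))
    history = []
    while queue:
        step = queue.popleft()
        # A real agent would execute the action in a browser here.
        history.append(decide(step))
    return history


actions = run("get an insurance quote")
print([a.kind for a in actions])
```

Note that targets are described by what they are ("submit button") rather than by where they sit in the DOM, which is what lets the same plan carry over to sites with different markup.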
Demo
https://github.com/user-attachments/assets/5cab4668-e8e2-4982-8551-aab05ff73a7f
Quickstart
Skyvern Cloud
Skyvern Cloud is a managed cloud version of Skyvern that lets you run Skyvern without worrying about the infrastructure. It can run multiple Skyvern instances in parallel and comes bundled with anti-bot detection mechanisms, a proxy network, and CAPTCHA solvers.
If you'd like to try it out, navigate to app.skyvern.com and create an account.
Run Locally (UI + Server)
Choose your preferred setup method:
Option A: pip install (Recommended)
Dependencies needed:
- Python 3.11.x (3.12 also works; 3.13 is not yet supported)
- NodeJS & NPM
Additionally, for Windows:
- Rust
- VS Code with C++ dev tools and Windows SDK
1. Install Skyvern
pip install skyvern
2. Run Skyvern
skyvern quickstart
Option B: Docker Compose
- Install Docker Desktop
- Clone the repository:
git clone https://github.com/skyvern-ai/skyvern.git && cd skyvern
- Run quickstart with Docker Compose:
pip install skyvern && skyvern quickstart
When prompted, choose "Docker Compose"
Tools (3)
- navigate_to_url: Instruct the browser to navigate to a specific website URL.
- execute_workflow: Run a multi-step automation workflow on a website using natural language instructions.
- extract_data: Extract structured data from the current page based on provided schema.
Environment Variables
- SKYVERN_API_KEY: API key for accessing Skyvern cloud services if applicable.
Configuration
{"mcpServers": {"skyvern": {"command": "skyvern", "args": ["--mcp"]}}}
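As a quick sanity check, the configuration above can be parsed to see which command each server entry would launch. This is a minimal sketch that assumes a client simply runs `command` followed by `args`, which is the common convention for this config shape; `launch_commands` is a hypothetical helper, not part of Skyvern or of any MCP client.

```python
import json
import shlex

# The configuration from this section, copied verbatim.
CONFIG_TEXT = '{"mcpServers": {"skyvern": {"command": "skyvern", "args": ["--mcp"]}}}'


def launch_commands(config_text: str) -> dict[str, list[str]]:
    """Map each configured server name to the argv list a client would run."""
    config = json.loads(config_text)
    commands = {}
    for name, server in config.get("mcpServers", {}).items():
        # "args" is treated as optional; "command" is required.
        commands[name] = [server["command"], *server.get("args", [])]
    return commands


commands = launch_commands(CONFIG_TEXT)
for name, argv in commands.items():
    print(f"{name}: {shlex.join(argv)}")
```

Parsing the file before handing it to a client catches JSON syntax errors and missing `command` keys early.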