What are the requirements for Fetch Guard?

Fetch Guard requires a compatible MCP client such as Claude Desktop, Claude Code, or Cursor. No additional environment variables are needed for basic setup.

Is Fetch Guard free to use?

Yes, Fetch Guard is open source and free to use. You can find the source code on GitHub.

What MCP clients support Fetch Guard?

Fetch Guard works with any MCP-compatible client including Claude Desktop (Anthropic's official desktop app), Claude Code (CLI tool), Cursor, and other editors with MCP support.

How do I configure Fetch Guard?

Configure Fetch Guard by adding it to your MCP client's config file. Ready-to-paste configs for the uvx, pip, from-source, and Docker setups are listed under Configure Your MCP Client.

Fetch Guard MCP Server

Q: What tools does Fetch Guard provide?

fetch: Fetches a URL and returns clean, LLM-ready markdown with metadata and security sanitization.

Q: How do I install Fetch Guard?

Run Fetch Guard without a separate install step via uvx fetch-guard, or install it with pip install fetch-guard.

Fetch URLs and return clean, LLM-ready markdown with prompt injection defense.

browser-automation web-scraping security markdown prompt-injection

Fetch Guard

An MCP server and CLI tool that fetches URLs and returns clean, LLM-ready markdown. A purpose-built extraction pipeline sanitizes HTML, pulls structured metadata, detects prompt injection attempts, and handles the edge cases that break naive fetchers: bot blocks, paywalls, login walls, non-HTML content types, and pages that require JavaScript to render.

The core problem is straightforward: LLMs need web content, but raw HTML is noisy and potentially hostile. Fetched pages can contain hidden text, invisible Unicode, off-screen elements, and outright prompt injection attempts embedded in the content itself. This pipeline strips all of that before the content reaches the model.

Three layers handle the injection defense specifically:

1. Pre-extraction sanitization removes hidden elements (display:none, visibility:hidden, opacity:0, font-size:0, transform:scale(0), clip:rect(0,0,0,0), zero-height overflow containers, and elements with matching foreground and background colors), elements hidden via CSS class/ID rules in <style> tags, off-screen positioned content, aria-hidden elements, <noscript> and <template> tags, and 26 categories of non-printing Unicode characters, including bidi isolates and Unicode Tags. This happens before content extraction, so trafilatura never sees the attack vectors.
2. Pattern scanning runs a four-phase scan against the extracted text and metadata fields. Phase one applies 50 compiled regex patterns covering system prompt overrides, ignore-previous instructions, role injection, fake conversation tags, and hidden instruction markers, in English, Spanish, French, German, Japanese, Simplified Chinese, and Portuguese. Phase two normalizes the text via NFKC and confusable-character mapping, then rescans to catch homoglyph bypasses (for example, Cyrillic or mathematical Unicode characters substituted for Latin). Phase three finds base64, hex-encoded, and URL percent-encoded blocks, decodes them, and scans the result against high-severity patterns. Phase four decodes the full document with ROT13 and scans it against high-severity patterns. Metadata fields (title, description, og:title, etc.) are scanned independently, with matches namespaced to their source field.
3. Session-salted output wrapping generates a random 8-character hex salt per invocation and wraps the body in <fetch-content-{salt}> tags. Since the salt is unpredictable, injected content cannot spoof the wrapper boundaries.
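
The scanning phases and the salted wrapper described above can be sketched roughly as follows. This is a minimal illustration and not Fetch Guard's actual code: the single regex stands in for the 50 compiled patterns, the three-entry confusables map and the base64 block heuristic are simplifications, hex and percent-encoded decoding are omitted, and the names `scan` and `wrap` are hypothetical.

```python
import base64
import codecs
import re
import secrets
import unicodedata

# Hypothetical stand-in for the project's compiled pattern set.
PATTERN = re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE)

# Tiny illustrative confusables map (Cyrillic -> Latin); a real table is far larger.
CONFUSABLES = str.maketrans({"\u043e": "o", "\u0435": "e", "\u0430": "a"})

# Heuristic for base64-looking runs (assumption: 16+ alphabet chars, optional padding).
BASE64_BLOCK = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def scan(text: str) -> list[str]:
    """Return the names of the phases in which the pattern matched."""
    hits = []

    # Phase 1: scan the raw text.
    if PATTERN.search(text):
        hits.append("raw")

    # Phase 2: NFKC normalization plus confusable mapping, then rescan.
    normalized = unicodedata.normalize("NFKC", text).translate(CONFUSABLES)
    if normalized != text and PATTERN.search(normalized):
        hits.append("normalized")

    # Phase 3: decode base64-looking blocks and scan the decoded text.
    for block in BASE64_BLOCK.findall(text):
        try:
            decoded = base64.b64decode(block, validate=True).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue  # not valid base64 or not valid text; skip it
        if PATTERN.search(decoded):
            hits.append("base64")
            break

    # Phase 4: ROT13-decode the whole document and scan.
    if PATTERN.search(codecs.decode(text, "rot13")):
        hits.append("rot13")

    return hits


def wrap(body: str) -> str:
    """Wrap the body in tags carrying a fresh 8-character hex salt."""
    salt = secrets.token_hex(4)  # 4 random bytes -> 8 hex characters
    return f"<fetch-content-{salt}>\n{body}\n</fetch-content-{salt}>"
```

Because the salt comes from `secrets` and changes on every call, text inside the fetched page cannot know in advance which closing tag would end the wrapper.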

One Tool

This is a single-tool MCP server. It exposes one tool — fetch — that runs a full extraction pipeline behind a consistent interface. No tool selection, no routing, no multi-step workflows. One URL in, one structured result out, configurable via parameters.

Quick Start

Prerequisites

Python 3.10+
pip

Install

pip install fetch-guard

For JavaScript rendering (optional):

pip install 'fetch-guard[js]' && playwright install chromium

Configure Your MCP Client

Add the following to your MCP client config. Works with Claude Code, Claude Desktop, Cursor, or any MCP-compatible client.

Via uvx (recommended):

{
  "mcpServers": {
    "fetch-guard": {
      "command": "uvx",
      "args": ["fetch-guard"]
    }
  }
}

Via pip install:

{
  "mcpServers": {
    "fetch-guard": {
      "command": "fetch-guard"
    }
  }
}

From source:

{
  "mcpServers": {
    "fetch-guard": {
      "command": "python",
      "args": ["-m", "fetch_guard.server"]
    }
  }
}

Via Docker:

{
  "mcpServers": {
    "fetch-guard": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "sterlsnyc/fetch-guard"]
    }
  }
}

Note: The Docker image does not include Playwright. JavaScript rendering (js: true) is not available when running via Docker. Use the uvx or pip install if you need JS rendering.
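
If you would rather script the change than paste JSON by hand, the uvx entry shown above can be merged into an existing config file along these lines. This is a sketch under assumptions: the helper name `add_fetch_guard` is hypothetical, and the config file location depends on your MCP client, so the path is passed in rather than guessed.

```python
import json
from pathlib import Path


def add_fetch_guard(config_path: Path) -> dict:
    """Merge the uvx-based fetch-guard entry into an MCP client config file.

    Existing servers in the file are preserved; only the "fetch-guard"
    key under "mcpServers" is added or overwritten.
    """
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["fetch-guard"] = {"command": "uvx", "args": ["fetch-guard"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Call it with the path to your client's config file, then restart the client so it picks up the new server.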

Verify

Ask your AI assistant to fetch any URL. If it returns structured content with a status header, metadata, and risk assessment, you're connected.

CLI

fetch-guard-cli <url> [options]
# or: python -m fetch_guard.cli <url> [options]

| Flag | Default | Description |
| --- | --- | --- |
| `--timeout N` | 180 | Request timeout in seconds |
| `--max-words N` | none | Word cap on extracted body content. Also disables th |
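
To drive the CLI from a script, the two flags listed above can be assembled into a command line as in this sketch. The helper name `build_command` is hypothetical, and only `--timeout` and `--max-words` are used because those are the flags documented here; the output format of the CLI is not assumed.

```python
import shutil
import subprocess
from typing import Optional


def build_command(url: str, timeout: int = 180, max_words: Optional[int] = None) -> list:
    """Build the argument list for fetch-guard-cli: URL first, then options."""
    cmd = ["fetch-guard-cli", url, "--timeout", str(timeout)]
    if max_words is not None:
        cmd += ["--max-words", str(max_words)]
    return cmd


if __name__ == "__main__":
    cmd = build_command("https://example.com", timeout=60, max_words=500)
    if shutil.which(cmd[0]) is None:
        # The CLI is only present after installing the package.
        print("fetch-guard-cli not found on PATH")
    else:
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout)
```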

Tools

fetch: Fetches a URL and returns clean, LLM-ready markdown with metadata and security sanitization.

Try it

→Fetch the content of https://example.com and summarize it for me.

→Can you scrape the article at https://news.ycombinator.com/item?id=12345 and extract the main points?

→Fetch the documentation page at https://docs.example.com/api using JavaScript rendering to ensure all content is captured.

→Retrieve the content from this URL and ensure it is sanitized for prompt injection risks.

Frequently Asked Questions

What are the key features of Fetch Guard?

Pre-extraction sanitization of hidden elements and non-printing Unicode characters
Multi-phase pattern scanning to detect prompt injection attempts
Session-salted output wrapping to prevent boundary spoofing
Support for JavaScript rendering via Playwright
Configurable timeouts and word limits for extracted content

What can I use Fetch Guard for?

Ingesting web content into LLMs with a reduced risk of prompt injection
Converting complex or noisy web pages into clean, structured markdown
Automating research tasks by fetching and summarizing multiple URLs
Extracting metadata and content from sites that require JavaScript to render
