PDF MCP Server

Add it to Claude Code

Run this in a terminal.

claude mcp add pdf-mcp -- uvx --from git+https://github.com/I-CAN-hack/pdf-mcp.git pdf-mcp

pdf-mcp

An MCP server for reading, rendering, and searching PDF files. Built with PyMuPDF and PyMuPDF4LLM.

Designed for use with LLMs that need to read datasheets and other PDFs containing diagrams, tables, and technical content.

Tools

| Tool | Description |
| --- | --- |
| get_pdf_info | Get metadata about a PDF (page count, author, title, etc.) |
| get_table_of_contents | Get the outline/bookmarks with page numbers for each section |
| get_page_text | Extract text from a page range in json (default), text, markdown, or html format. Optionally exclude headers/footers |
| get_page_image | Render a single page as a PNG image, returned as base64 or written to a temp file. Configurable DPI (default 150) |
| search_text | Case-insensitive text search across the entire PDF, returning page numbers and surrounding context |
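
To make the search_text behaviour concrete, here is a minimal sketch of a case-insensitive search that returns page numbers with surrounding context. It is an illustration only, not the server's implementation: it works on a plain list of page strings, and the function name `search_pages` and the `context` parameter are hypothetical.

```python
import re

def search_pages(pages, query, context=20):
    """Return (page_number, snippet) pairs for each case-insensitive match.

    `pages` is a list of page texts; page numbers are reported 1-based.
    """
    if not query:
        return []
    # re.escape treats the query as literal text, not as a regex.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits = []
    for page_number, text in enumerate(pages, start=1):
        for match in pattern.finditer(text):
            # Keep up to `context` characters on each side of the match.
            lo = max(0, match.start() - context)
            hi = min(len(text), match.end() + context)
            hits.append((page_number, text[lo:hi]))
    return hits

pages = ["Intro text", "See Error Code 404 for details", "error code 404 again"]
for page_number, snippet in search_pages(pages, "error code 404", context=4):
    print(page_number, repr(snippet))
```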

All requests are stateless and take the PDF filename as a parameter.
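
Because the server is stateless, every tool call has to name the PDF again. The sketch below builds the MCP `tools/call` JSON-RPC requests a client would send. The request envelope follows the MCP protocol, but the argument names `filename` and `query` are assumptions; check the server's tool schema for the real ones.

```python
import json

def build_tool_call(request_id, tool, filename, **extra):
    """Build an MCP `tools/call` JSON-RPC request as a dict."""
    # "filename" is an assumed argument name, not confirmed by the README.
    arguments = {"filename": filename, **extra}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# Each request repeats the filename; nothing carries over from the previous call.
info_request = build_tool_call(1, "get_pdf_info", "manual.pdf")
# "query" is likewise an assumed argument name.
search_request = build_tool_call(2, "search_text", "manual.pdf", query="error code 404")
print(json.dumps(search_request, indent=2))
```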

Setup

Add the following to your .mcp.json:

{
  "mcpServers": {
    "pdf-mcp": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/I-CAN-hack/pdf-mcp.git", "pdf-mcp"]
    }
  }
}
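
If your .mcp.json already lists other servers, the entry above needs to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the helper name `add_pdf_mcp` is hypothetical, and it assumes the file, if present, holds a JSON object.

```python
import json
from pathlib import Path

# The same server entry shown in the Setup snippet.
PDF_MCP_ENTRY = {
    "command": "uvx",
    "args": ["--from", "git+https://github.com/I-CAN-hack/pdf-mcp.git", "pdf-mcp"],
}

def add_pdf_mcp(config_path):
    """Add or update the pdf-mcp entry, keeping any other configured servers."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # Treat a missing or empty file as an empty config.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["pdf-mcp"] = PDF_MCP_ENTRY
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_pdf_mcp(".mcp.json")` from the project root creates the file if it does not exist.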

Or, for Codex, run:

codex mcp add pdf-mcp -- uvx --from git+https://github.com/I-CAN-hack/pdf-mcp.git pdf-mcp

This will automatically install and run the server using uvx.

Development

# Install dependencies
uv sync

# Generate test PDFs
uv run python assets/generate.py

# Run tests
uv run pytest tests/ -v

# Run the server locally
uv run pdf-mcp

Configuration

For Claude Desktop, add the same entry to claude_desktop_config.json:

{
  "mcpServers": {
    "pdf-mcp": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/I-CAN-hack/pdf-mcp.git", "pdf-mcp"]
    }
  }
}

Try it

Extract the text from pages 5 to 10 of the document 'manual.pdf' in markdown format.
Search for the term 'error code 404' in 'datasheet.pdf' and tell me which page it appears on.
Get the table of contents for 'report.pdf' to understand the document structure.
Render page 1 of 'diagram.pdf' as an image so I can analyze the technical drawing.
What is the author and total page count of 'research_paper.pdf'?

Frequently Asked Questions

What are the key features of PDF MCP?

Extract text in multiple formats: JSON, text, markdown, and HTML.
Render PDF pages as PNG images with configurable DPI.
Perform case-insensitive full-text search across PDF documents.
Retrieve document metadata and the table of contents.
Optionally exclude headers and footers during text extraction.
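
When a page is returned as base64, a script consuming the result has to decode it before the image can be viewed or saved. The sketch below assumes you have already pulled the base64 string out of the tool response (the exact response shape is not described here); the helper name `save_base64_png` is hypothetical.

```python
import base64
from pathlib import Path

# Every valid PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def save_base64_png(b64_data, out_path):
    """Decode base64 PNG data, write it to out_path, and return the byte count."""
    raw = base64.b64decode(b64_data)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    Path(out_path).write_bytes(raw)
    return len(raw)
```

The signature check is a cheap way to catch a wrong field or a truncated payload before writing a file that no viewer can open.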

What can I use PDF MCP for?

Analyzing technical datasheets by extracting diagrams and tables for LLM interpretation.
Automating the extraction of specific sections from long PDF reports.
Searching through large documentation sets to find specific technical references.
Converting PDF pages into images for visual inspection by vision-capable models.
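
When rendering pages for a vision-capable model, the DPI setting determines the image size. PDF page dimensions are measured in points (72 per inch), so the pixel size scales by DPI / 72. The sketch below estimates it; the function name `rendered_size` is hypothetical, and the renderer's own rounding may differ by a pixel.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch

def rendered_size(width_pt, height_pt, dpi=150):
    """Estimate the pixel dimensions of a page rendered at the given DPI."""
    return (
        round(width_pt * dpi / POINTS_PER_INCH),
        round(height_pt * dpi / POINTS_PER_INCH),
    )

# US Letter is 612 x 792 points (8.5 x 11 inches).
print(rendered_size(612, 792))       # default 150 DPI -> (1275, 1650)
print(rendered_size(612, 792, 300))  # 300 DPI -> (2550, 3300)
```

Doubling the DPI quadruples the pixel count, so raise it only when small text or fine diagram detail is unreadable at the default.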

How do I install PDF MCP?

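There is no separate install step: uvx fetches and runs the server automatically. Register the following command with your MCP client as shown under Setup, or run it directly:

uvx --from git+https://github.com/I-CAN-hack/pdf-mcp.git pdf-mcp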

What MCP clients work with PDF MCP?

PDF MCP works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
