MCP PDF Server


Add it to Claude Code

Run this in a terminal.

claude mcp add mcp-pdf-reader -- uvx mcp-pdf-reader

Extract text, perform OCR, and retrieve images from PDF files.

📄 MCP PDF Server

A PDF reading server built on FastMCP.

Supports PDF text extraction, OCR, and image extraction over MCP, with a built-in web debugger for easy testing.


🚀 Features

  • read_pdf_text
    Extracts normal text from a PDF (page by page).

  • read_by_ocr
    Uses OCR to recognize text from scanned or image-based PDFs.

  • read_pdf_images
    Extracts all images from a specified PDF page (Base64 encoded output).
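Because read_pdf_images returns Base64 encoded data, a client has to decode it before the image can be saved or displayed. The following is a minimal sketch of that step. The helper name save_base64_image is hypothetical, and it assumes each image arrives as a plain Base64 string with no data-URI prefix or embedded line breaks; the exact response shape is not documented here.

```python
import base64
from pathlib import Path


def save_base64_image(encoded: str, out_path: str) -> int:
    """Decode a Base64 string and write the raw bytes to out_path.

    Hypothetical helper: assumes `encoded` is plain Base64 text.
    Returns the number of bytes written.
    """
    # validate=True raises binascii.Error on characters outside the
    # Base64 alphabet instead of silently discarding them.
    raw = base64.b64decode(encoded, validate=True)
    Path(out_path).write_bytes(raw)
    return len(raw)
```

If the server turns out to wrap each image in a larger structure, extract the Base64 field first and pass only that string to the helper.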


📂 Project Structure

mcp-pdf-server/
├── pdf_server.py         # Main server entry point
└── README.md             # Project documentation

⚙️ Installation

Recommended Python version: 3.9+

pip install pymupdf mcp

Note: To use OCR features, you may need a MuPDF build with OCR support or external OCR libraries.

🤖 Configuration

{
  "mcpServers": {
    "pdf-reader": {
      "command": "uvx",
      "timeout": 60000,
      "args": [
        "mcp-pdf-reader"
      ]
    }
  }
}
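If a client configuration file already lists other servers, the pdf-reader entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the function name add_pdf_reader and the "other-server" entry are illustrative assumptions, while the pdf-reader values are copied from the configuration above.

```python
import json


def add_pdf_reader(config: dict) -> dict:
    """Return a copy of `config` with the pdf-reader entry added.

    Hypothetical helper: existing servers are kept, and the input
    dictionary is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["pdf-reader"] = {
        "command": "uvx",
        "timeout": 60000,
        "args": ["mcp-pdf-reader"],
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Illustrative existing configuration with one unrelated server.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_pdf_reader(existing), indent=2))
```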

🔦 Start the Server

Run the following command:

python pdf_server.py

You should see logs like:

INFO:mcp-pdf-server:Starting MCP PDF Server...

🛠️ API Tool List

Tool             Description                           Input Parameters                                 Returns
read_pdf_text    Extracts normal text from PDF pages   file_path, start_page, end_page                  List of page texts
read_by_ocr      Recognizes text via OCR               file_path, start_page, end_page, language, dpi   OCR extracted text
read_pdf_images  Extracts images from a PDF page       file_path, page_number                           List of images (Base64 encoded)

📝 Example Usage

Extract text from pages 1 to 5:

mcp run read_pdf_text --args '{"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 5}'

Perform OCR recognition on page 1:

mcp run read_by_ocr --args '{"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 1, "language": "eng"}'
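The read_by_ocr tool also accepts a dpi parameter. PDF page sizes are measured in points (72 per inch), so a higher dpi produces a larger raster and uses more memory. The sketch below estimates that cost. It assumes the page is rendered to an uncompressed 8-bit RGB image before OCR, which is typical but not confirmed for this server, and the function names are hypothetical.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch


def raster_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page rendered at the given dpi."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


def rgb_bytes(width_pt: float, height_pt: float, dpi: int) -> int:
    """Approximate size of an uncompressed 8-bit RGB raster (assumed format)."""
    width_px, height_px = raster_size(width_pt, height_pt, dpi)
    return width_px * height_px * 3


if __name__ == "__main__":
    # An A4 page is roughly 595 x 842 points.
    print(raster_size(595, 842, 300))  # (2479, 3508)
    print(rgb_bytes(595, 842, 300))    # 26088996, about 26 MB per page
```

Under these assumptions, halving the dpi cuts the per-page memory to roughly a quarter, which can help with large scanned documents.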

Extract all images from page 3:

mcp run read_pdf_images --args '{"file_path": "pdf_resources/example.pdf", "page_number": 3}'
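Under the hood, MCP clients invoke tools by sending JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such a request for the first example above. The helper name build_tool_call is hypothetical, and a real client also completes the MCP initialization handshake before sending tool calls.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Arguments taken from the read_pdf_text example above.
    print(build_tool_call(
        1,
        "read_pdf_text",
        {"file_path": "pdf_resources/example.pdf", "start_page": 1, "end_page": 5},
    ))
```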

📢 Notes

  • Place files in the pdf_resources/ directory, or provide an absolute path.
  • OCR functionality requires appropriate OCR support in the environment.
  • When processing large files, make sure enough memory is available and raise timeouts as needed, such as the timeout value in the client configuration.
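The path rule in the first note can be applied on the client side before calling a tool. This is a sketch under assumptions: the helper name resolve_pdf_path is hypothetical, and it assumes relative paths are interpreted relative to the directory that contains pdf_resources/, as the examples above suggest. How the server itself resolves paths is not documented here.

```python
from pathlib import Path

RESOURCE_DIR = Path("pdf_resources")  # directory named in the notes above


def resolve_pdf_path(file_path: str, base_dir: Path = RESOURCE_DIR) -> Path:
    """Return an absolute path unchanged; otherwise place it under base_dir.

    Hypothetical client-side helper, not part of the server.
    """
    path = Path(file_path)
    if path.is_absolute():
        return path
    # Avoid doubling the prefix when the caller already included it.
    if path.parts[:1] == (base_dir.name,):
        return path
    return base_dir / path
```

With this helper, both "example.pdf" and "pdf_resources/example.pdf" resolve to the same relative path, while absolute paths pass through untouched.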

📜 License

This project is licensed under the MIT License.
For commercial use, please credit the original source.


Tools (3)

  • read_pdf_text: Extracts normal text from PDF pages.
  • read_by_ocr: Recognizes text via OCR for scanned or image-based PDFs.
  • read_pdf_images: Extracts all images from a specified PDF page.

Configuration

claude_desktop_config.json
{"mcpServers": {"pdf-reader": {"command": "uvx", "timeout": 60000, "args": ["mcp-pdf-reader"]}}}

Try it

Extract the text from pages 1 to 5 of the document located at pdf_resources/report.pdf.
Perform OCR on page 1 of the scanned document at pdf_resources/invoice.pdf to get the text content.
Extract all images from page 3 of the PDF file located at pdf_resources/presentation.pdf.

Frequently Asked Questions

What are the key features of MCP PDF Server?

  • Page-by-page text extraction from standard PDF files.
  • OCR recognition for scanned or image-based PDF documents.
  • Base64 encoded image extraction from specific PDF pages.
  • Built-in web debugger for testing server functionality.

What can I use MCP PDF Server for?

  • Automating data extraction from large batches of PDF reports.
  • Converting scanned paper documents into machine-readable text using OCR.
  • Extracting visual assets from PDF presentations for use in other documents.
  • Integrating PDF content analysis directly into AI-assisted research workflows.

How do I install MCP PDF Server?

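Install the dependencies by running pip install pymupdf mcp, or let your MCP client launch the published package with uvx mcp-pdf-reader as shown in the configuration.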

What MCP clients work with MCP PDF Server?

MCP PDF Server works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
