PDF2ZH MCP Server

Translate scientific PDF documents while preserving formulas, charts, and layout.

README.md

PDF2ZH — FeatherFlow MCP Server for PDF Translation

A slim MCP (Model Context Protocol) tool server that translates scientific PDF documents while preserving formulas, charts, table of contents, and layout. Designed to be launched and managed by FeatherFlow.

Based on PDFMathTranslate, stripped down to a single OpenAI-compatible translation backend — reusing the same LLM that FeatherFlow is already connected to.

Features

  • MCP stdio transport — plug-and-play with FeatherFlow (or any MCP-compatible host)
  • Preserves formulas & layout — powered by ONNX-based document layout analysis + pdfminer/pymupdf
  • Dual output — generates both mono (translated-only) and dual (bilingual side-by-side) PDFs
  • Shares FeatherFlow's LLM — OpenAI-compatible endpoint via environment variables, no extra API key needed
  • Cross-platform — works on Linux and Windows; all paths use pathlib for portability

MCP Tools

Tool Description
translate_pdf Translate a PDF file. Accepts file, lang_in, lang_out, optional output_dir. Returns absolute paths to mono & dual PDFs.
list_supported_languages List all supported language codes (en, zh, ja, ko, fr, de, etc.)

File Path Resolution

The file parameter of translate_pdf supports both absolute and relative paths:

  • Absolute path — used as-is (e.g. /home/user/.featherflow/workspace/paper.pdf)
  • Relative path — resolved against the workspace directory (defaults to ~/.featherflow/workspace, overridable via the WORKSPACE_DIR environment variable)

Output PDFs are written to the workspace directory by default. The returned paths are always absolute, making them directly usable by other MCP tools (e.g. feishu-mcp upload_file / upload_file_and_share).

Requirements

⚠️ Python Environment Isolation — Important

This project depends on babeldoc / onnxruntime, which require Python ≥3.10, <3.13. FeatherFlow itself may run on a different Python version (e.g. 3.13+). You must create a separate Python environment for this project and point FeatherFlow's MCP config to this project's Python executable — not FeatherFlow's own Python.

  • Python 3.10 – 3.12 (recommended: 3.12)
  • uv (recommended), Conda, or virtualenv for environment isolation

Installation

0. Install uv (one-time setup, recommended)

uv can automatically download and manage any Python version — no need to install Python 3.12 manually.

# Linux / macOS
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows PowerShell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Restart your terminal after installation, then verify:

uv --version

1. Create a dedicated Python 3.12 environment

Using uv (recommended — auto-downloads Python 3.12 even if you only have 3.13+):

cd /path/to/pdftranslate-mcp
uv venv .venv --python 3.12

Activate the environment:

# Linux / macOS
source .venv/bin/activate

# Windows PowerShell
.venv\Scripts\Activate.ps1

# Windows CMD
.venv\Scripts\activate.bat

Alternative: Conda

conda create -p /path/to/pdftranslate-mcp/.venv python=3.12 -y
conda activate /path/to/pdftranslate-mcp/.venv

Alternative: venv (only if system Python is already 3.10–3.12)

python3.12 -m venv .venv
source .venv/bin/activate        # Linux/macOS
# .venv\Scripts\Activate.ps1     # Windows PowerShell

2. Install the package

pip install -e .

Or with uv (10-100x faster):

uv pip install -e .

This installs all dependencies: pymupdf, pdfminer-six, babeldoc, onnxruntime, openai, mcp, etc.

3. Verify

python -m pdf2zh.mcp_server --help

FeatherFlow Configuration

Edit ~/.featherflow/config.json (or the config file for your setup). Add pdf2zh under tools.mcpServers.

Key point: The command must point to this project's own Python executable, not FeatherFlow's Python. This project requires Python <3.13, while FeatherFlow may run on a newer version.

Example (Linux — production server)

{
  "tools": {
    "mcpServers": {
      "pdf2zh": {
        "command": "/opt/PDFMathTranslate/.venv/bin/python",
        "args": ["-m", "pdf2zh.mcp_server"],
        "toolTimeout": 600,
        "env": {
          "OPENAI_BASE_URL": "https://openrouter.ai/api/v1",
          "OPENAI_API_KEY": "sk-or-v1-xxxxxxxx",
          "OPENAI_MODEL": "anthropic/claude-opus-4-5"
        }
      }
    }
  }
}

Example (Windows — development)

{
  "tools": {
    "mcpServers": {
      "pdf2zh": {
        "command": "C:/Users/<you>/code/PDFMathTranslate/.venv/python.exe",
        "args": ["-m", "pdf2zh.mcp_server"],
        "toolTimeo

Tools 2

translate_pdfTranslate a PDF file while preserving formulas and layout.
list_supported_languagesList all supported language codes.

Environment Variables

OPENAI_BASE_URLrequiredThe base URL for the OpenAI-compatible translation backend.
OPENAI_API_KEYrequiredThe API key for the translation backend.
OPENAI_MODELrequiredThe specific model to use for translation.
WORKSPACE_DIRThe directory used for resolving relative file paths.

Try it

Translate the paper 'quantum_physics.pdf' from English to Chinese.
What are the supported language codes for document translation?
Translate 'research_report.pdf' from Japanese to English and save it in my workspace.
Can you translate this PDF into a bilingual side-by-side format?

Frequently Asked Questions

What are the key features of PDF2ZH?

Preserves complex formulas, charts, and document layout during translation. Generates both mono (translated-only) and dual (bilingual side-by-side) PDF outputs. Uses OpenAI-compatible LLM backends for high-quality translation. Supports absolute and relative file path resolution via workspace configuration. Cross-platform compatibility for Linux and Windows.

What can I use PDF2ZH for?

Translating academic research papers while maintaining mathematical notation integrity. Creating bilingual reference documents for international collaboration. Automating the translation of technical manuals and reports within a workflow. Processing scientific documents for non-native speakers without losing visual formatting.

How do I install PDF2ZH?

Install PDF2ZH by running: pip install -e .

What MCP clients work with PDF2ZH?

PDF2ZH works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.

Turn this server into reusable context

Keep PDF2ZH docs, env vars, and workflow notes in Conare so your agent carries them across sessions.

Open Conare