Scientific Paper Harvester MCP Server

Add it to Claude Code

Run this in a terminal.

claude mcp add scientific-papers -- npx -y @futurelab-studio/latest-science-mcp@latest

Real-time access to over 200 million scientific papers from 6 academic sources.

A comprehensive Model Context Protocol (MCP) server that provides LLMs with real-time access to scientific papers from 6 major academic sources: arXiv, OpenAlex, PMC (PubMed Central), Europe PMC, bioRxiv/medRxiv, and CORE.

🚀 Features

**Comprehensive Source Coverage**

  • arXiv: Computer science, physics, mathematics preprints and papers
  • OpenAlex: Open catalog of scholarly papers with citation data
  • PMC: PubMed Central biomedical and life science literature
  • Europe PMC: European life science literature database
  • bioRxiv/medRxiv: Biology and medical preprint servers
  • CORE: World's largest collection of open access research papers

**Advanced Capabilities**

  • Paper Fetching: Get latest papers from any source by category/concept
  • Paper Search: Search papers by title, abstract, author, or full-text across 4 major sources
  • Full-Text Extraction: Extract complete text content with intelligent fallback strategies
  • Citation Analysis: Find top cited papers from OpenAlex since a specific date
  • Paper Lookup: Retrieve full metadata for specific papers by ID
  • Category Discovery: Browse available categories from all sources
  • Smart Rate Limiting: Respectful API usage with per-source rate limiting
  • DOI Resolution: Advanced DOI resolver with Unpaywall → Crossref → Semantic Scholar fallback
  • Dual Interface: Both MCP protocol and CLI access
  • TypeScript: Full type safety with ESM modules
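Per-source rate limiting, as listed above, can be pictured as one limiter that spaces out requests to each source by a minimum interval. The sketch below is an illustration only: the class name, the interval values, and the injected clock are assumptions, not the server's actual implementation.

```typescript
// Hypothetical per-source rate limiter: names and intervals are illustrative only.
type Source = "arxiv" | "openalex" | "pmc" | "europepmc" | "biorxiv" | "core";

class PerSourceRateLimiter {
  // Earliest time (ms) at which the next request to each source may start.
  private nextAllowed = new Map<Source, number>();
  private minIntervalMs: Record<Source, number>;
  private now: () => number;

  constructor(minIntervalMs: Record<Source, number>, now: () => number = () => Date.now()) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Reserve the next slot for `source` and return how long the caller should wait (ms).
  reserve(source: Source): number {
    const current = this.now();
    const startAt = Math.max(current, this.nextAllowed.get(source) ?? 0);
    this.nextAllowed.set(source, startAt + this.minIntervalMs[source]);
    return startAt - current;
  }
}

// Assumed intervals; real limits depend on each provider's usage policy.
const limiter = new PerSourceRateLimiter(
  { arxiv: 3000, openalex: 100, pmc: 350, europepmc: 350, biorxiv: 1000, core: 1000 },
  () => 0, // fixed clock so the example is deterministic
);
const waits = [limiter.reserve("arxiv"), limiter.reserve("arxiv"), limiter.reserve("openalex")];
console.log(waits); // waits: 0, 3000, 0
```

Because each source keeps its own schedule, a slow source such as the assumed 3-second arXiv interval does not delay requests to the others.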

📊 Coverage Statistics

  • Total Sources: 6 academic databases
  • Category Coverage: 100+ categories across all disciplines
  • Paper Access: 200M+ papers with intelligent text extraction
  • Text Extraction Success: >90% for supported paper types
  • Response Time: <15 seconds average for paper fetching

🛠 Installation

npm install
npm run build

📋 MCP Client Configuration

To use this server with an MCP client (like Claude Desktop), add the following to your MCP client configuration:

For the published package (available on npm):

Option 1: Using npx (recommended for AI tools like Claude)

{
  "mcpServers": {
    "scientific-papers": {
      "command": "npx",
      "args": [
        "-y",
        "@futurelab-studio/latest-science-mcp@latest"
      ]
    }
  }
}

Option 2: Global installation

npm install -g @futurelab-studio/latest-science-mcp

Then configure:

{
  "mcpServers": {
    "scientific-papers": {
      "command": "latest-science-mcp"
    }
  }
}
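When a client configuration already lists other servers, the new entry has to be merged into the existing `mcpServers` object rather than replacing the whole file. A minimal sketch of that merge, assuming the configuration has already been parsed into an object; `addServer` and the two interfaces are hypothetical helpers, not part of the package:

```typescript
interface ServerEntry {
  command: string;
  args?: string[];
}

interface McpConfig {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with `entry` registered under `name`.
// Existing servers and unrelated top-level keys are preserved.
function addServer(config: McpConfig, name: string, entry: ServerEntry): McpConfig {
  return { ...config, mcpServers: { ...(config.mcpServers ?? {}), [name]: entry } };
}

// Example: a config that already contains another (made-up) server.
const existing: McpConfig = { mcpServers: { other: { command: "other-server" } } };
const updated = addServer(existing, "scientific-papers", {
  command: "npx",
  args: ["-y", "@futurelab-studio/latest-science-mcp@latest"],
});
console.log(JSON.stringify(updated, null, 2));
```

Returning a copy instead of mutating the input makes it easy to compare the old and new configuration before writing anything to disk.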

📖 Usage

CLI Interface

List Categories

# List arXiv categories
node dist/cli.js list-categories --source=arxiv

# List OpenAlex concepts
node dist/cli.js list-categories --source=openalex

# List PMC biomedical categories
node dist/cli.js list-categories --source=pmc

# List Europe PMC life science categories
node dist/cli.js list-categories --source=europepmc

# List bioRxiv/medRxiv categories (includes both servers)
node dist/cli.js list-categories --source=biorxiv

# List CORE academic categories
node dist/cli.js list-categories --source=core

Fetch Latest Papers

# Get latest AI papers from arXiv
node dist/cli.js fetch-latest --source=arxiv --category=cs.AI --count=10

# Get latest biology papers from bioRxiv
node dist/cli.js fetch-latest --source=biorxiv --category="biorxiv:biology" --count=5

# Get latest immunology papers from PMC
node dist/cli.js fetch-latest --source=pmc --category=immunology --count=3

# Get latest papers from CORE by subject
node dist/cli.js fetch-latest --source=core --category=computer_science --count=5

# Search by concept name (OpenAlex)
node dist/cli.js fetch-latest --source=openalex --category="machine learning" --count=3

Fetch Top Cited Papers

# Get top 20 cited papers in machine learning since 2024
node dist/cli.js fetch-top-cited --concept="machine learning" --since=2024-01-01 --count=20

# Get top cited papers by concept ID
node dist/cli.js fetch-top-cited --concept=C41008148 --since=2023-06-01 --count=10

Search Papers

# Search by keywords across all fields
node dist/cli.js search-papers --source=arxiv --query="machine learning" --count=10

# Search by paper title
node dist/cli.js search-papers --source=openalex --query="neural networks" --field=title --count=5

# Search by author name
node dist/cli.js search-papers --source=europepmc --query="John Smith" --field=author --count=10

# Search full-text content sorted by citations
node dist/cli.js search-papers --source=core --query="climate change" --field=fulltext --sortBy=citations --count=20

Fetch Specific Paper Content

# Get arXiv paper by ID
node dist/cli.js fetch-content --source=arxiv --id=2401.12345

# Get bioRxiv paper by DOI
node dist/cli.js fetch-content --source=biorxiv --id="10.1101/2021.01.01.425001"

# Get PMC paper by ID
node dist/cli.js fetch-content --source=pmc --id=PMC8245678

# Get
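The commands above all follow the same `--flag=value` pattern, so they are straightforward to drive from a script. The sketch below builds the argument list for `node dist/cli.js`; `buildCliArgs` is a hypothetical helper, and it only uses a subcommand and flags that appear in the examples above.

```typescript
type FlagValue = string | number;

// Hypothetical helper: turns a subcommand and its flags into an argv array for `node`.
// Flags are emitted in the order they were given.
function buildCliArgs(subcommand: string, flags: Record<string, FlagValue>): string[] {
  const flagArgs = Object.entries(flags).map(([name, value]) => `--${name}=${value}`);
  return ["dist/cli.js", subcommand, ...flagArgs];
}

const args = buildCliArgs("fetch-latest", { source: "arxiv", category: "cs.AI", count: 10 });
console.log(["node", ...args].join(" "));
// node dist/cli.js fetch-latest --source=arxiv --category=cs.AI --count=10
```

Passing the array to `child_process.spawn("node", args)` instead of a joined string avoids shell quoting for values that contain spaces, such as `machine learning`.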

Tools (5)

  • list-categories: List available categories or concepts from a specific academic source.
  • fetch-latest: Fetch the latest papers from a source by category or concept.
  • fetch-top-cited: Get top cited papers from OpenAlex based on a concept and date.
  • search-papers: Search for papers across sources by query, field, or citation count.
  • fetch-content: Retrieve full metadata and content for a specific paper by ID or DOI.
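MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `fetch-latest`. The argument names (`source`, `category`, `count`) are assumed to mirror the CLI flags and are not confirmed by the tool schemas; `buildToolCall` is a hypothetical helper.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCall(id: number, name: string, toolArgs: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: toolArgs } };
}

// Argument names below are an assumption based on the CLI flags.
const request = buildToolCall(1, "fetch-latest", { source: "arxiv", category: "cs.AI", count: 5 });
console.log(JSON.stringify(request));
```

In practice an MCP client library constructs this message; the exact argument schema for each tool is available from the server's `tools/list` response.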

Configuration

claude_desktop_config.json
{
  "mcpServers": {
    "scientific-papers": {
      "command": "npx",
      "args": [
        "-y",
        "@futurelab-studio/latest-science-mcp@latest"
      ]
    }
  }
}

Try it

Find the 5 most recent papers on machine learning from arXiv.
Search for papers authored by John Smith in the Europe PMC database.
Get the top 10 most cited papers in the field of climate change since 2023.
Retrieve the full content for the paper with DOI 10.1101/2021.01.01.425001.
List all available biomedical categories in PubMed Central.

Frequently Asked Questions

What are the key features of Scientific Paper Harvester?

  • Access to 6 major academic databases, including arXiv, OpenAlex, and PMC.
  • Full-text extraction with intelligent fallback strategies.
  • Advanced citation analysis and top-cited paper discovery.
  • Search capabilities across title, abstract, author, and full-text fields.
  • DOI resolution with multi-source fallback.
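The multi-source DOI fallback can be pictured as trying resolvers in order until one returns a result. The sketch below shows that pattern with stub resolvers standing in for Unpaywall, Crossref, and Semantic Scholar. The `Resolver` type, the function name, and the returned URL are hypothetical, and no real API is called.

```typescript
type Resolver = { name: string; resolve: (doi: string) => Promise<string | null> };

// Try each resolver in order; return the first non-null result and which resolver produced it.
async function resolveWithFallback(
  doi: string,
  resolvers: Resolver[],
): Promise<{ source: string; url: string } | null> {
  for (const resolver of resolvers) {
    try {
      const url = await resolver.resolve(doi);
      if (url !== null) return { source: resolver.name, url };
    } catch {
      // A failing resolver is skipped so the next one gets a chance.
    }
  }
  return null;
}

// Stub resolvers for illustration only: the first finds nothing,
// the second fails, and the third succeeds.
const stubs: Resolver[] = [
  { name: "unpaywall", resolve: async () => null },
  { name: "crossref", resolve: async () => { throw new Error("simulated outage"); } },
  { name: "semantic-scholar", resolve: async (doi) => `https://example.org/${doi}` },
];

const resultPromise = resolveWithFallback("10.1101/2021.01.01.425001", stubs);
resultPromise.then((result) => console.log(result));
```

Treating both an empty result and a thrown error as "try the next source" keeps a single unavailable service from failing the whole lookup.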

What can I use Scientific Paper Harvester for?

  • Automating literature reviews by fetching the latest preprints in specific research fields.
  • Analyzing citation trends for specific scientific concepts over time.
  • Quickly retrieving full-text content for academic papers during research synthesis.
  • Discovering relevant research papers across multiple databases using a unified interface.

How do I install Scientific Paper Harvester?

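No separate install step is needed when using npx, which fetches the package on demand: npx -y @futurelab-studio/latest-science-mcp@latest. To install it globally instead, run: npm install -g @futurelab-studio/latest-science-mcp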

What MCP clients work with Scientific Paper Harvester?

Scientific Paper Harvester works with any MCP-compatible client including Claude Desktop, Claude Code, Cursor, and other editors with MCP support.
