# Production MCP Web Scraper Server

A modular, production-ready MCP server built with the official MCP Python SDK. Optimized for Render deployment with a clean separation of concerns.
## Project Structure

```
mcp-web-scraper/
├── server.py               # Main server entry point
├── tools/
│   ├── __init__.py         # Tools package initialization
│   ├── search.py           # Search tools (web_search, news_search, etc.)
│   └── scraping.py         # Scraping tools (scrape_html, extract_article, etc.)
├── utils/
│   ├── __init__.py         # Utils package initialization
│   └── helpers.py          # Helper functions (clean_text, validate_url)
├── requirements.txt        # Python dependencies
├── render.yaml             # Render deployment configuration
├── .gitignore              # Git ignore rules
├── README.md               # This file
└── config.example.json     # Claude Desktop config example
```
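The helpers in `utils/helpers.py` are not shown in this README; a minimal sketch of what `clean_text` and `validate_url` (names from the tree above, implementations assumed) might look like:

```python
import re
from urllib.parse import urlparse


def clean_text(text: str) -> str:
    """Collapse runs of whitespace and strip leading/trailing space."""
    return re.sub(r"\s+", " ", text).strip()


def validate_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a network location."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```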
## Features

### Search Tools (`tools/search.py`)

- `web_search` - DuckDuckGo web search
- `news_search` - News articles with metadata
- `search_and_scrape` - Search + content extraction
- `smart_search` - Adaptive search (quick/standard/comprehensive)

### Scraping Tools (`tools/scraping.py`)

- `scrape_html` - HTML scraping with CSS selectors
- `extract_article` - Clean article extraction
- `extract_links` - Link extraction with filtering
- `extract_metadata` - Page metadata & Open Graph
- `scrape_table` - Table data extraction
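To illustrate the kind of filtering `extract_links` performs, here is a hedged sketch (the actual tool's options may differ) that resolves relative hrefs against a base URL and optionally keeps only same-domain links:

```python
from urllib.parse import urljoin, urlparse


def filter_links(base_url: str, hrefs: list[str], same_domain: bool = True) -> list[str]:
    """Resolve relative hrefs and optionally keep only same-domain http(s) links."""
    base_host = urlparse(base_url).netloc
    links = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if same_domain and parsed.netloc != base_host:
            continue
        links.append(absolute)
    return links
```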
## Quick Deploy to Render

### Step 1: Create Project Structure

```bash
mkdir mcp-web-scraper
cd mcp-web-scraper

# Create directory structure
mkdir -p tools utils

# Create all files (copy from artifacts above):
# - server.py
# - tools/__init__.py
# - tools/search.py
# - tools/scraping.py
# - utils/__init__.py
# - utils/helpers.py
# - requirements.txt
# - render.yaml
# - .gitignore
# - README.md
```
### Step 2: Push to GitHub

```bash
git init
git add .
git commit -m "Initial commit: Modular MCP Web Scraper"
git remote add origin https://github.com/YOUR_USERNAME/mcp-web-scraper.git
git push -u origin main
```
### Step 3: Deploy on Render

- Go to [render.com](https://render.com)
- Click "New +" → "Web Service"
- Connect your GitHub repository
- Render auto-detects `render.yaml` - click "Create Web Service"
- Wait 2-3 minutes

### Step 4: Get Your URL

- Your service: `https://your-app.onrender.com`
- MCP endpoint: `https://your-app.onrender.com/mcp`
## Connect to Claude Desktop

### Config Location

- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`

### Configuration

```json
{
  "mcpServers": {
    "web-scraper": {
      "type": "streamable-http",
      "url": "https://your-app.onrender.com/mcp"
    }
  }
}
```

Restart Claude Desktop after updating the config!
## Local Development

```bash
# Clone and setup
git clone https://github.com/YOUR_USERNAME/mcp-web-scraper.git
cd mcp-web-scraper

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run server
python server.py
```

The server runs at `http://localhost:8000/mcp`.
### Test Locally

```bash
# List tools
curl -X POST http://localhost:8000/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

# Test web search
curl -X POST http://localhost:8000/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0",
    "id":2,
    "method":"tools/call",
    "params":{
      "name":"web_search",
      "arguments":{"query":"AI news","max_results":3}
    }
  }'
```
## Adding New Tools

### 1. Search Tool Example

Edit `tools/search.py`:

```python
@mcp.tool()
def my_custom_search(query: str) -> dict:
    """Your custom search tool"""
    # Implementation here
    return {"success": True, "data": []}
```

### 2. Scraping Tool Example

Edit `tools/scraping.py`:

```python
@mcp.tool()
def my_custom_scraper(url: str) -> dict:
    """Your custom scraper"""
    # Implementation here
    return {"success": True, "content": ""}
```
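One way to fill in that scraper skeleton, assuming `requests` and `beautifulsoup4` are already in `requirements.txt` (a sketch, not the project's actual implementation; the error shape mirrors the skeleton's return value):

```python
import requests
from bs4 import BeautifulSoup


def extract_visible_text(html: str) -> str:
    """Strip script/style tags and collapse whitespace in the visible text."""
    soup = BeautifulSoup(html, "html.parser")
    for node in soup(["script", "style"]):
        node.decompose()
    return " ".join(soup.get_text(separator=" ").split())


def my_custom_scraper(url: str) -> dict:
    """Fetch a page and return its visible text, or a structured error."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException as exc:
        return {"success": False, "error": str(exc)}
    return {"success": True, "content": extract_visible_text(response.text)}
```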
### 3. Deploy Changes

```bash
git add .
git commit -m "Add new tools"
git push origin main

# Render auto-deploys!
```
## Monitoring

### View Logs

- Render Dashboard → Your Service
- Click the "Logs" tab
- View real-time logs

### Health Check

```bash
curl https://your-app.onrender.com/health
```
## Architecture Benefits

### Modular Design

- Separation of concerns - each file has one responsibility
- Easy to maintain - find and update code quickly
- Scalable - add new tools without touching existing code

### Clean Code

- Type hints - better IDE support and error catching
- Logging - track all operations
- Error handling - graceful failures with detailed errors

### Production Ready

- Official MCP SDK - FastMCP framework
- Streamable HTTP - single-endpoint communication