MCP Podcast Scraper
An MCP (Model Context Protocol) server that scrapes and transcribes podcast episodes. Designed to work with Claude Code or Claude Desktop - you provide the podcast, the MCP transcribes it, and Claude summarizes it.
What It Does
- Scrapes podcasts from YouTube videos or RSS feeds
- Transcribes audio using Deepgram's fast Nova-2 model
- Organizes files by podcast name and episode date
- Tracks podcasts for new episodes
- Skips duplicates - won't re-scrape already processed episodes
- Finds incomplete work - lists episodes that need summarization
- Custom summary prompts - customize how Claude summarizes for your needs
How It Works
You: "Check for new episodes and summarize them"
↓
Claude: Calls check_new_episodes() → Finds new episodes
↓
Claude: Calls scrape_podcast() → Downloads & transcribes
↓
Claude: Calls get_summary_prompt() → Reads your custom instructions
↓
Claude: Calls get_transcript() → Reads the transcript
↓
Claude: Summarizes following your prompt
↓
Claude: Calls save_summary() → Saves the .md file
↓
Done! transcript.md + summary.md saved
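The flow above can be sketched as a small orchestration function. This is a minimal sketch and not the server's own code: the tool names come from this README, but the `CallTool` type, the argument names (`url`, `episodeId`, `summary`), the result shapes and the `summarize` callback are all assumptions made for illustration.

```typescript
// Hypothetical signature for "ask the MCP client to run a tool".
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Assumed shape of one entry returned by check_new_episodes (illustrative only).
interface NewEpisode {
  id: string;
  url: string;
}

// Runs the documented steps in order for every new episode and
// returns the ids of the episodes that were summarized.
async function summarizeNewEpisodes(
  callTool: CallTool,
  summarize: (prompt: string, transcript: string) => Promise<string>,
): Promise<string[]> {
  const done: string[] = [];
  const episodes = (await callTool("check_new_episodes", {})) as NewEpisode[];
  for (const ep of episodes) {
    // Download and transcribe the episode.
    await callTool("scrape_podcast", { url: ep.url });
    // Read the user's custom instructions, then the transcript.
    const prompt = (await callTool("get_summary_prompt", {})) as string;
    const transcript = (await callTool("get_transcript", { episodeId: ep.id })) as string;
    // The model writes the summary; here that step is a plain callback.
    const summary = await summarize(prompt, transcript);
    await callTool("save_summary", { episodeId: ep.id, summary });
    done.push(ep.id);
  }
  return done;
}
```

In practice Claude performs these calls itself. The sketch only makes the ordering explicit: the transcript is never summarized before the custom prompt has been read.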
Installation Guide
Step 1: Prerequisites
Install required system tools (macOS):
# Install Homebrew if you don't have it
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install yt-dlp (for YouTube) and ffmpeg (for audio)
brew install yt-dlp ffmpeg
Step 2: Clone & Build
# Clone the repository
git clone https://github.com/wkoleilat-happytitan/mcp-podcast-scraper.git
cd mcp-podcast-scraper
# Install dependencies
npm install
# Build
npm run build
Step 3: Get a Deepgram API Key
- Go to https://console.deepgram.com/
- Sign up (free tier includes $200 credit - enough for ~300 hours of audio)
- Create an API key
- Copy the key
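For orientation, this is roughly what a pre-recorded transcription request to Deepgram looks like. The endpoint, the `model=nova-2` query parameter and the `Authorization: Token ...` header follow Deepgram's public REST API. Whether this server calls the REST endpoint directly or goes through an SDK is not stated here, and `buildTranscriptionRequest` is a hypothetical helper name. The sketch also assumes the audio is reachable at a URL; a local file would be sent as raw bytes with an audio content type instead.

```typescript
interface TranscriptionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a request for Deepgram's pre-recorded audio endpoint.
function buildTranscriptionRequest(apiKey: string, audioUrl: string): TranscriptionRequest {
  if (apiKey.trim() === "") {
    throw new Error("Deepgram API key is empty");
  }
  // nova-2 is the model named in this README; smart_format is an optional extra.
  const query = ["model=nova-2", "smart_format=true"].join("&");
  return {
    url: `https://api.deepgram.com/v1/listen?${query}`,
    method: "POST",
    headers: {
      Authorization: `Token ${apiKey}`,
      "Content-Type": "application/json",
    },
    // For remote audio, Deepgram accepts a JSON body with the audio URL.
    body: JSON.stringify({ url: audioUrl }),
  };
}
```

With Node 18 or later the result can be passed to `fetch(req.url, req)`. The transcript text is then found under `results.channels[0].alternatives[0].transcript` in the JSON response.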
Step 4: Configure
Copy the example config file and add your API key:
# Copy the example config
cp config.example.json config.json
# Edit config.json and add your Deepgram API key
Your config.json should look like:
{
  "outputDirectory": "./podcasts",
  "deepgramApiKey": "YOUR_ACTUAL_DEEPGRAM_API_KEY",
  "tempDirectory": "./temp"
}
Important: Never commit config.json to git - it contains your API key! The .gitignore already excludes it.
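A quick way to catch a half-edited config is to validate it before starting the server. The following is a minimal sketch under stated assumptions: the three field names are the ones shown in this README, while `parseConfig`, the error messages and the placeholder check are illustrative and may not match what the server itself does.

```typescript
interface ScraperConfig {
  outputDirectory: string;
  deepgramApiKey: string;
  tempDirectory: string;
}

// Placeholder value as shown in this README; the example file may use a different one.
const PLACEHOLDER_KEY = "YOUR_ACTUAL_DEEPGRAM_API_KEY";

// Parses the text of config.json and checks that every field is a non-empty string.
function parseConfig(jsonText: string): ScraperConfig {
  const raw: unknown = JSON.parse(jsonText);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("config.json must contain a JSON object");
  }
  const obj = raw as Record<string, unknown>;
  const fields = ["outputDirectory", "deepgramApiKey", "tempDirectory"] as const;
  for (const field of fields) {
    const value = obj[field];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`config.json: "${field}" must be a non-empty string`);
    }
  }
  if (obj["deepgramApiKey"] === PLACEHOLDER_KEY) {
    throw new Error("config.json: replace the placeholder Deepgram API key");
  }
  return {
    outputDirectory: obj["outputDirectory"] as string,
    deepgramApiKey: obj["deepgramApiKey"] as string,
    tempDirectory: obj["tempDirectory"] as string,
  };
}
```

Reading the file is left out on purpose; pass in the result of `fs.readFileSync("config.json", "utf8")` or similar.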
Step 5: Add to Claude Code
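Add this to your MCP settings file. Note that ~/.cursor/mcp.json (also reachable via Settings → MCP) is Cursor's configuration path; Claude Code reads the same mcpServers format from a project-level .mcp.json: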
{
  "mcpServers": {
    "podcast-scraper": {
      "command": "node",
      "args": ["/FULL/PATH/TO/mcp-podcast-scraper/dist/index.js"]
    }
  }
}
Important: Replace /FULL/PATH/TO/ with the actual path to your installation.
Step 5 (Alternative): Add to Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "podcast-scraper": {
      "command": "node",
      "args": ["/FULL/PATH/TO/mcp-podcast-scraper/dist/index.js"]
    }
  }
}
Then restart Claude Desktop.
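Both client configurations use the same server entry, and the most common mistake is a relative or mistyped path. As a hedged illustration, the entry can be generated from the install directory. `buildMcpConfig` is a hypothetical helper that is not part of this project, and the absolute-path check is a simple heuristic.

```typescript
// Returns the mcpServers JSON for a given install directory.
// Rejects relative paths, since the client launches the server from its own working directory.
function buildMcpConfig(installDir: string): string {
  const isAbsolute = installDir.startsWith("/") || /^[A-Za-z]:[\\/]/.test(installDir);
  if (!isAbsolute) {
    throw new Error(`Expected an absolute path, got "${installDir}"`);
  }
  // Drop trailing slashes so the joined path has exactly one separator.
  const trimmed = installDir.replace(/[\\/]+$/, "");
  const config = {
    mcpServers: {
      "podcast-scraper": {
        command: "node",
        args: [`${trimmed}/dist/index.js`],
      },
    },
  };
  return JSON.stringify(config, null, 2);
}
```

The returned string can be pasted into the settings file, or merged into an existing `mcpServers` object if other servers are already configured.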
File Structure
mcp-podcast-scraper/
├── config.example.json   # Template - copy to config.json
├── config.json           # Your config (git-ignored, contains API key)
├── tracking.example.json # Example tracking file
├── tracking.json         # Your tracked podcasts (git-ignored)
├── prompts/
│   └── summary-prompt.md # Customize how Claude summarizes (editable)
├── podcasts/             # Your transcripts & summaries (git-ignored)
├── src/                  # Source code
├── dist/                 # Compiled code (git-ignored)
└── node_modules/         # Dependencies (git-ignored)
Usage Examples
Scrape a Specific Episode
"Scrape this YouTube podcast: https://youtube.com/watch?v=..."
"Find and scrape the latest Lex Fridman episode"
Track Podcasts for New Episodes
"Track the Huberman Lab podcast: https://feeds.megaphone.fm/hubermanlab"
"Check my tracked podcasts for new episodes"
"List all podcasts I'm tracking"
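Conceptually, checking a tracked feed for new episodes means comparing the feed's items against what has already been scraped. The sketch below assumes each feed item carries a stable `guid` and an ISO 8601 `published` date. The `FeedEpisode` shape and `findNewEpisodes` are illustrative names, and how the server actually records scraped episodes in tracking.json is not documented here.

```typescript
// Assumed minimal shape of one parsed RSS item (illustrative only).
interface FeedEpisode {
  guid: string;
  title: string;
  published: string; // ISO 8601, e.g. "2024-01-08"
}

// Returns episodes whose guid has not been scraped yet, oldest first.
function findNewEpisodes(
  feed: FeedEpisode[],
  scrapedGuids: ReadonlySet<string>,
): FeedEpisode[] {
  return feed
    .filter((ep) => !scrapedGuids.has(ep.guid)) // filter() copies, so the input stays untouched
    .sort((a, b) => a.published.localeCompare(b.published)); // ISO dates sort lexicographically
}
```

Keying on the guid rather than the title is what makes duplicate skipping robust when a show renames or re-publishes an episode.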
Find Incomplete Work
"Show me episodes that need summaries"
"List incomplete episodes"
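Finding incomplete work comes down to spotting episode folders that have a transcript but no summary. This sketch assumes one directory per episode containing transcript.md and summary.md. The file names appear in this README, but the directory layout and the `listIncomplete` helper are assumptions.

```typescript
// Given relative file paths, returns episode directories that contain
// transcript.md but not summary.md, sorted alphabetically.
function listIncomplete(files: string[]): string[] {
  const transcripts = new Set<string>();
  const summaries = new Set<string>();
  for (const file of files) {
    const slash = file.lastIndexOf("/");
    const dir = slash === -1 ? "" : file.slice(0, slash);
    const base = file.slice(slash + 1);
    if (base === "transcript.md") transcripts.add(dir);
    if (base === "summary.md") summaries.add(dir);
  }
  return Array.from(transcripts)
    .filter((dir) => !summaries.has(dir))
    .sort();
}
```

For example, a folder holding only transcript.md is reported, while a folder holding both files is not.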
MCP Tools Reference
| Tool | Description |
|---|---|
| scrape_podcast | Scrape & transcribe an episode. Returns file path and preview. |
| get_transcript | Read the full transcript of a scraped episode. |
| get_summary_prompt | Get your custom summarization instructions. |
| save_summary | Save your generated summary to a markdown file. |
| check_new_episodes | Check tracked podcasts for new (unscraped) episodes. |
| list_incomplete | Find episodes with transcripts but no summaries. |
| search_podcast | Search YouTube or parse RSS feeds to find episodes. |
| add_tracking | Add a podcast RSS feed to your tracking list. |
| list_tracking | List all podcasts you're tracking. |
| remove_tracking | Remove a podcast from your tracking list. |
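On the wire, an MCP client invokes any of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows that envelope. The tool names are the ones listed in this README, while the `url` argument in the usage note is a hypothetical parameter name, since the tools' input schemas are not documented here.

```typescript
// Tool names as listed in this README.
const TOOL_NAMES = [
  "scrape_podcast",
  "get_transcript",
  "get_summary_prompt",
  "save_summary",
  "check_new_episodes",
  "list_incomplete",
  "search_podcast",
  "add_tracking",
  "list_tracking",
  "remove_tracking",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Builds the JSON-RPC envelope an MCP client sends to run a tool.
function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}
```

A call such as `buildToolCall(1, "scrape_podcast", { url: "https://youtube.com/watch?v=..." })` yields the message body. Claude Code and Claude Desktop construct these requests for you, so the sketch is only useful when debugging the server by hand.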