MCP server for the DeepGHS anime AI ecosystem
deepghs-mcp
A Python MCP server for the DeepGHS anime AI ecosystem. Connect it to any MCP-compatible client (Claude Desktop, Cursor, etc.) to browse datasets, discover pre-built character training sets, look up tags across 18 platforms, and generate complete data pipeline scripts — all directly from your AI assistant.
✨ Features
📦 Dataset & Model Discovery
- Browse all DeepGHS datasets — Danbooru2024 (8M+ images), Sankaku, Gelbooru, Zerochan, BangumiBase, and more
- Full file trees — see exactly which tar/parquet files a dataset contains and how large they are before downloading anything
- Model catalog — find the right model for your task: CCIP, WD Tagger Enhanced, aesthetic scorer, face/head detector, anime classifier, and more
- Live demos — browse DeepGHS Spaces (interactive web apps) for testing models without code
🏷️ Cross-Platform Tag Intelligence
- site_tags lookup — 2.5M+ tags unified across 18 platforms in one query
- Tag format translation: Danbooru uses hatsune_miku, Zerochan uses Hatsune Miku, and Pixiv uses 初音ミク; this tool maps them all together
- Ready-to-use Parquet queries: get copy-paste code to filter the tag database programmatically
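One way to picture the cross-platform mapping is as a lookup table with one column per platform. The sketch below is only an illustration, not the site_tags schema. The `TAG_ALIASES` table and the `translate_tag` helper are hypothetical names, and the only data used is the three example spellings from the bullet above.

```python
from typing import Optional

# Hypothetical in-memory stand-in for a cross-platform tag table.
# The real site_tags dataset may be organised differently.
TAG_ALIASES = [
    {"danbooru": "hatsune_miku", "zerochan": "Hatsune Miku", "pixiv": "初音ミク"},
]


def translate_tag(tag: str, source: str, target: str) -> Optional[str]:
    """Return the target-platform spelling of `tag`, or None if it is unknown."""
    for row in TAG_ALIASES:
        if row.get(source) == tag:
            return row.get(target)
    return None
```

A table is used rather than a string transform because some spellings cannot be derived from each other: underscores can be swapped for spaces, but nothing in hatsune_miku yields the Pixiv form.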
🎯 Character Dataset Finder
- Pre-built LoRA datasets: search both the deepghs and CyberHarem namespaces for existing character image collections
- Ready-to-run download commands: get the exact cheesechaser command to pull what you need
- Smart fallback: if no pre-built dataset exists, the tool hands off directly to the waifuc script generator
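A two-namespace search like the one described above could be approximated by hand against the public HuggingFace Hub dataset listing endpoint. This is a rough sketch under assumptions: the README does not say how the server performs the search, and `search_url` and `find_character_datasets` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

HUB_API = "https://huggingface.co/api/datasets"
NAMESPACES = ("deepghs", "CyberHarem")


def search_url(namespace: str, character: str, limit: int = 10) -> str:
    """Build a Hub listing URL restricted to one namespace and one search term."""
    query = urllib.parse.urlencode(
        {"author": namespace, "search": character, "limit": limit}
    )
    return f"{HUB_API}?{query}"


def find_character_datasets(character: str) -> list:
    """Query each namespace in turn and collect matching dataset ids (needs network)."""
    found = []
    for namespace in NAMESPACES:
        with urllib.request.urlopen(search_url(namespace, character)) as resp:
            found.extend(item["id"] for item in json.load(resp))
    return found
```

Calling `find_character_datasets("rem")` returns repository ids in `namespace/name` form. An empty list corresponds to the fallback case, where a collection script would be generated instead.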
🤖 Training Pipeline Code Generation
- waifuc scripts — generate complete, annotated Python data collection pipelines for any character from any source (Danbooru, Pixiv, Gelbooru, Zerochan, Sankaku, or Auto)
- cheesechaser scripts — generate targeted download scripts to pull specific post IDs from indexed multi-TB datasets without downloading the whole archive
- Format-aware — crop sizes, bucket ranges, and export formats automatically adjusted for SD 1.5, SDXL, or Flux
📦 Installation
Prerequisites
- Python 3.10+
- git
Quick Start
- Clone the repository:
git clone https://github.com/citronlegacy/deepghs-mcp.git
cd deepghs-mcp
- Run the installer:
chmod +x install.sh && ./install.sh
# or without chmod:
bash install.sh
- Or install manually:
pip install -r requirements.txt
🔑 Authentication
HF_TOKEN is optional for public datasets but strongly recommended — it raises HuggingFace's API rate limit and is required for any gated or private repositories.
Get your token at huggingface.co/settings/tokens (read access is sufficient).
Without it, the server still works for all public DeepGHS datasets.
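The optional-token behaviour can be sketched as follows: send a bearer header when HF_TOKEN is set, and fall back to anonymous requests otherwise. The helper name `hf_auth_headers` is hypothetical, and this is not taken from the server's source.

```python
import os


def hf_auth_headers(env=None) -> dict:
    """Return an Authorization header if HF_TOKEN is set, else an empty dict."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    # An empty dict means the request is sent anonymously (public repos only).
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Passing the environment as an argument keeps the function easy to check without touching the real process environment.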
▶️ Running the Server
python deepghs_mcp.py
# or via the venv created by install.sh:
.venv/bin/python deepghs_mcp.py
⚙️ Configuration
Claude Desktop
Add the following to your claude_desktop_config.json:
{
  "mcpServers": {
    "deepghs": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/deepghs_mcp.py"],
      "env": {
        "HF_TOKEN": "hf_your_token_here"
      }
    }
  }
}
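Because the config needs absolute paths, it can help to generate the entry instead of typing it. The sketch below assumes the POSIX venv layout created by install.sh (.venv/bin/python) and that it is pointed at the cloned repository. `build_config` is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path


def build_config(repo_dir: str, hf_token: str = "") -> dict:
    """Build the mcpServers entry with absolute paths derived from repo_dir."""
    repo = Path(repo_dir).resolve()
    server = {
        "command": str(repo / ".venv" / "bin" / "python"),
        "args": [str(repo / "deepghs_mcp.py")],
    }
    if hf_token:
        # Only include the env block when a token is actually supplied.
        server["env"] = {"HF_TOKEN": hf_token}
    return {"mcpServers": {"deepghs": server}}


if __name__ == "__main__":
    print(json.dumps(build_config("."), indent=2))
```

Run it from the repository root and merge the printed JSON into claude_desktop_config.json.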
Other MCP Clients
- Command: /absolute/path/to/.venv/bin/python
- Args: /absolute/path/to/deepghs_mcp.py
- Transport: stdio
💡 Usage Examples
Browse available datasets
"What anime datasets does DeepGHS have on HuggingFace?"
The assistant calls deepghs_list_datasets and returns all datasets sorted by download count — Danbooru2024, Sankaku, Gelbooru WebP, BangumiBase, site_tags, and more — with links and update dates.
Check dataset contents before downloading
"What files are in deepghs/danbooru2024? How big is it?"
The assistant calls deepghs_get_repo_info and returns the full file tree — every .tar and .parquet file with individual and total sizes — so you know exactly what you're committing to before you download.
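To make the size report concrete, here is a small sketch of turning a list of (path, size in bytes) pairs into per-file and total figures. The input shape and the helper names `human_size` and `summarize` are assumptions for illustration. The actual output format of deepghs_get_repo_info is not documented here.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units, capped at TiB."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def summarize(files: list) -> str:
    """Render one line per (path, size) pair plus a final total line."""
    total = sum(size for _, size in files)
    lines = [f"{path}  {human_size(size)}" for path, size in files]
    lines.append(f"total  {human_size(total)}")
    return "\n".join(lines)
```

For example, `summarize([("example.tar", 1024), ("example.parquet", 1024)])` ends with the line `total  2.0 KiB`. The file names in that call are placeholders, not real repository contents.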
Find a pre-built character dataset
"Is there already a dataset for Rem from Re:Zero I can use for LoRA training?"
The assistant calls deepghs_find_character_dataset, searches both deepghs and CyberHarem namespaces, and returns any matches with download counts and a one-liner download command.
Example response:
## Character Dataset Search: Rem
Found 2 dataset(s):
### CyberHarem/rem_
Tools (3)
- deepghs_list_datasets: Browse available DeepGHS datasets on HuggingFace.
- deepghs_get_repo_info: Get file tree and size information for a specific dataset repository.
- deepghs_find_character_dataset: Search for pre-built character LoRA datasets across namespaces.
Environment Variables
- HF_TOKEN: HuggingFace API token for accessing gated repositories and increasing rate limits.