BrowseAI Dev
Research infrastructure for AI agents — real-time web search, evidence extraction, and structured citations. Every claim is backed by a URL. Every answer has a confidence score.
Agent → BrowseAI Dev → Internet → Verified answers + sources
Website · Playground · API Docs · Alternatives · Discord
Package names: npm: `browseai-dev` · PyPI: `browseaidev` · LangChain: `langchain-browseaidev`. Previously published as `browse-ai` and `browseai`; the old names still work and redirect automatically.
How It Works
search → fetch pages → neural rerank → extract claims → verify → cited answer (streamed)
Every answer goes through a multi-step verification pipeline. No hallucination. Every claim is backed by a real source.
Verification & Confidence Scoring
Confidence scores are evidence-based — not LLM self-assessed. After the LLM extracts claims and sources, a post-extraction verification engine checks every claim against the actual source page text:
- Atomic claim decomposition — Compound claims are auto-split into individual verifiable facts. "Tesla had $96B revenue and 1.8M deliveries" becomes two atomic claims, each verified independently.
- Hybrid retrieval (BM25 + dense embeddings) — For each claim, BM25 finds lexical matches and OpenAI `text-embedding-3-small` (via OpenRouter) finds semantic matches in the source text. Rankings are fused using Reciprocal Rank Fusion (RRF), a rank-based method that avoids score-normalization issues. This catches paraphrased evidence that BM25 alone misses (e.g., "prevents fabricated answers" matching "reduces hallucinations"). Premium tier only, with graceful BM25 fallback.
- NLI evidence reranking — The top 3 RRF-fused candidates per claim are reranked by a DeBERTa-v3 NLI model for semantic entailment. Final hybrid score: 30% BM25 + 70% NLI, with contradiction penalties and paraphrase boosts.
- Multi-provider search — Parallel search across multiple providers for broader source diversity. More independent sources = stronger cross-reference = higher confidence.
- Domain authority scoring — 10,000+ domains across 5 tiers (institutional `.gov`/`.edu` → major news → tech journalism → community → low-quality), stored in Supabase with a Majestic Million bulk import. Self-improving via Bayesian cold-start smoothing.
- Source quote verification — LLM-extracted quotes are verified against the actual page text using hybrid matching (exact substring → BM25 fallback).
- Cross-source consensus — Each claim verified against all available page texts. Claims supported by 3+ independent domains get "strong consensus". Single-source claims flagged as "weak".
- Contradiction detection — Claim pairs analyzed for semantic conflicts using topic overlap + NLI contradiction classification. Detected contradictions surfaced in the response and penalize confidence.
- Multi-pass consistency — In thorough mode, claims are cross-checked across independent extraction passes. Claims confirmed by both passes get boosted; inconsistent claims are penalized (SelfCheckGPT-inspired).
- Auto-calibrated confidence — 7-factor confidence formula auto-adjusts from user feedback using isotonic calibration curves. Predicted confidence aligns with actual accuracy over time. Factors: verification rate (25%), domain authority (20%), source count (15%), consensus (15%), domain diversity (10%), claim grounding (10%), citation depth (5%).
- Per-claim evidence retrieval — Weak claims get targeted search queries generated by LLM, then searched individually across all providers. Each claim gets its own evidence pool instead of sharing the same corpus (SAFE-inspired, from Google DeepMind's fact-checking research).
- Counter-query verification — Verified claims are stress-tested with adversarial "what would disprove this?" search queries. If counter-evidence is found, claim confidence is penalized (SANCTUARY-inspired).
- Iterative confidence-gated retrieval — Thorough mode uses a FIRE-inspired loop: verify → if weak claims remain → generate targeted queries → retrieve new evidence → re-verify, repeating until confidence clears the gate.
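The RRF fusion and 30/70 hybrid scoring described above can be sketched as follows. This is an illustrative Python sketch, not the package's internals: `k=60` is the commonly used RRF constant, and all function and variable names are hypothetical.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked candidate lists with Reciprocal Rank Fusion.

    Scores depend only on rank positions, so BM25 scores and embedding
    similarities never need to be normalized against each other.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def hybrid_score(bm25: float, nli_entailment: float,
                 contradiction_penalty: float = 0.0,
                 paraphrase_boost: float = 0.0) -> float:
    """Final per-candidate evidence score: 30% BM25 + 70% NLI,
    clamped to [0, 1] after penalties and boosts."""
    score = 0.3 * bm25 + 0.7 * nli_entailment
    return max(0.0, min(1.0, score - contradiction_penalty + paraphrase_boost))

# Example: fuse a lexical (BM25) ranking with a semantic (embedding) ranking.
bm25_rank = ["doc_a", "doc_b", "doc_c"]
dense_rank = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([bm25_rank, dense_rank])  # doc_b wins: high in both lists
```

Because RRF works purely on ranks, a document that appears in only one list still gets credit, but documents ranked well by both retrievers rise to the top.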
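The cross-source consensus rule (3+ independent domains → "strong consensus", a single source → "weak") reduces to counting distinct supporting domains. A minimal sketch, with an assumed "moderate" label for the two-domain middle case the README does not name:

```python
from urllib.parse import urlparse

def consensus_label(supporting_urls: list[str]) -> str:
    """Label a claim by how many independent domains support it."""
    domains = {urlparse(u).netloc for u in supporting_urls}
    if len(domains) >= 3:
        return "strong consensus"
    if len(domains) == 1:
        return "weak"
    return "moderate"  # two independent domains (assumed intermediate label)

urls = ["https://a.gov/x", "https://b.edu/y", "https://c.com/z"]
label = consensus_label(urls)  # three distinct domains -> "strong consensus"
```

Deduplicating by domain rather than by URL is what makes the sources "independent": ten pages from one site still count as a single voice.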
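The 7-factor confidence formula is a weighted sum using the weights stated above, followed by a calibration step. In this sketch the factors are assumed to be normalized to [0, 1] and the isotonic calibration curve is a pluggable placeholder; names are illustrative.

```python
# Weights exactly as stated in the README; they sum to 1.0.
WEIGHTS = {
    "verification_rate": 0.25,
    "domain_authority":  0.20,
    "source_count":      0.15,
    "consensus":         0.15,
    "domain_diversity":  0.10,
    "claim_grounding":   0.10,
    "citation_depth":    0.05,
}

def raw_confidence(factors: dict[str, float]) -> float:
    """Weighted sum of the 7 factors; missing factors count as 0."""
    return sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)

def calibrated_confidence(factors, calibrate=lambda x: x):
    # `calibrate` stands in for the isotonic-regression curve fit on user
    # feedback; identity here as a placeholder.
    return calibrate(raw_confidence(factors))

perfect = {name: 1.0 for name in WEIGHTS}  # raw_confidence(perfect) == 1.0
```

Isotonic calibration only reshapes the score monotonically, so the factor weighting still determines the ordering of answers; calibration makes a predicted 0.8 actually correspond to ~80% observed accuracy.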
Tools (1)
search — Performs real-time web search with evidence extraction and confidence scoring.
Environment Variables
`OPENROUTER_API_KEY` (required) — API key for accessing the LLM models used in verification and extraction.
Configuration
```json
{
  "mcpServers": {
    "browse-ai": {
      "command": "npx",
      "args": ["-y", "browseai-dev"]
    }
  }
}
```