web-search-plus
Unified search skill with Intelligent Auto-Routing.
Installation
npx clawhub@latest install web-search-plusView the full skill documentation and source below.
Documentation
Web Search Plus
Multi-provider web search with Intelligent Auto-Routing: Serper (Google), Tavily (Research), Exa (Neural), You.com (RAG/Real-time), SearXNG (Privacy/Self-hosted).
NEW in v2.5.0: π SearXNG provider added! Privacy-first meta-search, 70+ upstream engines, $0 API cost, self-hosted β perfect for privacy-conscious users and multi-source aggregation.
NEW in v2.3.0: Interactive setup wizard! Run python3 scripts/setup.py for guided configuration.
NEW in v2.2.5: Automatic error fallback β if one provider fails (rate limit, timeout, etc.), automatically tries the next provider in priority order!
π First Run (Setup Wizard)
New to web-search-plus? The interactive setup wizard guides you through configuration:
python3 scripts/setup.py
The wizard will:
config.json (gitignored)What Each Provider Is Best For
| Provider | Best For | Free Tier |
| Serper | Google results, shopping, prices, local businesses, news | 2,500/month |
| Tavily | Research questions, explanations, academic, full-page content | 1,000/month |
| Exa | Semantic search, "similar to X", startup discovery, papers | 1,000/month |
| You.com | RAG/AI context, real-time info, combined web+news, LLM-ready snippets | Limited free |
| SearXNG | Privacy-preserving search, multi-source aggregation, $0 API cost | FREE (self-hosted) |
Reconfigure Anytime
python3 scripts/setup.py --reset
π API Keys Setup (Manual)
NEW in v2.2.0: The script auto-loads API keys from .env in the skill directory!
Quick Setup
Option A: .env file (recommended)
# /path/to/skills/web-search-plus/.env
export SERPER_API_KEY="your-key" #
export TAVILY_API_KEY="your-key" #
export EXA_API_KEY="your-key" #
export YOU_API_KEY="your-key" #
export SEARXNG_INSTANCE_URL="" # Self-hosted
Option B: config.json (NEW in v2.2.1)
# Copy the example config
cp config.example.json config.jsonThen add your keys:
{
"serper": { "api_key": "your-serper-key" },
"tavily": { "api_key": "your-tavily-key" },
"exa": { "api_key": "your-exa-key" },
"you": { "api_key": "your-you-key" },
"searxng": { "instance_url": "" }
}β οΈ
config.json is gitignored β your keys stay safe!
Just run β keys load automatically:
python3 scripts/search.py -q "your query"
# No need for 'source .env' anymore! β¨
Priority: config.json > .env > environment variable
Get Free API Keys
| Provider | Free Tier | Sign Up |
| Serper | 2,500 queries/mo | |
| Tavily | 1,000 queries/mo | |
| Exa | 1,000 queries/mo | |
| You.com | Limited free tier | |
| SearXNG | Unlimited (self-hosted) |
β οΈ Don't Modify Core OpenClaw Config
Tavily, Serper, and Exa are NOT core OpenClaw providers.
β DON'T add to ~/.openclaw/openclaw.json:
"tools": { "web": { "search": { "provider": "tavily" }}} // WRONG!
β
DO use this skill's scripts β keys auto-load from .env
Core OpenClaw only supports brave as the built-in web search provider. This skill adds Serper, Tavily, and Exa as additional options via its own scripts.
π§ Intelligent Auto-Routing
No need to choose a provider β just search! The skill uses multi-signal analysis to understand your query intent:
# These queries are intelligently routed with confidence scoring:
python3 scripts/search.py -q "how much does iPhone 16 cost" # β Serper (68% MEDIUM)
python3 scripts/search.py -q "how does quantum entanglement work" # β Tavily (86% HIGH)
python3 scripts/search.py -q "startups similar to Notion" # β Exa (76% HIGH)
python3 scripts/search.py -q "MacBook Pro M3 specs review" # β Serper (70% HIGH)
python3 scripts/search.py -q "explain pros and cons of React" # β Tavily (85% HIGH)
python3 scripts/search.py -q "companies like stripe.com" # β Exa (100% HIGH)
python3 scripts/search.py -q "what's the latest on AI regulation" # β You.com (72% HIGH)
python3 scripts/search.py -q "summarize current events in tech" # β You.com (78% HIGH)
python3 scripts/search.py -q "search privately without tracking" # β SearXNG (90% HIGH)
python3 scripts/search.py -q "results from multiple search engines" # β SearXNG (80% HIGH)
How It Works
The routing engine analyzes multiple signals:
π Shopping Intent β Serper
| Signal Type | Examples | Weight |
| Price patterns | "how much", "price of", "cost of" | HIGH |
| Purchase intent | "buy", "purchase", "order", "where to buy" | HIGH |
| Deal signals | "deal", "discount", "cheap", "best price" | MEDIUM |
| Product + Brand | "iPhone 16", "Sony headphones" + specs/review | HIGH |
| Local business | "near me", "restaurants", "hotels" | HIGH |
π Research Intent β Tavily
| Signal Type | Examples | Weight |
| Explanation | "how does", "why does", "explain", "what is" | HIGH |
| Analysis | "compare", "pros and cons", "difference between" | HIGH |
| Learning | "tutorial", "guide", "understand", "learn" | MEDIUM |
| Depth | "in-depth", "comprehensive", "detailed" | MEDIUM |
| Complex queries | Long, multi-clause questions | BONUS |
π Discovery Intent β Exa
| Signal Type | Examples | Weight |
| Similarity | "similar to", "alternatives to", "competitors" | VERY HIGH |
| Company discovery | "companies like", "startups doing", "who else" | HIGH |
| URL detection | Any URL or domain (stripe.com) | VERY HIGH |
| Academic | "arxiv", "research papers", "github projects" | HIGH |
| Funding | "Series A", "YC", "funded startup" | HIGH |
π€ RAG/Real-time Intent β You.com
| Signal Type | Examples | Weight |
| RAG context | "context for", "rag", "summarize", "tldr" | VERY HIGH |
| Information synthesis | "key points", "main takeaways", "quick overview" | HIGH |
| Real-time needs | "latest news", "current status", "right now" | HIGH |
| Combined sources | "what's happening", "updates on", "situation" | HIGH |
| Freshness | "as of today", "up to date", "real time" | HIGH |
π Privacy/Multi-Source Intent β SearXNG
| Signal Type | Examples | Weight |
| Privacy signals | "private", "anonymous", "without tracking", "no tracking" | VERY HIGH |
| Privacy-focused | "privacy-first", "duckduckgo alternative", "private search" | VERY HIGH |
| Multi-source | "aggregate results", "multiple sources", "diverse perspectives" | HIGH |
| Meta-search | "meta search", "all engines", "from multiple engines" | VERY HIGH |
| Budget/free | "free search", "no api cost", "self-hosted search", "zero cost" | HIGH |
| German | "privat", "anonym", "ohne tracking", "verschiedene quellen" | HIGH |
Confidence Scoring
Every routing decision includes a confidence level:
| Confidence | Level | Meaning |
| 70-100% | HIGH | Strong signal match, very reliable |
| 40-69% | MEDIUM | Good match, should work well |
| 0-39% | LOW | Ambiguous query, using fallback |
Debug Routing Decisions
See the full analysis:
python3 scripts/search.py --explain-routing -q "how much does iPhone 16 Pro cost"
Output:
{
"query": "how much does iPhone 16 Pro cost",
"routing_decision": {
"provider": "serper",
"confidence": 0.68,
"confidence_level": "medium",
"reason": "moderate_confidence_match"
},
"scores": {"serper": 7.0, "tavily": 0.0, "exa": 0.0},
"top_signals": [
{"matched": "how much", "weight": 4.0},
{"matched": "brand + product detected", "weight": 3.0}
],
"query_analysis": {
"word_count": 7,
"is_complex": false,
"has_url": null,
"recency_focused": false
}
}
π When to Use This Skill vs Built-in Brave Search
Use Built-in Brave Search when:
- β General web searches (news, info, questions)
- β Privacy is important
- β Quick lookups without specific requirements
Use web-search-plus when:
β Serper (Google results):
- ποΈ Product specs, prices, shopping - "Compare iPhone 16 vs Samsung S24"
- π Local businesses, places - "Best pizza in Berlin"
- π― "Google it" - Explicitly wants Google results
- π° Shopping/images/news -
--type shopping/images/news - π Knowledge Graph - Structured info (prices, ratings, etc.)
β Tavily (AI-optimized research):
- π Research questions - "How does quantum computing work?"
- π¬ Deep dives - Complex multi-part questions
- π Full page content - Not just snippets (
--raw-content) - π Academic research - Synthesized answers
- π Domain filtering -
--include-domainsfor trusted sources
β Exa (Neural semantic search):
- π Similar pages - "Sites like OpenAI.com" (
--similar-url) - π’ Company discovery - "AI companies like Anthropic"
- π Research papers -
--category "research paper" - π» GitHub projects -
--category github - π
Date-specific -
--start-date/--end-date
β You.com (RAG/Real-time):
- π€ RAG applications - Pre-extracted LLM-ready snippets
- π° Combined web + news - Single API call for both
- β‘ Real-time info - Current events, status updates
- π Summarization context - "What's the latest on..."
- π Live crawling - Full page content on demand (
--livecrawl)
β SearXNG (Privacy-First/Self-Hosted):
- π Privacy-preserving search - No tracking, no profiling
- π Multi-source aggregation - 70+ upstream engines
- π° $0 API cost - Self-hosted, unlimited queries
- π― Diverse perspectives - Results from multiple engines
- π Self-hosted environments - Full control over your search
Provider Comparison
| Feature | Serper | Tavily | Exa | You.com | SearXNG |
| --------- | :------: | :------: | :---: | :-------: | :-------: |
| Speed | β‘β‘β‘ | β‘β‘ | β‘β‘ | β‘β‘β‘ | β‘β‘ |
| Factual Accuracy | βββ | βββ | ββ | βββ | βββ |
| Semantic Understanding | β | ββ | βββ | ββ | β |
| Research Quality | ββ | βββ | ββ | ββ | ββ |
| Full Page Content | β | β | β | β | β |
| Shopping/Local | β | β | β | β | β |
| Similar Pages | β | β | β | β | β |
| Knowledge Graph | β | β | β | β | β |
| News Integration | β | β | β | β | β |
| RAG-Optimized | β | β | β | ββ | β |
| Privacy-First | β | β | β | β | ββ |
| Self-Hosted | β | β | β | β | β |
| API Cost | $$ | $$ | $$ | $ | FREE |
Usage Examples
Auto-Routed (Recommended)
python3 scripts/search.py -q "iPhone 16 Pro Max price" # β Serper
python3 scripts/search.py -q "how does HTTPS encryption work" # β Tavily
python3 scripts/search.py -q "startups similar to Notion" # β Exa
python3 scripts/search.py -q "latest updates on AI regulation" # β You.com
python3 scripts/search.py -q "search privately without tracking" # β SearXNG
Explicit Provider
python3 scripts/search.py -p serper -q "weather Berlin" --type weather
python3 scripts/search.py -p tavily -q "quantum computing" --depth advanced
python3 scripts/search.py -p exa --similar-url "" --category company
python3 scripts/search.py -p you -q "current tech news" --include-news
python3 scripts/search.py -p you -q "climate change" --livecrawl all --freshness week
python3 scripts/search.py -p searxng -q "linux distros" --engines "google,bing,duckduckgo"
Configuration
config.json
{
"auto_routing": {
"enabled": true,
"fallback_provider": "serper",
"confidence_threshold": 0.3,
"disabled_providers": []
},
"serper": {"country": "us", "language": "en"},
"tavily": {"depth": "advanced"},
"exa": {"type": "neural"},
"you": {"country": "US", "safesearch": "moderate", "include_news": true},
"searxng": {"instance_url": "", "safesearch": 0}
}
Output Format
{
"provider": "serper",
"query": "iPhone 16 price",
"results": [{"title": "...", "url": "...", "snippet": "...", "score": 0.95}],
"answer": "Synthesized answer...",
"routing": {
"auto_routed": true,
"provider": "serper",
"confidence": 0.78,
"confidence_level": "high",
"reason": "high_confidence_match",
"top_signals": [{"matched": "price", "weight": 3.0}]
}
}
FAQ
General
Q: How does auto-routing decide which provider to use?
Multi-signal analysis scores each provider based on: price patterns, explanation phrases, similarity keywords, URLs, product+brand combos, and query complexity. Highest score wins. Use --explain-routing to see the decision breakdown.
Q: What if it picks the wrong provider?
Override with-p serper/tavily/exa. Check--explain-routingto understand why it chose differently.
Q: What does "low confidence" mean?
Query is ambiguous (e.g., "Tesla" could be cars, stock, or company). Falls back to Serper. Results may vary.
Q: Can I disable a provider?
Yes! In config.json: "disabled_providers": ["exa"]
API Keys
Q: Which API keys do I need?
At minimum ONE key (or SearXNG instance). You can use just Serper, just Tavily, just Exa, just You.com, or just SearXNG. Missing keys = that provider is skipped.
Q: Where do I get API keys?
- Serper: (2,500 free queries, no credit card)
- Tavily: (1,000 free searches/month)
- Exa: (1,000 free searches/month)
- You.com: (Limited free tier for testing)
- SearXNG: Self-hosted, no key needed!
Q: How do I set API keys?
Two options (both auto-load):
Option A: .env file
> export SERPER_API_KEY="your-key" >
Option B: config.json (v2.2.1+)
> { "serper": { "api_key": "your-key" } } >
Routing Details
Q: How do I know which provider handled my search?
Checkrouting.providerin JSON output, or[π Searched with: Provider]in chat responses.
Q: Why does it sometimes choose Serper for research questions?
If the query has brand/product signals (e.g., "how does Tesla FSD work"), shopping intent may outweigh research intent. Override with -p tavily.
Q: What's the confidence threshold?
Default: 0.3 (30%). Below this = low confidence, uses fallback. Adjustable in config.json.
You.com Specific
Q: When should I use You.com over other providers?
You.com excels at:
- RAG applications: Pre-extracted snippets ready for LLM consumption
- Real-time information: Current events, breaking news, status updates
- Combined sources: Web + news results in a single API call
- Summarization tasks: "What's the latest on...", "Key points about..."
Q: What's the livecrawl feature?
You.com can fetch full page content on-demand. Use--livecrawl webfor web results,--livecrawl newsfor news articles, or--livecrawl allfor both. Content is returned in Markdown format.
Q: Does You.com include news automatically?
Yes! You.com's intelligent classification automatically includes relevant news results when your query has news intent. You can also use --include-news to explicitly enable it.
SearXNG Specific
Q: Do I need my own SearXNG instance?
Yes! SearXNG is self-hosted. Most public instances disable the JSON API to prevent bot abuse. You need to run your own instance with JSON format enabled. See:
Q: How do I set up SearXNG?
Docker is the easiest way:
> docker run -d -p 8080:8080 searxng/searxng >
Then enable JSON in settings.yml:> search: > formats: > - html > - json >
Q: Why am I getting "403 Forbidden"?
The JSON API is disabled on your instance. Enable it insettings.ymlundersearch.formats.
Q: What's the API cost for SearXNG?
$0! SearXNG is free and open-source. You only pay for hosting (~$5/month VPS). Unlimited queries.
Q: When should I use SearXNG?
- Privacy-sensitive queries: No tracking, no profiling
- Budget-conscious: $0 API cost
- Diverse results: Aggregates 70+ search engines
- Self-hosted requirements: Full control over your search infrastructure
- Fallback provider: When paid APIs are rate-limited
Q: Can I limit which search engines SearXNG uses?
Yes! Use--engines google,bing,duckduckgoto specify engines, or configure defaults inconfig.json.
Troubleshooting
Q: "No API key found" error?
1. Check.envexists in skill folder withexport VAR=valueformat
2. Keys auto-load from skill's .env since v2.2.03. Or set in system environment: export SERPER_API_KEY="..."
Q: Getting empty results?
1. Check API key is valid
2. Try a different provider with -p3. Some queries have no results (very niche topics)
Q: Rate limited?
NEW in v2.2.5: Automatic fallback! If one provider hits rate limits, the script automatically tries the next provider in priority order (serper β tavily β exa). You'll see fallback info in stderr and the response will include routing.fallback_used: true.Provider limits: Serper 2,500 free total, Tavily 1,000/month free, Exa 1,000/month free.
For OpenClaw Users
Q: How do I use this in chat?
Just ask! OpenClaw auto-detects search intent. Or explicitly: "search with web-search-plus for..."
Q: Does it replace built-in Brave Search?
No, it's complementary. Use Brave for quick lookups, web-search-plus for research/shopping/discovery.
Q: Can I see which provider was used?
Yes! SOUL.md can include attribution: [π Searched with: Serper/Tavily/Exa]
β Frequently Asked Questions
Which provider should I use?
It depends on your query intent:
| Query Type | Best Provider | Why |
| Shopping ("buy laptop", "cheap shoes") | Serper | Google Shopping, price comparisons, local stores |
| Research ("how does X work?", "explain Y") | Tavily | Deep research, academic quality, full-page content |
| Startups/Papers ("companies like X", "arxiv papers") | Exa | Semantic/neural search, startup discovery |
| RAG/Real-time ("summarize latest", "current events") | You.com | LLM-ready snippets, combined web+news |
| Privacy ("search without tracking") | SearXNG | No tracking, multi-source, self-hosted |
Do I need all 5 providers?
No! All providers are optional. You can use:
- 1 provider (e.g., just Serper for everything)
- 2-3 providers (e.g., Serper + You.com for most needs)
- All 5 (maximum flexibility + fallback options)
Set
disabled_providers in config.json to control which ones are active.
How much do the APIs cost?
| Provider | Free Tier | Paid Plan |
| Serper | 2,500 queries/mo | $50/mo (5,000 queries) |
| Tavily | 1,000 queries/mo | $150/mo (10,000 queries) |
| Exa | 1,000 queries/mo | $1,000/mo (100,000 queries) |
| You.com | Limited free | ~$10/mo (varies by usage) |
| SearXNG | FREE β | Only VPS cost (~$5/mo if self-hosting) |
How do I set up SearXNG?
SearXNG requires a self-hosted instance. Quick setup:
Option 1: Docker (5 minutes)
docker run -d -p 8080:8080 searxng/searxng
Option 2: Full installation
See:
Enable JSON format:
Edit settings.yml:
search:
formats: [html, json] # Add 'json'!
Then in web-search-plus:
export SEARXNG_INSTANCE_URL=""
python3 scripts/setup.py # Or edit config.json directly
Can I use public SearXNG instances?
Not recommended. Most public instances (searx.space) disable JSON API to prevent abuse. You'll get 403 Forbidden errors.
Solution: Self-host on a VPS (~$5/mo) or use Docker locally.
Is SearXNG slower than paid APIs?
Yes, typically. SearXNG queries multiple engines in parallel:
- SearXNG: 1-3 seconds (aggregates 70+ engines)
- Paid APIs: 0.5-1 second (direct access)
Trade-off: Slower but privacy-preserving + multi-source + $0 cost.
What if my SearXNG instance is down?
Auto-routing automatically falls back to the next provider in provider_priority. If Searx NG is first:
"provider_priority": ["searxng", "serper", "tavily", "exa", "you"]β Failed SearXNG request β tries Serper β tries Tavily β etc.
Tip: Put SearXNG last in priority for best reliability (use paid APIs first, SearXNG as fallback).
Can I limit which engines SearXNG uses?
Yes! Set engines in config.json:
{
"searxng": {
"engines": ["google", "duckduckgo", "qwant", "wikipedia"]
}
}
Or via CLI:
python3 scripts/search.py -p searxng -q "linux" --engines "google,bing"
How private is SearXNG really?
Depends on your instance!
| Setup | Privacy Level |
| Self-hosted (your VPS) | βββββ You control everything |
| Self-hosted (Docker local) | βββββ Fully private |
| Public instance | βββ Depends on operator's logging policy |
Which provider has the best results?
There's no "best" β it depends!
| Metric | Winner |
| Most accurate for facts | Serper (Google) |
| Best for research depth | Tavily |
| Best for semantic queries | Exa |
| Best for RAG/AI context | You.com |
| Most diverse sources | SearXNG (70+ engines) |
| Most private | SearXNG (self-hosted) |
How does auto-routing work?
The skill analyzes your query for keywords and patterns:
"buy cheap laptop" β Serper (shopping signals)
"how does AI work?" β Tavily (research/explanation)
"companies like X" β Exa (semantic/similar)
"summarize latest news" β You.com (RAG/real-time)
"search privately" β SearXNG (privacy signals)
Confidence threshold: Only routes if confidence > 30%. Otherwise uses default provider.
Override: Use -p provider to force a specific provider.
Can I use this in production?
Yes! Web-search-plus is production-ready:
- β Error handling with automatic fallback
- β Rate limit protection
- β Timeout handling (30s per provider)
- β API key security (.env + config.json gitignored)
- β 5 providers for redundancy
Tip: Monitor API usage to avoid exceeding free tiers!
What if I run out of API credits?
disabled_providers to skip exhausted APIs temporarilyHow do I update to v2.5.0?
Via ClawHub (recommended):
clawhub update web-search-plus --registry "" --no-input
Manually:
cd /root/clawd/skills/web-search-plus
git pull origin main
python3 scripts/setup.py # Re-run to add SearXNG
Where can I report bugs or request features?
- GitHub Issues:
- ClawHub:
Still have questions? Check the full documentation in README.md or run:
python3 scripts/search.py --help
python3 scripts/setup.py # Interactive wizard