Search & ResearchDocumentedScanned

jina-reader

Web content extraction via Jina AI Reader API.

Share:

Installation

npx clawhub@latest install jina-reader

View the full skill documentation and source below.

Documentation

Jina Reader

Extract clean web content via Jina AI — without exposing your server IP.

Read a URL

{baseDir}/scripts/reader.sh ""

Search the web (top 5 results with full content)

{baseDir}/scripts/reader.sh --mode search "latest AI news 2025"

Fact-check a statement

{baseDir}/scripts/reader.sh --mode ground "OpenAI was founded in 2015"

Options

FlagDescriptionDefault
--moderead, search, groundread
--selectorCSS selector to extract specific region
--waitCSS selector to wait for before extraction
--removeCSS selectors to remove (comma-separated)
--proxyCountry code for geo-proxy (br, us, etc.)
--nocacheForce fresh content (skip cache)off
--formatmarkdown, html, text, screenshotmarkdown
--jsonRaw JSON outputoff

Examples

# Extract article content
{baseDir}/scripts/reader.sh ""

# Extract specific section via CSS selector
{baseDir}/scripts/reader.sh --selector "article.main" ""

# Remove nav and ads before extraction
{baseDir}/scripts/reader.sh --remove "nav,footer,.ads" ""

# Search with JSON output
{baseDir}/scripts/reader.sh --mode search --json "AI enterprise trends"

# Read via Brazil proxy
{baseDir}/scripts/reader.sh --proxy br ""

# Fact-check a claim
{baseDir}/scripts/reader.sh --mode ground "Tesla is the most valuable car company"

API Key

export JINA_API_KEY="jina_..."

Free tier: 10M tokens (no signup needed). Get key at

Pricing

  • Read: ~$0.005/page (standard) | 3x for ReaderLM-v2
  • Search: 10K tokens fixed + variable per result
  • Ground: ~300K tokens/request (~30s latency)

Why Jina Reader?

  • IP protection — requests route through Jina's infra, not your server
  • Clean markdown — readability extraction + optional ReaderLM-v2
  • Dynamic content — headless Chrome renders JavaScript
  • Structured extraction — JSON schema support for data extraction