
doubleword-api

Create and manage batch inference jobs using the Doubleword API (api.doubleword.ai).


Installation

npx clawhub@latest install doubleword-api


Documentation

Doubleword Batch Inference

Process multiple AI inference requests asynchronously using the Doubleword batch API.

When to Use Batches

Batches are ideal for:

  • Multiple independent requests that can run simultaneously

  • Workloads that don't require immediate responses

  • Large volumes that would exceed rate limits if sent individually

  • Cost-sensitive workloads (24h window offers better pricing)


Quick Start

Basic workflow for any batch job:

  • Create JSONL file with requests (one JSON object per line)

  • Upload file to get file ID

  • Create batch using file ID

  • Poll status until complete

  • Download results from output_file_id

Workflow

    Step 1: Create Batch Request File

    Create a .jsonl file where each line contains a single request:

    {"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is 2+2?"}]}}
    {"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is the capital of France?"}]}}

    Required fields per line:

    • custom_id: Unique identifier (max 64 chars) - use descriptive IDs like "user-123-question-5" for easier result mapping

    • method: Always "POST"

    • url: Always "/v1/chat/completions"

    • body: Standard API request with model and messages
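
The required fields above can be checked locally before upload. The following is a minimal sketch rather than part of the Doubleword tooling: validate_batch_lines is a hypothetical helper name, and it only enforces the rules stated in this document (the four required fields, "POST", the chat completions URL, a custom_id of at most 64 characters, and custom_id uniqueness within the file).

```python
import json

REQUIRED_FIELDS = ("custom_id", "method", "url", "body")


def validate_batch_lines(lines):
    """Return a list of (line_number, problem) tuples; an empty list means no problems."""
    problems = []
    seen_ids = set()
    for number, raw in enumerate(lines, start=1):
        try:
            request = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(request, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in request]
        if missing:
            problems.append((number, f"missing fields: {', '.join(missing)}"))
            continue
        custom_id = request["custom_id"]
        if not isinstance(custom_id, str) or not 0 < len(custom_id) <= 64:
            problems.append((number, "custom_id must be a string of 1-64 characters"))
        elif custom_id in seen_ids:
            problems.append((number, f"duplicate custom_id: {custom_id}"))
        else:
            seen_ids.add(custom_id)
        if request["method"] != "POST":
            problems.append((number, 'method must be "POST"'))
        if request["url"] != "/v1/chat/completions":
            problems.append((number, 'url must be "/v1/chat/completions"'))
        body = request["body"]
        if not isinstance(body, dict) or "model" not in body or "messages" not in body:
            problems.append((number, "body must contain model and messages"))
    return problems
```

Passing an open file object works because iterating a file yields one line at a time, for example validate_batch_lines(open("batch_requests.jsonl", encoding="utf-8")).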


    Optional body parameters:
    • temperature: 0-2 (default: 1.0)

    • max_tokens: Maximum response tokens

    • top_p: Nucleus sampling parameter

    • stop: Stop sequences


    File limits:
    • Max size: 200MB

    • Format: JSONL only (JSON Lines - newline-delimited JSON)

    • Split large batches into multiple files if needed
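
One way to split a large request set so that each file stays under the size limit is sketched below. split_lines_by_size is a hypothetical helper, and the byte value for "200MB" is an assumption: the document does not say whether the limit is decimal or binary megabytes, so the sketch uses the stricter decimal reading.

```python
# Assumption: 200MB read as decimal megabytes, the stricter interpretation.
MAX_FILE_BYTES = 200 * 1000 * 1000


def split_lines_by_size(lines, max_bytes=MAX_FILE_BYTES):
    """Group JSONL lines into chunks whose UTF-8 size (including newlines) fits max_bytes."""
    chunks = []
    current = []
    current_bytes = 0
    for line in lines:
        line = line.rstrip("\n")
        size = len(line.encode("utf-8")) + 1  # +1 for the newline written after each line
        if size > max_bytes:
            raise ValueError("a single request line exceeds the size limit")
        # Start a new chunk when adding this line would overflow the current one.
        if current and current_bytes + size > max_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append(line)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be written to its own .jsonl file and submitted as a separate batch.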


    Helper script:
    Use scripts/create_batch_file.py to generate JSONL files programmatically:

    python scripts/create_batch_file.py output.jsonl

    Modify the script's requests list to generate your specific batch requests.
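
The contents of scripts/create_batch_file.py are not reproduced here, so the following is only an illustrative sketch of how such a generator might look. The names build_request, to_jsonl, and write_batch_file are hypothetical, and the model name is the one used in the example lines above.

```python
import json


def build_request(custom_id, prompt, model, **body_params):
    """Build one batch request object; body_params carries optional fields such as temperature."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    body.update(body_params)
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": body,
    }


def to_jsonl(requests):
    """Serialize requests as JSON Lines: one compact JSON object per line."""
    return "".join(json.dumps(request, ensure_ascii=False) + "\n" for request in requests)


def write_batch_file(path, requests):
    """Write the requests to a UTF-8 encoded .jsonl file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(requests))
```

For example, write_batch_file("batch_requests.jsonl", [build_request("user-123-question-5", "What is 2+2?", "anthropic/claude-3-5-sonnet", max_tokens=100)]) would produce a one-line file in the format shown in Step 1.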

    Step 2: Upload File

    Upload the JSONL file:

    curl  \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
      -F purpose="batch" \
      -F file="@batch_requests.jsonl"

    The response contains an id field. Save this file ID for the next step.

    Step 3: Create Batch

    Create the batch job using the file ID:

    curl  \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "input_file_id": "file-abc123",
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h"
      }'

    Parameters:

    • input_file_id: File ID from upload step

    • endpoint: Always "/v1/chat/completions"

    • completion_window: Choose "24h" (better pricing) or "1h" (50% premium, faster results)


    The response contains the batch id. Save it for status polling.

    Step 4: Poll Status

    Check batch progress:

    curl  \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY"

    Status progression:

    • validating - Checking input file format

    • in_progress - Processing requests

    • completed - All requests finished


    Other statuses:
    • failed - Batch failed (check error_file_id)

    • expired - Batch timed out

    • cancelling/cancelled - Batch cancelled


    Response includes:
    • output_file_id - Download results here

    • error_file_id - Failed requests (if any)

    • request_counts - Total/completed/failed counts


    Polling frequency: Check every 30-60 seconds during processing.
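
The polling loop can be sketched as follows. This is an illustration under stated assumptions, not the official client: fetch_batch is a hypothetical callable that performs the Step 4 status request and returns the parsed JSON, and the sketch assumes that response exposes the state in a "status" field. The request itself is left to the caller, so no endpoint is assumed here.

```python
import time

# Statuses after which the batch will not change any further.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch_batch, interval_seconds=30, max_polls=None, sleep=time.sleep):
    """Call fetch_batch() until the batch reaches a terminal status, then return it.

    fetch_batch: hypothetical callable returning the batch as a dict.
    max_polls:   optional cap on the number of status checks.
    sleep:       injectable for testing; defaults to time.sleep.
    """
    polls = 0
    while True:
        batch = fetch_batch()
        polls += 1
        # Assumption: the batch state is reported under the "status" key.
        if batch.get("status") in TERMINAL_STATUSES:
            return batch
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"batch not finished after {polls} status checks")
        sleep(interval_seconds)
```

The caller still needs to inspect the returned status: a "failed" or "expired" batch is returned just like a "completed" one, so that error_file_id can be checked.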

    Early access: Partial results can be downloaded via output_file_id before the batch fully completes. Check the X-Incomplete response header to see whether more results are still coming.

    Step 5: Download Results

    Download completed results:

    curl  \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
      > results.jsonl

    Response headers:

    • X-Incomplete: true - Batch still processing, more results coming

    • X-Last-Line: 45 - Resume point for partial downloads


    Output format (each line):
    {
      "id": "batch-req-abc",
      "custom_id": "req-1",
      "response": {
        "status_code": 200,
        "body": {
          "id": "chatcmpl-xyz",
          "choices": [{
            "message": {
              "role": "assistant",
              "content": "The answer is 4."
            }
          }]
        }
      }
    }

    Download errors (if any):

    curl  \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
      > errors.jsonl

    Error format (each line):

    {
      "id": "batch-req-def",
      "custom_id": "req-2",
      "error": {
        "code": "invalid_request",
        "message": "Missing required parameter"
      }
    }

    Additional Operations

    List All Batches

    curl  \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY"

    Cancel Batch

    curl  \
      -X POST \
      -H "Authorization: Bearer $DOUBLEWORD_API_KEY"

    Notes:

    • Unprocessed requests are cancelled

    • Already-processed results remain downloadable

    • Cannot cancel completed batches


    Common Patterns

    Processing Results

    Parse JSONL output line-by-line:

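    import json

    with open('results.jsonl') as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            result = json.loads(line)
            custom_id = result['custom_id']
            response = result['response']
            # Only successful responses carry a chat completion body
            if response['status_code'] != 200:
                print(f"{custom_id}: HTTP {response['status_code']}")
                continue
            content = response['body']['choices'][0]['message']['content']
            print(f"{custom_id}: {content}")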

    Handling Partial Results

    Check for incomplete batches and resume:

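    import os
    import requests

    # Read the API key from the same environment variable the curl examples use
    api_key = os.environ['DOUBLEWORD_API_KEY']

    response = requests.get(
        '',  # results download URL for your output_file_id, as in Step 5
        headers={'Authorization': f'Bearer {api_key}'}
    )

    if response.headers.get('X-Incomplete') == 'true':
        last_line = int(response.headers.get('X-Last-Line', 0))
        print(f"Batch incomplete. Processed {last_line} requests so far.")
        # Continue polling and download again later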

    Retry Failed Requests

    Extract failed requests from error file and resubmit:

    import json
    
    failed_ids = []
    with open('errors.jsonl') as f:
        for line in f:
            error = json.loads(line)
            failed_ids.append(error['custom_id'])
    
    print(f"Failed requests: {failed_ids}")
    # Create new batch with only failed requests
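
Building the retry file from those IDs can be sketched as below. select_failed_requests is a hypothetical helper, and the sketch assumes the original input file is still available locally, since the error file only carries the custom_id and not the request body.

```python
import json


def select_failed_requests(input_lines, error_lines):
    """Return the original request lines whose custom_id appears in the error file."""
    failed_ids = {
        json.loads(line)["custom_id"] for line in error_lines if line.strip()
    }
    retry_lines = []
    for line in input_lines:
        if not line.strip():
            continue  # ignore blank lines in the input file
        if json.loads(line)["custom_id"] in failed_ids:
            retry_lines.append(line.rstrip("\n"))
    return retry_lines
```

Writing the returned lines to a new .jsonl file, one per line, gives an input file for a follow-up batch. Resubmitting unchanged requests only helps with transient failures; a request rejected for being malformed, such as the invalid_request example above, would likely need fixing first.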

    Best Practices

  • Descriptive custom_ids: Include context in IDs for easier result mapping
    - Good: "user-123-question-5"
    - Bad: "1", "req1"

  • Validate JSONL locally: Ensure each line is valid JSON before upload

  • Split large files: Keep under 200MB limit

  • Choose appropriate window: Use 24h for cost savings, 1h only when time-sensitive

  • Handle errors gracefully: Always check error_file_id and retry failed requests

  • Monitor request_counts: Track progress via completed/total ratio

  • Save file IDs: Store batch_id, input_file_id, output_file_id for later retrieval

Reference Documentation

    For complete API details including authentication, rate limits, and advanced parameters, see:

    • API Reference: references/api_reference.md - Full endpoint documentation and schemas