Doubleword
    FOR BATCH WORKLOADS

    This is what batch inference should look like.

    Traditional batch APIs save money but force you to manage .jsonl files, rewrite your code, and wait 24 hours. Doubleword gives you more savings with none of the complexity.

    Why teams choose Doubleword over traditional batch APIs

    Cheaper Inference

    Purpose-built for throughput, not latency. Up to 50% cheaper than the cheapest batch API and up to 90% cheaper than Anthropic Batch — same intelligence class, verified by independent benchmarks.

    Completion Times That Fit Your Use Case

    Choose async or overnight (24-hour) completion to match your workload. Run complex, chained agent trees and synthetic data pipelines in a single afternoon, without paying real-time premiums.

    The Autobatcher: Zero-Refactor Savings

    Don't rip out your standard async calls. Our autobatcher acts as a drop-in replacement that transparently collects and batches requests over a configurable window. Capture batch-tier pricing instantly — no polling logic or manual parsing.
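To make the idea of collecting requests over a configurable window concrete, here is a minimal sketch in Python. It is not Doubleword's implementation or API: the `WindowBatcher` class, its `submit` method, and the `send_batch` callable are hypothetical names invented for illustration, and the sketch assumes the backend returns results in the same order as the requests it was given.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional, Tuple


class WindowBatcher:
    """Hypothetical helper: gathers individual async calls and sends them together."""

    def __init__(
        self,
        send_batch: Callable[[List[Any]], Awaitable[List[Any]]],
        window_s: float = 0.05,
    ) -> None:
        self._send_batch = send_batch  # assumed: one call that handles many requests
        self._window_s = window_s      # how long to keep collecting before flushing
        self._pending: List[Tuple[Any, asyncio.Future]] = []
        self._flush_task: Optional[asyncio.Task] = None

    async def submit(self, request: Any) -> Any:
        """Looks like an ordinary awaitable call; resolves once its batch returns."""
        future = asyncio.get_running_loop().create_future()
        self._pending.append((request, future))
        if self._flush_task is None:
            # First request in this window starts the timer.
            self._flush_task = asyncio.create_task(self._flush_after_window())
        return await future

    async def _flush_after_window(self) -> None:
        await asyncio.sleep(self._window_s)
        # Take ownership of everything collected so far and open a new window.
        batch, self._pending = self._pending, []
        self._flush_task = None
        try:
            results = await self._send_batch([request for request, _ in batch])
            # Assumes results arrive in request order.
            for (_, future), result in zip(batch, results):
                future.set_result(result)
        except Exception as exc:
            # A failed batch fails every caller that was waiting on it.
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)
```

Calling code keeps its usual shape, for example `await batcher.submit(request)` inside an `asyncio.gather`, while the requests made within one window travel as a single batch. A longer window trades a little latency for larger batches.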

    DevEx That Shows Batch Isn't an Afterthought

    Results stream back as they complete, so you never wait for the whole batch to finish. Mix chat completions, embeddings, and JSON-mode requests in a single window. Webhooks plug directly into your existing pipelines and event-driven architectures.
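The difference between waiting for a whole batch and consuming results as they complete can be sketched with Python's standard `asyncio.as_completed`. This is an illustration of the pattern only, not Doubleword's client: `stream_results`, `fake_inference`, and the request IDs are hypothetical, and the delays stand in for real model calls.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, List, Tuple


async def stream_results(
    requests: Dict[str, Any],
    handle: Callable[[Any], Awaitable[Any]],
) -> AsyncIterator[Tuple[str, Any]]:
    """Yield (request_id, result) pairs in completion order, not submission order."""

    async def tagged(request_id: str, request: Any) -> Tuple[str, Any]:
        # Keep the ID next to the result so the caller can match them up.
        return request_id, await handle(request)

    tasks = [asyncio.create_task(tagged(rid, req)) for rid, req in requests.items()]
    for next_done in asyncio.as_completed(tasks):
        yield await next_done


async def fake_inference(delay_s: float) -> str:
    # Stand-in for a model call: sleeps, then reports how long it took.
    await asyncio.sleep(delay_s)
    return f"done after {delay_s}s"


async def collect_order() -> List[str]:
    # Submitted slow-first, but the fast request is handed back first.
    requests = {"slow": 0.1, "fast": 0.01, "medium": 0.05}
    return [rid async for rid, _ in stream_results(requests, fake_inference)]
```

Downstream steps can start on the first finished request instead of idling until the slowest one returns, which is what makes chained pipelines practical on batch pricing.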

    The cost difference is massive

    Output price per 1M tokens · Qwen3.5-397B class models · Batch tiers

    Anthropic Batch · Claude Opus 4.5 · 24H · $12.50
    OpenAI Batch · GPT-5.2 · 24H · $7.00
    Doubleword · Qwen3.5-397B · Async · $1.80 (up to 86% less)
    Doubleword · Qwen3.5-397B · Overnight (24h) · $1.20 (up to 90% less)

    Prices from Artificial Analysis · Comparable intelligence-class models · Batch tier pricing
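The "up to" percentages follow from the listed output prices, taking the most expensive tier shown (Anthropic Batch at $12.50) as the baseline. A short Python check, using only the figures quoted above and a hypothetical `percent_less` helper:

```python
def percent_less(price: float, baseline: float) -> int:
    """Percentage saved relative to a baseline price, rounded to a whole number."""
    return round((1 - price / baseline) * 100)


# Output prices per 1M tokens, as listed above.
ANTHROPIC_BATCH = 12.50
OPENAI_BATCH = 7.00
DOUBLEWORD_ASYNC = 1.80
DOUBLEWORD_OVERNIGHT = 1.20

print(percent_less(DOUBLEWORD_ASYNC, ANTHROPIC_BATCH))      # 86
print(percent_less(DOUBLEWORD_OVERNIGHT, ANTHROPIC_BATCH))  # 90
print(percent_less(DOUBLEWORD_ASYNC, OPENAI_BATCH))         # 74
print(percent_less(DOUBLEWORD_OVERNIGHT, OPENAI_BATCH))     # 83
```

Against the OpenAI Batch price the same calculation gives 74% and 83%, which is why the headline figures are stated as "up to".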

    Traditional batch APIs vs Doubleword

    Getting started
        Traditional batch APIs: Write .jsonl boilerplate, manage uploads
        Doubleword: Drop-in autobatcher, zero refactor
    Cost savings
        Traditional batch APIs: 50% off. Better, but not great.
        Doubleword: Up to 97% off. Actually cheap.
    Speed
        Traditional batch APIs: 24 hours. Hope it doesn't expire.
        Doubleword: Choose your speed: Async or Overnight
    When something fails
        Traditional batch APIs: Silent failures. Expired batches.
        Doubleword: Guaranteed delivery or your money back.
    Getting results
        Traditional batch APIs: Wait for everything to finish, then download
        Doubleword: Streamed back as they complete
    Plugging into your stack
        Traditional batch APIs: Rewrite your app around their batch format
        Doubleword: Webhooks, streaming, OpenAI-compatible

    Built for your highest-volume workloads

    Great for

    Data processing pipelines · Document analysis · Synthetic data generation · Model evaluations · Image processing · Embeddings

    Ready to simplify your batch inference?

    Start sending requests in under 5 minutes. No .jsonl files, no polling, no rewrites.

    Get started — Free