This is what batch inference should look like.
Traditional batch APIs save money but force you to manage .jsonl files, rewrite your code, and wait up to 24 hours. Doubleword gives you more savings with none of the complexity.
Why teams choose Doubleword over traditional batch APIs
Cheaper Inference
Purpose-built for throughput, not latency. Up to 50% cheaper than the cheapest traditional batch API and up to 90% cheaper than Anthropic Batch, at the same intelligence class and verified by independent benchmarks.
Completion Times That Fit Your Use Case
Choose async or 24-hour completion, whichever fits your use case. Run complex, chained agent trees and synthetic-data pipelines in a single afternoon, without paying real-time premiums.
The Autobatcher: Zero-Refactor Savings
Don't rip out your standard async calls. Our autobatcher acts as a drop-in replacement that transparently collects and batches requests over a configurable window. Capture batch-tier pricing instantly — no polling logic or manual parsing.
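The drop-in idea can be sketched in a few lines. The following is an illustrative asyncio sketch of window-based autobatching, not Doubleword's actual client: the names `Autobatcher`, `submit`, and `fake_backend` are invented for this example.

```python
import asyncio

# Illustrative sketch of transparent window-based batching: individual
# awaitable calls are collected over a short window, then sent as one
# batch. Names here are invented for the example, not a real API.

class Autobatcher:
    def __init__(self, send_batch, window=0.05):
        self._send_batch = send_batch  # coroutine: list of requests -> list of results
        self._window = window          # configurable collection window, in seconds
        self._pending = []             # (request, future) pairs awaiting a flush
        self._flush_task = None

    async def submit(self, request):
        # Looks like an ordinary single-request call to the caller.
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((request, fut))
        if self._flush_task is None:   # first request in a window starts the timer
            self._flush_task = asyncio.create_task(self._flush_later())
        return await fut

    async def _flush_later(self):
        await asyncio.sleep(self._window)           # let concurrent requests pile up
        batch, self._pending = self._pending, []
        self._flush_task = None
        results = await self._send_batch([req for req, _ in batch])
        for (_, fut), res in zip(batch, results):   # fan results back out to callers
            fut.set_result(res)

async def fake_backend(requests):
    # Stand-in for one batched API call; returns one result per request.
    return [f"echo:{r}" for r in requests]

async def main():
    ab = Autobatcher(fake_backend, window=0.01)
    # Three independent-looking calls end up in a single backend batch.
    return await asyncio.gather(*(ab.submit(p) for p in ["a", "b", "c"]))

print(asyncio.run(main()))  # ['echo:a', 'echo:b', 'echo:c']
```

The caller's code shape stays exactly what it was for per-request async calls; only the object behind the call changes.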
DevEx That Shows Batch Isn't an Afterthought
Streaming results as they complete, with no waiting for the whole batch to finish.
Mixed request routing across chat completions, embeddings, and JSON mode in a single window.
Webhooks that plug directly into your existing pipelines and event-driven architectures.
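To make the streaming point concrete, here is a minimal asyncio sketch of consuming results in completion order rather than blocking on the slowest request. `fake_request` is a stand-in for a real API call, not part of any actual client.

```python
import asyncio
import random

# Illustrative sketch: handle each result the moment it finishes,
# instead of waiting for the entire batch. fake_request simulates a
# backend call with variable latency and is invented for this example.

async def fake_request(prompt):
    await asyncio.sleep(random.uniform(0, 0.05))  # simulated variable latency
    return f"done:{prompt}"

async def consume_stream(prompts):
    tasks = [asyncio.create_task(fake_request(p)) for p in prompts]
    results = []
    for fut in asyncio.as_completed(tasks):  # yields futures in completion order
        results.append(await fut)            # process immediately, e.g. write to a sink
    return results

print(asyncio.run(consume_stream(["a", "b", "c"])))
```

Results arrive in completion order, so downstream processing starts as soon as the first request finishes.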
The cost difference is massive
Output price per 1M tokens · Qwen3.5-397B-class models · Batch tiers
Anthropic Batch · Claude Opus 4.5 · 24h
OpenAI Batch · GPT-5.2 · 24h
Doubleword · Qwen3.5-397B · Async
Doubleword · Qwen3.5-397B · Overnight (24h)
Prices from Artificial Analysis · Comparable intelligence-class models · Batch-tier pricing
Traditional batch APIs vs Doubleword
Built for your highest-volume workloads
Ready to simplify your batch inference?
Start sending requests in under 5 minutes. No .jsonl files, no polling, no rewrites.
Get started — Free