Same intelligence. Up to 99% cheaper.
One API. Open-weight models. Pick your delivery window — Async or Overnight — and pay only for what you use.
Same Intelligence. Fraction of the price.
Cost to process 1 billion tokens in + 1 billion tokens out at comparable intelligence.
Intelligence via Artificial Analysis Index v4.0 · Hover any bar for full pricing details · Want access to a model you don't see here — just ask us!
No credit card required · No minimum spend · Pay only for tokens used
Start buildingThree speeds. One API.
Pick the delivery window that fits your workflow. All tiers use the same OpenAI-compatible API.
Dev Mode (Real-time)
Iterate on prompts with real-time responses. Full price, zero wait.
Async Inference
up to 50% off RTBackground agents that need results fast. Async delivery with SLA guarantee.
Overnight (~24 hours)
up to 75% off RTBig batch jobs where cost matters most. Deepest discounts.
Model-by-model breakdown
| Model | SLA | Input $/MTok | Output $/MTok | Cost / 1B in+out | vs Big Token | |
|---|---|---|---|---|---|---|
| Kimi-K2.6New | Async | $0.70 | $3.00 | $3.7K | 79% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.45 | $2.00 | $2.5K | 86% cheaper | |
| GLM-5.1-FP8New | Async | $1.05 | $3.30 | $4.3K | 76% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.70 | $2.20 | $2.9K | 84% cheaper | |
| Qwen3.5-397B-A17BNew | Async | $0.30 | $1.80 | $2.1K | 93% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.15 | $1.20 | $1.3K | 96% cheaper | |
| Qwen3.6-35B-A3B-FP8New | Async | $0.07 | $0.30 | $370 | 98% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.05 | $0.20 | $250 | 99% cheaper | |
| Qwen3.5-35B-A3B-FP8New | Async | $0.07 | $0.30 | $370 | 94% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.05 | $0.20 | $250 | 96% cheaper | |
| Qwen3.5-4BNew | Async | $0.05 | $0.08 | $130 | 99% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.04 | $0.06 | $100 | 99% cheaper | |
| Qwen3.5-9BNew | Async | $0.04 | $0.35 | $390 | 94% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.03 | $0.29 | $320 | 95% cheaper | |
| Gemma-4-31BNew | Async | $0.11 | $0.30 | $410 | 93% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.07 | $0.20 | $270 | 96% cheaper | |
| Nemotron-3-Super-120B-A12BNew | Async | $0.23 | $0.56 | $790 | 87% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.15 | $0.38 | $530 | 91% cheaper | |
| GPT-OSS-20B | Async | $0.03 | $0.20 | $230 | 98% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.02 | $0.15 | $170 | 98% cheaper | |
| Qwen3-VL-235B-A22B | Async | $0.15 | $0.55 | $700 | 69% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.10 | $0.40 | $500 | 78% cheaper | |
| Qwen3-VL-30B-A3B | Async | $0.07 | $0.30 | $370 | 75% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.05 | $0.20 | $250 | 83% cheaper | |
| Qwen3-14B-FP8 | Async | $0.03 | $0.30 | $330 | 78% cheaper | Try Model API |
| ↳ | Overnight (24H) | $0.02 | $0.20 | $220 | 85% cheaper | |
| DeepSeek-OCR-2OCR | Async | $0.08 | $0.08 | $160 | — | Try Model API |
| ↳ | Overnight (24H) | $0.05 | $0.05 | $100 | — | |
| olmOCR-2-7BOCR | Async | $0.15 | $0.15 | $300 | — | Try Model API |
| ↳ | Overnight (24H) | $0.10 | $0.10 | $200 | — | |
| LightOnOCR-2-1BOCR | Async | $0.08 | $0.08 | $160 | — | Try Model API |
| ↳ | Overnight (24H) | $0.05 | $0.05 | $100 | — | |
| Qwen3-Embedding-8B | Async | $0.03 | — | $30 | — | Try Model API |
| ↳ | Overnight (24H) | $0.02 | — | $20 | — |
No surprises. No lock-in.
Stop overpaying for inference.
Run your background agents and workloads at a fraction of the price and double the scale.
If you can wait an hour, you can save a lot.
