GPT-OSS-20B
Powerful chain-of-thought reasoning in an efficient, highly scalable 20B parameter model.
- Total Parameters: 20B
- Context Window: 128K tokens
- Infrastructure: Single B200 GPU
- License: Apache 2.0
Scalable Open Reasoning
GPT-OSS-20B provides powerful chain-of-thought reasoning in an efficient 20B parameter model. Designed for single-GPU deployment while retaining sophisticated reasoning capabilities, this Apache 2.0 licensed model balances performance with resource efficiency. It uses a compact Mixture-of-Experts (MoE) design with SwiGLU activations, native FP4 quantization for fast inference, and an adjustable reasoning effort level for task-specific tuning.
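The reasoning effort level is selected per request. A minimal sketch, assuming the system-prompt convention used by the gpt-oss model family (`Reasoning: low|medium|high`); the `gpt-oss-20b` model identifier is an assumption — use the name your deployment exposes:

```python
# Sketch: selecting a reasoning effort level for a gpt-oss model.
# The "Reasoning: <level>" system-prompt directive is a gpt-oss convention;
# whether your serving stack honors it is deployment-dependent.

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completions payload with a reasoning-effort directive."""
    assert effort in {"low", "medium", "high"}
    return {
        "model": "gpt-oss-20b",  # assumed model identifier
        "messages": [
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": prompt},
        ],
    }

# The payload would be passed to an OpenAI-compatible endpoint, e.g.
# client.chat.completions.create(**build_request("Prove ...", "high"))
payload = build_request("Summarize the attached report.", "low")
print(payload["messages"][0]["content"])
```

Lower effort levels trade reasoning depth for latency and token cost, so routine tasks can run cheaply while hard problems get the full chain of thought.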
Compact MoE — Single-GPU Ready
Built for efficient reasoning
Development Applications
Accelerate engineering with rapid prototyping, robust code generation, automated API design, and system integration testing.
Business Solutions
Streamline enterprise operations through customer support automation, advanced content generation, and sophisticated market research analysis.
Educational Use Cases
Deploy interactive tutoring systems, support curriculum development, and assist with academic writing and research methodology guidance.
Edge & Scalable Deployment
Leverage cost-effective single-GPU operations for reduced infrastructure requirements, making it ideal for edge computing and distributed processing.
Efficient Intelligence
Strong performance across reasoning, coding, and agentic workflows within its 20B weight class.
Overall Intelligence
Better than 58% of models
Coding Capability
Better than 56% of models
Agentic Capability
Better than 61% of models
| Category | Benchmark | Score |
|---|---|---|
| Reasoning | GPQA Diamond | 68.8% |
| Reasoning | τ²-Bench Telecom | 60.2% |
| Reasoning | IFBench | 65.1% |
| Reasoning | AA-LCR | 30.7% |
| Reasoning | GDPval-AA | 9.2% |
| Reasoning | HLE | 9.8% |
| Reasoning | CritPt | 1.4% |
| Coding | SciCode | 34.4% |
| Coding | Terminal-Bench Hard | 10.6% |
| Knowledge | AA-Omniscience | 15.5% |
Metrics sourced from Artificial Analysis.
Flexible Pricing Tiers
Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.02 | $0.15 |
| Async | $0.03 | $0.20 |
| Realtime | $0.04 | $0.30 |
The context window natively supports up to 128K tokens.
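For a rough sense of scale, a sketch of the per-request cost at the Standard tier, using the prices from the table above:

```python
# Standard-tier prices from the table above, in dollars per 1M tokens.
INPUT_PRICE = 0.02
OUTPUT_PRICE = 0.15

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the Standard tier."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens.
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0005
```

At these rates, a million such requests would cost on the order of $500, which is why the single-GPU footprint pairs well with high-volume workloads.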
Start Building in Minutes
GPT-OSS-20B is accessible via OpenAI-compatible endpoints. The example below uses the standard OpenAI Python SDK against Doubleword.ai to submit a batch job and check its status.
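Step 1 of the workflow uploads a JSONL file in the OpenAI batch-input format, one request per line. A minimal sketch for preparing that file; the `gpt-oss-20b` model identifier is an assumption — use the name your deployment exposes:

```python
import json

# Each line of the batch input file is one chat-completion request,
# identified by a custom_id so results can be matched back later.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-oss-20b",  # assumed model identifier
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["What is 2 + 2?", "Name three prime numbers."])
]

with open("batch_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```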
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1"
)

# Step 1: Upload a batch input file
with open("batch_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch"
    )
print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)
print(f"Batch ID: {batch.id}")

# Step 3: Check batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")
```
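Once the batch reports `completed`, results can be downloaded and parsed. A sketch assuming the standard OpenAI batch-output format (one JSON object per line, keyed by `custom_id`); Doubleword.ai's compatibility with that format is an assumption to verify against their docs:

```python
import json

def parse_batch_output(jsonl_text: str) -> dict:
    """Map each request's custom_id to the assistant's reply text."""
    results = {}
    for line in jsonl_text.splitlines():
        record = json.loads(line)
        body = record["response"]["body"]
        results[record["custom_id"]] = body["choices"][0]["message"]["content"]
    return results

# Step 4 (sketch): download and parse results once status == "completed".
# output_text = client.files.content(batch_status.output_file_id).text
# answers = parse_batch_output(output_text)

# Local demonstration with a synthetic output line:
sample = json.dumps({
    "custom_id": "request-0",
    "response": {"body": {"choices": [{"message": {"content": "4"}}]}},
})
print(parse_batch_output(sample))  # → {'request-0': '4'}
```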