Doubleword
    MoE Architecture
    Apache 2.0
    Single-GPU Ready

    GPT-OSS-20B

    Powerful chain-of-thought reasoning in an efficient, highly scalable 20B parameter model.

    Total Parameters

    20B

    Context Window

    128K

    Tokens

    Infrastructure

    Single

    B200 GPU

    License

    Apache

    2.0

    About

    Scalable Open Reasoning

    GPT-OSS-20B delivers powerful chain-of-thought reasoning in an efficient 20B-parameter model. Designed for single-GPU deployment without sacrificing sophisticated reasoning, this Apache 2.0-licensed model balances performance with resource efficiency. It uses a compact Mixture-of-Experts (MoE) design with SwiGLU activations, native FP4 quantization for fast inference, and an adjustable reasoning-effort level for task-specific tuning.


    Compact MoE — Single-GPU Ready

    Use Cases

    Built for efficient reasoning

    Development Applications

    Accelerate engineering with rapid prototyping, robust code generation, automated API design, and system integration testing.

    Business Solutions

    Streamline enterprise operations through customer support automation, advanced content generation, and sophisticated market research analysis.

    Educational Use Cases

    Deploy interactive tutoring systems, support curriculum development, and assist with academic writing and research methodology guidance.

    Edge & Scalable Deployment

    Leverage cost-effective single-GPU operations for reduced infrastructure requirements, making it ideal for edge computing and distributed processing.

    Benchmarks

    Efficient Intelligence

    Proven performance across reasoning, coding, and agentic workflows for the 20B weight class.

    24.5

    Overall Intelligence

    Better than 58% of models

    18.5

    Coding Capability

    Better than 56% of models

    27.6

    Agentic Capability

    Better than 61% of models

    Category     Benchmark              Score
    Reasoning    GPQA Diamond           68.8%
    Reasoning    τ²-Bench Telecom       60.2%
    Reasoning    IFBench                65.1%
    Reasoning    AA-LCR                 30.7%
    Reasoning    GDPval-AA               9.2%
    Reasoning    HLE                     9.8%
    Reasoning    CritPt                  1.4%
    Coding       SciCode                34.4%
    Coding       Terminal-Bench Hard    10.6%
    Knowledge    AA-Omniscience         15.5%

    Metrics sourced from Artificial Analysis.

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    Tier         Input / 1M tokens    Output / 1M tokens
    Standard     $0.02                $0.15
    Async        $0.03                $0.20
    Realtime     $0.04                $0.30

    The context window natively supports up to 128K tokens.
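As a rough illustration of the pricing table above, per-request cost can be estimated from token counts. This is a minimal sketch with the page's listed prices hardcoded; verify against current pricing before relying on it.

```python
# Per-1M-token prices (USD) from the pricing table on this page.
# These are hardcoded for illustration; check current pricing before use.
PRICES = {
    "standard": {"input": 0.02, "output": 0.15},
    "async":    {"input": 0.03, "output": 0.20},
    "realtime": {"input": 0.04, "output": 0.30},
}

def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for one request at the given pricing tier."""
    p = PRICES[tier]
    return (input_tokens / 1_000_000) * p["input"] \
         + (output_tokens / 1_000_000) * p["output"]

# Example: 500K input tokens and 100K output tokens on the Standard tier.
cost = estimate_cost("standard", 500_000, 100_000)
print(f"${cost:.4f}")  # 0.5 * $0.02 + 0.1 * $0.15 = $0.0250
```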

    Quickstart

    Start Building in Minutes

    GPT-OSS-20B is accessible via OpenAI-compatible endpoints. The example below uses the standard OpenAI Python SDK to submit a batch job through the Doubleword API.

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file
    with open("batch_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")
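The script above assumes a `batch_requests.jsonl` file already exists. A minimal sketch of how such a file could be built, following the OpenAI Batch API line format (one JSON request per line); the model identifier `gpt-oss-20b` is an assumption here, so confirm the exact name against Doubleword's model listing.

```python
import json

# Each line of the batch input file is one self-contained request:
# a custom_id for matching results, the HTTP method, the target
# endpoint, and the chat-completion request body.
prompts = [
    "Explain MoE routing in two sentences.",
    "Summarize chain-of-thought prompting.",
]

requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-oss-20b",  # assumed model ID; verify the exact name
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 512,
        },
    }
    for i, prompt in enumerate(prompts)
]

with open("batch_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```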