Gemma 4 31B
Google DeepMind's most capable open model — advanced reasoning, coding, and multimodal understanding with native function calling and 256K context.
Total Parameters
31B
Dense
Context Window
256K
Tokens
Intelligence
39
AA Index v4.0
License
Apache 2.0
Open Weights
Google's Most Capable Open Model
Gemma 4 31B is Google DeepMind's most capable open model, built for advanced reasoning, coding, and multimodal understanding. It sits in the same general tier as Claude 4.5 Haiku and NVIDIA Nemotron 3 Super, with native function calling and structured JSON output for agentic workflows.
The model features strong image and video understanding for tasks like OCR and chart analysis, 256K context for long documents and repositories, and support for 140+ languages — all under the permissive Apache 2.0 license.
Dense Transformer — 31B Parameters
Built for multimodal workloads
Agentic Workflows
Native function calling and structured JSON output for autonomous multi-step agent pipelines with self-correction.
Vision & OCR
Strong image and video understanding for chart analysis, document OCR, and multimodal reasoning tasks.
Long-Context Processing
256K context window supports full-repository code analysis, long legal documents, and multi-document synthesis.
Coding & Reasoning
Advanced code generation, debugging, and scientific reasoning with strong benchmark performance across GPQA and SciCode.
Frontier-Class Intelligence
Artificial Analysis Intelligence Index v4.0 scores for the 31B weight class.
Intelligence Index
Better than 78% of models
GPQA Diamond
Better than 90% of models
τ²-Bench Telecom
Better than 70% of models
| Category | Benchmark | Score |
|---|---|---|
| Reasoning | GPQA Diamond | 86% |
| Reasoning | Humanity's Last Exam | 23% |
| Reasoning | τ²-Bench Telecom | 65% |
| Reasoning | AA-LCR | 60% |
| Reasoning | IFBench | 76% |
| Reasoning | GDPval-AA | 39.2% |
| Coding | SciCode | 43% |
| Coding | Terminal-Bench Hard | 36% |
| Knowledge | AA-Omniscience Accuracy | 20% |
| Knowledge | AA-Omniscience Non-Hallucination | 19% |
Metrics sourced from Artificial Analysis. Evaluated in reasoning mode.
Flexible Pricing Tiers
Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.07 | $0.20 |
| Async | $0.11 | $0.30 |
| Realtime | $0.14 | $0.40 |
Context window natively supported up to 256K tokens.
Start Building in Minutes
Gemma 4 31B is accessible via OpenAI-compatible endpoints. Here is how to integrate it using the standard Python SDK via Doubleword.ai.
from openai import OpenAI
client = OpenAI(
api_key="your-api-key-here",
base_url="https://api.doubleword.ai/v1"
)
# Step 1: Upload a batch input file
with open("batch_requests.jsonl", "rb") as file:
batch_file = client.files.create(
file=file,
purpose="batch"
)
print(f"File ID: {batch_file.id}")
# Step 2: Create a batch job
batch = client.batches.create(
input_file_id=batch_file.id,
endpoint="/v1/chat/completions",
completion_window="24h"
)
print(f"Batch ID: {batch.id}")
# Step 3: Check batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")💡 Pro Tip
Gemma 4 supports native function calling — pass your tools in the tools parameter for structured agentic workflows with JSON schema validation.
