Dense Transformer

Multimodal

Open Weights

Google

Apache 2.0

Gemma 4 31B

Google DeepMind's most capable open model — advanced reasoning, coding, and multimodal understanding with native function calling and 256K context.

Get API Key Test in Playground

Total Parameters

31B

Dense

Context Window

256K

Tokens

Intelligence

AA Index v4.0

License

Apache 2.0

Open Weights

About

Google's Most Capable Open Model

Gemma 4 31B is Google DeepMind's most capable open model, built for advanced reasoning, coding, and multimodal understanding. It sits in the same general tier as Claude 4.5 Haiku and NVIDIA Nemotron 3 Super, with native function calling and structured JSON output for agentic workflows.

The model features strong image and video understanding for tasks like OCR and chart analysis, 256K context for long documents and repositories, and support for 140+ languages — all under the permissive Apache 2.0 license.

Text

Image

Video

JSON

Code

140+

Dense Transformer — 31B Parameters

Use Cases

Built for multimodal workloads

Agentic Workflows

Native function calling and structured JSON output for autonomous multi-step agent pipelines with self-correction.

Vision & OCR

Strong image and video understanding for chart analysis, document OCR, and multimodal reasoning tasks.

Long-Context Processing

256K context window supports full-repository code analysis, long legal documents, and multi-document synthesis.

Coding & Reasoning

Advanced code generation, debugging, and scientific reasoning with strong benchmark performance across GPQA and SciCode.

Benchmarks

Frontier-Class Intelligence

Artificial Analysis Intelligence Index v4.0 scores for the 31B weight class.

Intelligence Index

Better than 78% of models

GPQA Diamond

Better than 90% of models

τ²-Bench Telecom

Better than 70% of models

Category	Benchmark	Score	Description
Reasoning	GPQA Diamond	86%	Graduate-level scientific reasoning
Reasoning	Humanity's Last Exam	23%	Humanity's Last Exam
Reasoning	τ²-Bench Telecom	65%	AI agents in dual-control scenarios
Reasoning	AA-LCR	60%	Long context reasoning evaluation
Reasoning	IFBench	76%	Instruction-following accuracy
Reasoning	GDPval-AA	39.2%	Agentic performance on real-world work tasks
Coding	SciCode	43%	Python for scientific computing
Coding	Terminal-Bench Hard	36%	Agentic coding & terminal use
Knowledge	AA-Omniscience Accuracy	20%	Proportion of correctly answered questions
Knowledge	AA-Omniscience Non-Hallucination	19%	Confidently answered questions that are correct

Metrics sourced from Artificial Analysis. Evaluated in reasoning mode.

Pricing

Flexible Pricing Tiers

Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

Tier	Input / 1M tokens	Output / 1M tokens
Standard	$0.07	$0.20
Async	$0.11	$0.30
Realtime	$0.14	$0.40

Context window natively supported up to 256K tokens.

Quickstart

Start Building in Minutes

Gemma 4 31B is accessible via OpenAI-compatible endpoints. Here is how to integrate it using the standard Python SDK via Doubleword.ai.

Python

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1"
)

# Step 1: Upload a batch input file
with open("batch_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch"
    )

print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

print(f"Batch ID: {batch.id}")

# Step 3: Check batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")

💡 Pro Tip

Gemma 4 supports native function calling — pass your tools in the tools parameter for structured agentic workflows with JSON schema validation.

Ready to deploy Gemma 4 31B?

Get Your API Keys Read the Full Documentation