Doubleword
    Dense Transformer
    Multimodal
    Open Weights
    Google
    Apache 2.0

    Gemma 4 31B

    Google DeepMind's most capable open model — advanced reasoning, coding, and multimodal understanding with native function calling and 256K context.

    Total Parameters

    31B

    Dense

    Context Window

    256K

    Tokens

    Intelligence

    39

    AA Index v4.0

    License

    Apache 2.0

    Open Weights

    About

    Google's Most Capable Open Model

    Gemma 4 31B is Google DeepMind's most capable open model, built for advanced reasoning, coding, and multimodal understanding. It sits in the same general tier as Claude 4.5 Haiku and NVIDIA Nemotron 3 Super, with native function calling and structured JSON output for agentic workflows.

    The model features strong image and video understanding for tasks like OCR and chart analysis, 256K context for long documents and repositories, and support for 140+ languages — all under the permissive Apache 2.0 license.

    Text
    Image
    Video
    JSON
    Code
    140+

    Dense Transformer — 31B Parameters

    Use Cases

    Built for multimodal workloads

    Agentic Workflows

    Native function calling and structured JSON output for autonomous multi-step agent pipelines with self-correction.

    Vision & OCR

    Strong image and video understanding for chart analysis, document OCR, and multimodal reasoning tasks.

    Long-Context Processing

    256K context window supports full-repository code analysis, long legal documents, and multi-document synthesis.

    Coding & Reasoning

    Advanced code generation, debugging, and scientific reasoning with strong benchmark performance across GPQA and SciCode.

    Benchmarks

    Frontier-Class Intelligence

    Artificial Analysis Intelligence Index v4.0 scores for the 31B weight class.

    39

    Intelligence Index

    Better than 78% of models

    86

    GPQA Diamond

    Better than 90% of models

    65

    τ²-Bench Telecom

    Better than 70% of models

    CategoryBenchmarkScore
    ReasoningGPQA Diamond86%
    ReasoningHumanity's Last Exam23%
    Reasoningτ²-Bench Telecom65%
    ReasoningAA-LCR60%
    ReasoningIFBench76%
    ReasoningGDPval-AA39.2%
    CodingSciCode43%
    CodingTerminal-Bench Hard36%
    KnowledgeAA-Omniscience Accuracy20%
    KnowledgeAA-Omniscience Non-Hallucination19%

    Metrics sourced from Artificial Analysis. Evaluated in reasoning mode.

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    TierInput / 1M tokensOutput / 1M tokens
    Standard$0.07$0.20
    Async$0.11$0.30
    Realtime$0.14$0.40

    Context window natively supported up to 256K tokens.

    Quickstart

    Start Building in Minutes

    Gemma 4 31B is accessible via OpenAI-compatible endpoints. Here is how to integrate it using the standard Python SDK via Doubleword.ai.

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file
    with open("batch_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")

    💡 Pro Tip

    Gemma 4 supports native function calling — pass your tools in the tools parameter for structured agentic workflows with JSON schema validation.