Doubleword
    Tags: New, NVIDIA, Open Weights, Reasoning, 550B / 55B Active

    Nemotron-3-Ultra-550B-A55B

    NVIDIA's strongest open-weights reasoning model, scoring near GPT-5.4 Mini (xhigh) on the Artificial Analysis Intelligence Index and ahead of DeepSeek V4-Flash and Qwen3.5-397B-A17B.

    Total Parameters: 550B (55B active)
    Context Window: 262K tokens
    Released: Jun 2026 (open weights)
    Architecture: MoE (reasoning)

    About

    NVIDIA's Flagship Open Reasoning Model

    NVIDIA Nemotron 3 Ultra is the strongest open-weights model in the Nemotron 3 family — 550B total parameters with 55B active, purpose-built for high-stakes reasoning, agentic workflows, tool use, and multilingual instruction following. It lands near GPT-5.4 Mini (xhigh) on the Artificial Analysis Intelligence Index and ahead of DeepSeek V4-Flash and Qwen3.5-397B-A17B, while staying open and self-hostable.

    [Diagram: MoE, 55B active of 550B total parameters]

    Best for

    Built for high-stakes reasoning

    Agentic Workflows

    Long-horizon planning, self-correction, and autonomous decision-making for multi-step agent stacks.

    Tool Use

    Reliable native function calling and tool orchestration for production agentic pipelines.

    High-Stakes RAG

    Grounded answers over large knowledge bases — 262K context for whole-corpus reasoning.

    Complex Instruction Following

    Robust adherence to nested, multi-constraint instructions across multilingual prompts.

    Benchmarks

    Frontier-class open intelligence

    Artificial Analysis Intelligence Index v4.0 — Nemotron 3 Ultra vs comparable open & closed models.

    Intelligence Index: 48 (AA Intelligence Index v4.0)
    GPQA Diamond: 87% (AA Intelligence Index v4.0)
    IFBench: 81% (AA Intelligence Index v4.0)

    Category     Benchmark                          Score
    Agentic      GDPval-AA                          44%
    Agentic      τ²-Bench Telecom                   83%
    Coding       Terminal-Bench Hard                36%
    Coding       SciCode                            40%
    Reasoning    AA-LCR                             67%
    Reasoning    GPQA Diamond                       87%
    Reasoning    Humanity's Last Exam               27%
    Instruction  IFBench                            81%
    Knowledge    AA-Omniscience Accuracy            22%
    Knowledge    AA-Omniscience Non-Hallucination   71%

    Metrics sourced from Artificial Analysis Intelligence Index v4.0. Evaluated in regular (highest-effort) reasoning mode.

    Pricing

    Pick your delivery window

    Same model, three speeds. Prices are per 1M tokens.

    Tier                Input / 1M tokens   Output / 1M tokens
    Batch (save 50%)    $0.25               $1.25
    Async (save 26%)    $0.37               $1.87
    Realtime            $0.50               $2.50
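
    To make the tier prices concrete, the sketch below estimates what a workload would cost on each tier from its token counts. The prices are copied from the table above; the `estimate_cost` helper and the example workload are illustrative assumptions, not part of any Doubleword SDK.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "batch": {"input": Decimal("0.25"), "output": Decimal("1.25")},
    "async": {"input": Decimal("0.37"), "output": Decimal("1.87")},
    "realtime": {"input": Decimal("0.50"), "output": Decimal("2.50")},
}

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one workload on the given tier."""
    price = PRICES[tier]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / ONE_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 40M input tokens and 8M output tokens.
    for tier in PRICES:
        print(f"{tier:9s} ${estimate_cost(tier, 40_000_000, 8_000_000):.2f}")
```

    For this hypothetical workload the batch tier comes to $20.00, async to $29.76, and realtime to $40.00.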

    Context window natively supported up to 262K tokens.
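
    Before sending a large retrieval prompt, it can help to check that the selected documents fit in the context window. The sketch below is only a rough budget check: it assumes "262K" means 262,144 tokens and uses a 4-characters-per-token heuristic, neither of which is stated on this page. An exact count needs the model's own tokenizer. The helper names are hypothetical.

```python
CONTEXT_WINDOW = 262_144  # assumption: "262K" read as 2**18 tokens
CHARS_PER_TOKEN = 4       # rough heuristic for English text, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its character length."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def select_documents(docs: list[str], reserved_for_output: int = 8_192) -> list[str]:
    """Keep documents in order until the next one would exceed the input budget."""
    budget = CONTEXT_WINDOW - reserved_for_output
    selected = []
    for doc in docs:
        cost = estimate_tokens(doc)
        if cost > budget:
            break  # stop at the first document that no longer fits
        selected.append(doc)
        budget -= cost
    return selected
```

    Reserving part of the window for the model's output keeps a long reasoning trace from being cut off by an over-full prompt.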

    Quickstart

    Start Building in Minutes

    Nemotron-3-Ultra-550B is accessible via OpenAI-compatible endpoints on Doubleword.ai.

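    Python
    from openai import OpenAI
    
    # Point the standard OpenAI client at Doubleword's OpenAI-compatible endpoint.
    client = OpenAI(
        api_key="your-api-key-here",  # replace with your Doubleword API key
        base_url="https://api.doubleword.ai/v1",
    )
    
    # Send a single-turn chat request to Nemotron 3 Ultra.
    response = client.chat.completions.create(
        model="nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B",
        messages=[
            {"role": "user", "content": "Plan a 3-step research workflow for evaluating an open-weights LLM."}
        ],
    )
    
    # Print the text of the first completion.
    print(response.choices[0].message.content)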
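
    Function calling, listed under Tool Use above, can in principle go through the same OpenAI-compatible endpoint. The sketch below assumes Doubleword passes the standard `tools` parameter through to the model, which this page does not confirm. The `get_weather` tool, its schema, and the `run_tool_call` and `ask_with_tools` helpers are hypothetical names used for illustration.

```python
import json
from typing import Optional

MODEL = "nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B"

# Hypothetical tool, described in the OpenAI function-calling schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]


def get_weather(city: str) -> dict:
    """Stand-in implementation; a real tool would query a weather service."""
    return {"city": city, "forecast": "sunny"}


LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch one model-requested tool call and return its result as JSON."""
    result = LOCAL_TOOLS[name](**json.loads(arguments_json))
    return json.dumps(result)


def ask_with_tools(question: str, api_key: str) -> Optional[str]:
    """Send a question, run any tools the model requests, return the final answer."""
    from openai import OpenAI  # imported here so the helpers above work offline

    client = OpenAI(api_key=api_key, base_url="https://api.doubleword.ai/v1")
    messages = [{"role": "user", "content": question}]
    response = client.chat.completions.create(
        model=MODEL, messages=messages, tools=TOOLS
    )
    message = response.choices[0].message
    if not message.tool_calls:
        return message.content  # the model answered without using a tool

    # Echo the assistant turn back, then attach one result per tool call.
    messages.append(message)
    for call in message.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool_call(call.function.name, call.function.arguments),
        })
    final = client.chat.completions.create(
        model=MODEL, messages=messages, tools=TOOLS
    )
    return final.choices[0].message.content
```

    This handles a single round of tool calls. An agent loop would repeat the request until the model stops asking for tools.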

    Ready to deploy Nemotron-3-Ultra-550B?