New

NVIDIA

Open Weights

Reasoning

550B / 55B Active

Nemotron-3-Ultra-550B-A55B

NVIDIA's strongest open-weights reasoning model — positioned near GPT-5.4 Mini (xhigh) and ahead of DeepSeek V4-Flash and Qwen3.5-397B-A17B.


Spec	Value	Note
Total Parameters	550B	55B Active
Context Window	262K	Tokens
Released	Jun 2026	Open Weights
Architecture	MoE	Reasoning

About

NVIDIA's Flagship Open Reasoning Model

NVIDIA Nemotron 3 Ultra is the strongest open-weights model in the Nemotron 3 family — 550B total parameters with 55B active, purpose-built for high-stakes reasoning, agentic workflows, tool use, and multilingual instruction following. It lands near GPT-5.4 Mini (xhigh) on the Artificial Analysis Intelligence Index and ahead of DeepSeek V4-Flash and Qwen3.5-397B-A17B, while staying open and self-hostable.

MoE — 55B Active of 550B

Best for

Built for high-stakes reasoning

Agentic Workflows

Long-horizon planning, self-correction, and autonomous decision-making for multi-step agent stacks.

Tool Use

Reliable native function calling and tool orchestration for production agentic pipelines.

High-Stakes RAG

Grounded answers over large knowledge bases — 262K context for whole-corpus reasoning.

Complex Instruction Following

Robust adherence to nested, multi-constraint instructions across multilingual prompts.
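As a rough illustration of the tool-use workflow described above, the sketch below defines one tool in the OpenAI chat-completions `tools` format and a small local dispatcher for the tool calls a model returns. It assumes the Doubleword endpoint accepts the standard `tools` parameter for this model, which this page does not confirm. The tool `get_benchmark_score`, the `SCORES` lookup, and the `dispatch` helper are hypothetical names introduced for illustration; the scores themselves are copied from the benchmark table on this page.

```python
import json

# Scores copied from the benchmark table on this page (percent).
SCORES = {"GPQA Diamond": 87, "IFBench": 81, "SciCode": 40}


def get_benchmark_score(benchmark: str) -> str:
    """Hypothetical local tool: look up a score and return it as a JSON string."""
    if benchmark not in SCORES:
        return json.dumps({"error": f"unknown benchmark: {benchmark}"})
    return json.dumps({"benchmark": benchmark, "score_percent": SCORES[benchmark]})


# Tool description in the OpenAI chat-completions `tools` format.
# Assumption: the endpoint forwards this schema to the model unchanged.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_benchmark_score",
            "description": "Look up a benchmark score for Nemotron 3 Ultra.",
            "parameters": {
                "type": "object",
                "properties": {"benchmark": {"type": "string"}},
                "required": ["benchmark"],
            },
        },
    }
]

# Map tool names to the local Python functions that implement them.
HANDLERS = {"get_benchmark_score": get_benchmark_score}


def dispatch(name: str, arguments_json: str) -> str:
    """Run the handler named by a tool call, given its JSON-encoded arguments."""
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    return handler(**json.loads(arguments_json))
```

In a chat loop, `TOOLS` would be passed as `tools=TOOLS` to `client.chat.completions.create`. For each entry in `response.choices[0].message.tool_calls`, the result of `dispatch(call.function.name, call.function.arguments)` would then be sent back as a message with role `"tool"` and the matching `tool_call_id`.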

Benchmarks

Frontier-class open intelligence

Artificial Analysis Intelligence Index v4.0 — Nemotron 3 Ultra vs comparable open & closed models.

[Charts: Intelligence Index, GPQA Diamond, and IFBench comparisons, each from AA Intelligence Index v4.0]

Category	Benchmark	Score	Description
Agentic	GDPval-AA	44%	Agentic performance on real-world work tasks
Agentic	τ²-Bench Telecom	83%	AI agents in dual-control scenarios
Coding	Terminal-Bench Hard	36%	Agentic coding & terminal use
Coding	SciCode	40%	Python for scientific computing
Reasoning	AA-LCR	67%	Long context reasoning evaluation
Reasoning	GPQA Diamond	87%	Graduate-level scientific reasoning
Reasoning	Humanity's Last Exam	27%	Reasoning & knowledge
Instruction	IFBench	81%	Instruction-following accuracy
Knowledge	AA-Omniscience Accuracy	22%	Proportion of correctly answered questions
Knowledge	AA-Omniscience Non-Hallucination	71%	1 − hallucination rate

Metrics sourced from Artificial Analysis Intelligence Index v4.0. Evaluated in regular (highest-effort) reasoning mode.

Pricing

Pick your delivery window

Same model, three speeds. Prices are per 1M tokens.

Tier	Input / 1M tokens	Output / 1M tokens
Batch (save 50%)	$0.25	$1.25
Async (save 26%)	$0.37	$1.87
Realtime	$0.50	$2.50

The model natively supports a context window of up to 262K tokens.
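To compare tiers for a specific workload, the per-1M-token prices in the table above can be turned into a quick estimate. The sketch below is a minimal calculator, not an official billing formula: the helper name `estimate_cost` and the example token counts are assumptions for illustration, and it ignores any rounding, minimums, or discounts a real invoice might apply.

```python
# Per-1M-token prices in USD, copied from the pricing table: (input, output).
PRICES = {
    "batch": (0.25, 1.25),
    "async": (0.37, 1.87),
    "realtime": (0.50, 2.50),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload on the given tier."""
    input_price, output_price = PRICES[tier]
    # Prices are quoted per 1M tokens, so scale the token counts down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    for tier in PRICES:
        cost = estimate_cost(tier, 10_000_000, 2_000_000)
        print(f"{tier:>8}: ${cost:.2f}")
```

For that hypothetical workload the estimate comes to about $5.00 on Batch, $7.44 on Async, and $10.00 on Realtime, so the choice mostly depends on how long the results can wait.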

Quickstart

Start Building in Minutes

Nemotron-3-Ultra-550B is accessible via OpenAI-compatible endpoints on Doubleword.ai.

Python

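from openai import OpenAI

# Point the standard OpenAI client at Doubleword's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="your-api-key-here",  # replace with your own API key
    base_url="https://api.doubleword.ai/v1",
)

# Send a single-turn chat request to Nemotron 3 Ultra.
response = client.chat.completions.create(
    model="nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B",
    messages=[
        {"role": "user", "content": "Plan a 3-step research workflow for evaluating an open-weights LLM."}
    ],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)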
