NVIDIA: Nemotron 3 Ultra (free)

Name: NVIDIA: Nemotron 3 Ultra (free)
Brand: nvidia
Price: 6e-7 USD
Availability: InStock
Rating: 1.6 (8 reviews)

Back

Text input Text output

Author's Description

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Key Specifications

Cost

$$$$

Context

512K

Parameters

550B

Released

Jun 03, 2026

Speed

★

Ability

★

Reliability

★

Hugging Face

Supported Parameters

This model supports the following parameters:

Stop Max Tokens Structured Outputs Seed Reasoning Top P Frequency Penalty Presence Penalty Temperature Include Reasoning Logit Bias Tools Tool Choice Response Format Min P

Features

This model supports the following features:

Response Format Tools Structured Outputs Reasoning

Performance Summary

NVIDIA Nemotron 3 Ultra demonstrates moderate speed performance, ranking in the 31st percentile across benchmarks, and offers competitive pricing, placing it in the 50th percentile. The model excels in specific areas, achieving perfect accuracy in both Instruction Following and Email Classification. For these tasks, it stands out as the most accurate model at its price point and among models of comparable speed. However, its performance varies significantly across other categories. It shows a reasonable ability to acknowledge uncertainty, with 95.2% accuracy in the Hallucinations benchmark. Conversely, the model exhibits notable weaknesses in complex reasoning tasks, scoring only 39.3% in Reasoning, and particularly struggles with specialized knowledge and problem-solving. Its accuracy in Coding (22.9%), General Knowledge (28.6%), Ethics (7.7%), and Mathematics (15.0%) is considerably low, placing it in the lower percentiles for these categories. The model's architecture, a hybrid Transformer-Mamba mixture-of-experts with 55B active parameters, suggests a design optimized for certain types of tasks, while indicating areas for further development in others.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.6
Completion	$3.6
Input Cache Read	$0.2

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
DeepInfra	DeepInfra \| nvidia/nemotron-3-ultra-550b-a55b-20260604	262K	$0 / 1M tokens	$0 / 1M tokens
Together	Together \| nvidia/nemotron-3-ultra-550b-a55b-20260604	512K	$0.6 / 1M tokens	$3.6 / 1M tokens
Nebius	Nebius \| nvidia/nemotron-3-ultra-550b-a55b-20260604	8K	$0 / 1M tokens	$0 / 1M tokens
DeepInfra	DeepInfra \| nvidia/nemotron-3-ultra-550b-a55b-20260604	262K	$0.5 / 1M tokens	$2.2 / 1M tokens
Venice	Venice \| nvidia/nemotron-3-ultra-550b-a55b-20260604	256K	$0.625 / 1M tokens	$3.13 / 1M tokens
BaseTen	BaseTen \| nvidia/nemotron-3-ultra-550b-a55b-20260604	202K	$0.6 / 1M tokens	$2.4 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by nvidia

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
NVIDIA: Nemotron 3.5 Content Safety (free) Unavailable	Jun 04, 2026	~4B	N/A	Image input Text input Text output	—	—	—
NVIDIA: Nemotron 3 Nano Omni (free) Unavailable	Apr 28, 2026	30B	N/A	Image input Audio input Text input Video input Text output	—	—	—
NVIDIA: Nemotron 3 Super (free)	Mar 11, 2026	120B	262K	Text input Text output	★★★	★★★	$$$$
NVIDIA: Nemotron 3 Nano 30B A3B (free)	Dec 14, 2025	30B	262K	Text input Text output	★★★	★★★★★	$$$
NVIDIA: Nemotron Nano 12B 2 VL (free)	Oct 28, 2025	12B	131K	Image input Text input Video input Text output	★	★★	$$$$
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 Unavailable	Oct 10, 2025	49B	131K	Text input Text output	★★	★★★★	$$$$
NVIDIA: Nemotron Nano 9B V2 (free)	Sep 05, 2025	9B	128K	Text input Text output	★	★★	$
NVIDIA: Llama 3.3 Nemotron Super 49B v1 Unavailable	Apr 08, 2025	49B	131K	Text input Text output	★★★★	★★	$$
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 Unavailable	Apr 08, 2025	253B	131K	Text input Text output	★★	★★	$$$$
NVIDIA: Llama 3.1 Nemotron 70B Instruct Unavailable	Oct 14, 2024	70B	131K	Text input Text output	★★★	★★	$$