Author's Description
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...
Performance Summary
NVIDIA Nemotron 3 Ultra is moderately fast, ranking in the 31st percentile for speed across benchmarks, and is priced mid-range, in the 50th percentile. It achieves perfect accuracy in both Instruction Following and Email Classification, where it is the most accurate model at its price point and among models of comparable speed. It also acknowledges uncertainty well, with 95.2% accuracy in the Hallucinations benchmark.

Results in other categories are much weaker. The model scores 39.3% in Reasoning and struggles with specialized knowledge and problem-solving: its accuracy in Coding (22.9%), General Knowledge (28.6%), Ethics (7.7%), and Mathematics (15.0%) places it in the lower percentiles for these categories. This uneven profile suggests that the hybrid Transformer-Mamba mixture-of-experts design, with 55B active parameters, is better suited to some task types than to others.
Model Pricing
Current Pricing
| Token type | Price (USD per 1M tokens) |
|---|---|
| Prompt | $0.50 |
| Completion | $2.50 |
| Input Cache Read | $0.15 |
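The per-million-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, not an official billing formula: the function name `estimate_cost` and its parameters are hypothetical, and it assumes that cached prompt tokens are billed at the Input Cache Read rate instead of the Prompt rate, which the page does not state.

```python
# Rates in USD per 1M tokens, copied from the Current Pricing table.
PRICE_PER_MILLION = {
    "prompt": 0.50,
    "completion": 2.50,
    "cache_read": 0.15,
}


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  cached_prompt_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached prompt tokens are charged at the cache-read rate
    and the remaining prompt tokens at the regular prompt rate.
    """
    if min(prompt_tokens, completion_tokens, cached_prompt_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_prompt_tokens > prompt_tokens:
        raise ValueError("cached tokens cannot exceed prompt tokens")

    uncached_prompt_tokens = prompt_tokens - cached_prompt_tokens
    total = (
        uncached_prompt_tokens * PRICE_PER_MILLION["prompt"]
        + cached_prompt_tokens * PRICE_PER_MILLION["cache_read"]
        + completion_tokens * PRICE_PER_MILLION["completion"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens, nothing cached.
    print(f"${estimate_cost(10_000, 2_000):.4f}")
```

Under these assumptions, completion tokens dominate the bill for long outputs, since they cost five times as much as prompt tokens.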
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| DeepInfra | nvidia/nemotron-3-ultra-550b-a55b-20260604 | 262K | $0.50 / 1M tokens | $2.50 / 1M tokens |
Other Models by NVIDIA
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| NVIDIA: Nemotron 3 Super | Mar 11, 2026 | 120B | 262K | Text input, Text output | ★★★ | ★★★ | $$$$ |
| NVIDIA: Nemotron 3 Nano 30B A3B | Dec 14, 2025 | 30B | 262K | Text input, Text output | ★★★ | ★★★★★ | $$$ |
| NVIDIA: Nemotron Nano 12B 2 VL (Unavailable) | Oct 28, 2025 | 12B | 131K | Video input, Text input, Image input, Text output | ★ | ★★ | $$$$ |
| NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | Oct 10, 2025 | 49B | 131K | Text input, Text output | ★★ | ★★★★ | $$$$ |
| NVIDIA: Nemotron Nano 9B V2 | Sep 05, 2025 | 9B | 128K | Text input, Text output | ★ | ★★ | $ |
| NVIDIA: Llama 3.3 Nemotron Super 49B v1 (Unavailable) | Apr 08, 2025 | 49B | 131K | Text input, Text output | ★★★ | ★★ | $$ |
| NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 (Unavailable) | Apr 08, 2025 | 253B | 131K | Text input, Text output | ★ | ★★ | $$$$ |
| NVIDIA: Llama 3.1 Nemotron 70B Instruct (Unavailable) | Oct 14, 2024 | 70B | 131K | Text input, Text output | ★★★ | ★★ | $$ |