NVIDIA: Nemotron 3 Super

Name: NVIDIA: Nemotron 3 Super
Brand: nvidia
Price: 9e-8 USD
Availability: InStock
Rating: 2.9 (8 reviews)

Back

Text input Text output

Author's Description

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Key Specifications

Cost

$$$$

Context

262K

Parameters

120B

Released

Mar 11, 2026

Speed

★★★

Ability

★★★

Reliability

★★

Hugging Face

Supported Parameters

This model supports the following parameters:

Include Reasoning Response Format Temperature Max Tokens Reasoning Presence Penalty Top P Frequency Penalty

Features

This model supports the following features:

Reasoning Response Format

Performance Summary

NVIDIA Nemotron 3 Super, a 120B-parameter open hybrid MoE model, demonstrates competitive response times, ranking in the 53rd percentile for speed across various benchmarks. Its pricing is moderate, placing it in the 37th percentile. Notably, the model exhibits exceptional reliability with a 98% success rate, indicating consistent and usable responses. In terms of performance across categories, Nemotron 3 Super shows a significant strength in Reasoning, achieving 94.0% accuracy (80th percentile), and strong ethical adherence with 99.0% accuracy (53rd percentile). Its multi-environment RL training appears to contribute to these areas. However, the model struggles with Instruction Following (33.0% accuracy, 28th percentile) and Email Classification (89.0% accuracy, 11th percentile), suggesting areas for improvement in precise directive execution and nuanced categorization. Hallucinations are also a notable weakness, with 86.0% accuracy (31st percentile) indicating a tendency to generate information rather than acknowledge uncertainty. Coding (80.0% accuracy) and Mathematics (83.0% accuracy) show moderate performance, while General Knowledge (91.5% accuracy) is also in the lower percentile. The model's hybrid Mamba-Transformer architecture and 1M token context window are designed for complex multi-agent applications, and its open nature allows for customization.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.09
Completion	$0.45

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
Nebius	Nebius \| nvidia/nemotron-3-super-120b-a12b-20230311	262K	$0.09 / 1M tokens	$0.45 / 1M tokens
DeepInfra	DeepInfra \| nvidia/nemotron-3-super-120b-a12b-20230311	262K	$0.1 / 1M tokens	$0.5 / 1M tokens
Nebius	Nebius \| nvidia/nemotron-3-super-120b-a12b-20230311	262K	$0.3 / 1M tokens	$0.9 / 1M tokens
DekaLLM	DekaLLM \| nvidia/nemotron-3-super-120b-a12b-20230311	262K	$0.09 / 1M tokens	$0.45 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by nvidia

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
NVIDIA: Nemotron 3 Nano 30B A3B	Dec 14, 2025	30B	262K	Text input Text output	★★★	★★★★★	$$$
NVIDIA: Nemotron Nano 12B 2 VL	Oct 28, 2025	12B	131K	Image input Video input Text input Text output	★	★★	$$$$
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5	Oct 10, 2025	49B	131K	Text input Text output	★★	★★★★	$$$$
NVIDIA: Nemotron Nano 9B V2	Sep 05, 2025	9B	128K	Text input Text output	★	★★	$
NVIDIA: Llama 3.3 Nemotron Super 49B v1 Unavailable	Apr 08, 2025	49B	131K	Text input Text output	★★★★	★★	$$
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 Unavailable	Apr 08, 2025	253B	131K	Text input Text output	★	★★	$$$$
NVIDIA: Llama 3.1 Nemotron 70B Instruct	Oct 14, 2024	70B	131K	Text input Text output	★★★	★★	$$