NVIDIA: Nemotron 3 Super

Text input Text output
Author's Description

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Key Specifications
Cost
$$$$
Context
262K
Parameters
120B
Released
Mar 11, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Temperature Include Reasoning Reasoning Presence Penalty Max Tokens Response Format Frequency Penalty Top P
Features

This model supports the following features:

Reasoning Response Format
Performance Summary

NVIDIA Nemotron 3 Super, a 120B-parameter open hybrid MoE model, demonstrates competitive response times, ranking in the 53rd percentile for speed across various benchmarks. Its pricing is moderate, placing it in the 37th percentile. Notably, the model exhibits exceptional reliability with a 98% success rate, indicating consistent and usable responses. In terms of performance across categories, Nemotron 3 Super shows a significant strength in Reasoning, achieving 94.0% accuracy (80th percentile), and strong ethical adherence with 99.0% accuracy (53rd percentile). Its multi-environment RL training appears to contribute to these areas. However, the model struggles with Instruction Following (33.0% accuracy, 28th percentile) and Email Classification (89.0% accuracy, 11th percentile), suggesting areas for improvement in precise directive execution and nuanced categorization. Hallucinations are also a notable weakness, with 86.0% accuracy (31st percentile) indicating a tendency to generate information rather than acknowledge uncertainty. Coding (80.0% accuracy) and Mathematics (83.0% accuracy) show moderate performance, while General Knowledge (91.5% accuracy) is also in the lower percentile. The model's hybrid Mamba-Transformer architecture and 1M token context window are designed for complex multi-agent applications, and its open nature allows for customization.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.09
Completion $0.45

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Nebius
Nebius | nvidia/nemotron-3-super-120b-a12b-20230311 262K $0.09 / 1M tokens $0.45 / 1M tokens
DeepInfra
DeepInfra | nvidia/nemotron-3-super-120b-a12b-20230311 262K $0.1 / 1M tokens $0.5 / 1M tokens
Nebius
Nebius | nvidia/nemotron-3-super-120b-a12b-20230311 262K $0.3 / 1M tokens $0.9 / 1M tokens
DekaLLM
DekaLLM | nvidia/nemotron-3-super-120b-a12b-20230311 262K $0.09 / 1M tokens $0.45 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by nvidia