Author's Description
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
NVIDIA: Nemotron Nano 9B V2 demonstrates exceptional speed, consistently ranking among the fastest models. It also offers competitive pricing, typically providing cost-effective solutions. The model exhibits outstanding reliability with a 99% success rate, indicating minimal technical failures. In terms of benchmark performance, Nemotron Nano 9B V2 shows strong capabilities in several areas. It achieves high accuracy in Hallucinations (98.0%), General Knowledge (98.6%), and particularly excels in Ethics with a perfect 100% accuracy, making it the most accurate among models of comparable speed. Its Reasoning capabilities are also a significant strength, scoring 89.8% accuracy and ranking in the top 3 for cost efficiency in this category. However, the model exhibits notable weaknesses in Instruction Following and Mathematics, where it scored 0.0% accuracy in both benchmarks. Coding performance is moderate at 83.0% accuracy, while Email Classification is 94.1%, placing it in the lower percentile for this task. The model's design, which involves generating a reasoning trace before a final response, is a unique feature that can be controlled via system prompts.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.04 |
| Completion | $0.16 |
Price History
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
|
Nvidia
|
Nvidia | nvidia/nemotron-nano-9b-v2 | 128K | $0.04 / 1M tokens | $0.16 / 1M tokens |
|
Nvidia
|
Nvidia | nvidia/nemotron-nano-9b-v2 | 128K | $0.04 / 1M tokens | $0.16 / 1M tokens |
|
DeepInfra
|
DeepInfra | nvidia/nemotron-nano-9b-v2 | 131K | $0.04 / 1M tokens | $0.16 / 1M tokens |
|
Together
|
Together | nvidia/nemotron-nano-9b-v2 | 131K | $0.06 / 1M tokens | $0.25 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|
Other Models by nvidia
|
|
Released | Params | Context |
|
Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| NVIDIA: Nemotron Nano 12B 2 VL | Oct 28, 2025 | 12B | 131K |
Image input
Text input
Text output
|
★ | ★★ | $$$$ |
| NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | Oct 10, 2025 | 49B | 131K |
Text input
Text output
|
★ | ★★★★ | $$$$ |
| NVIDIA: Llama 3.3 Nemotron Super 49B v1 Unavailable | Apr 08, 2025 | 49B | 131K |
Text input
Text output
|
★★★ | ★★ | $$ |
| NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 | Apr 08, 2025 | 253B | 131K |
Text input
Text output
|
★ | ★★ | $$$$$ |
| NVIDIA: Llama 3.1 Nemotron 70B Instruct | Oct 14, 2024 | 70B | 131K |
Text input
Text output
|
★★★ | ★★ | $$$ |