NVIDIA: Nemotron Nano 9B V2

Name: NVIDIA: Nemotron Nano 9B V2
Brand: nvidia
Availability: InStock
Rating: 2.8 (6 reviews)

Back

Text input Text output

Author's Description

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.

Key Specifications

Cost

Context

128K

Parameters

Released

Sep 05, 2025

Speed

★

Ability

★★★

Reliability

★★★

Hugging Face

Supported Parameters

This model supports the following parameters:

Tools Include Reasoning Reasoning Response Format Structured Outputs Tool Choice

Features

This model supports the following features:

Structured Outputs Response Format Reasoning Tools

Performance Summary

NVIDIA's Nemotron Nano 9B V2, created on September 5, 2025, is a 128,000 context length LLM designed for both reasoning and non-reasoning tasks, featuring a controllable reasoning trace. This model consistently ranks among the fastest available, demonstrating exceptional speed across all evaluated benchmarks. While specific pricing data is unavailable, suggesting potential free-tier usage, its reliability is outstanding, boasting a 99% success rate with minimal technical failures. In terms of performance, Nemotron Nano 9B V2 exhibits a notable strength in Ethics, achieving perfect 100% accuracy, making it the most accurate model at its speed. It also performs well in Coding and Reasoning, with 83.0% and 73.1% accuracy respectively, indicating solid capabilities in these areas. However, a significant weakness is apparent in Instruction Following, where it scored 0.0% accuracy, suggesting a critical area for improvement. Email Classification shows moderate performance at 94.1% accuracy. Overall, the model excels in speed and reliability, with strong ethical and reasoning foundations, but requires substantial development in instruction adherence.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0
Completion	$0

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
Nvidia	Nvidia \| nvidia/nemotron-nano-9b-v2	128K	$0 / 1M tokens	$0 / 1M tokens
Nvidia	Nvidia \| nvidia/nemotron-nano-9b-v2	128K	$0 / 1M tokens	$0 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by nvidia

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
NVIDIA: Llama 3.3 Nemotron Super 49B v1	Apr 08, 2025	49B	131K	Text input Text output	★★★	★★	$$
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1	Apr 08, 2025	253B	131K	Text input Text output	★	★★	$$$$$
NVIDIA: Llama 3.1 Nemotron 70B Instruct	Oct 14, 2024	70B	131K	Text input Text output	★★★	★★	$$