NVIDIA: Nemotron Nano 9B V2

Text input Text output
Author's Description

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.

Key Specifications
Cost
$
Context
128K
Parameters
9B
Released
Sep 05, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Include Reasoning Reasoning Response Format Structured Outputs Tool Choice
Features

This model supports the following features:

Structured Outputs Response Format Reasoning Tools
Performance Summary

NVIDIA's Nemotron Nano 9B V2, created on September 5, 2025, is a 128,000 context length LLM designed for both reasoning and non-reasoning tasks, featuring a controllable reasoning trace. This model consistently ranks among the fastest available, demonstrating exceptional speed across all evaluated benchmarks. While specific pricing data is unavailable, suggesting potential free-tier usage, its reliability is outstanding, boasting a 99% success rate with minimal technical failures. In terms of performance, Nemotron Nano 9B V2 exhibits a notable strength in Ethics, achieving perfect 100% accuracy, making it the most accurate model at its speed. It also performs well in Coding and Reasoning, with 83.0% and 73.1% accuracy respectively, indicating solid capabilities in these areas. However, a significant weakness is apparent in Instruction Following, where it scored 0.0% accuracy, suggesting a critical area for improvement. Email Classification shows moderate performance at 94.1% accuracy. Overall, the model excels in speed and reliability, with strong ethical and reasoning foundations, but requires substantial development in instruction adherence.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0
Completion $0

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Nvidia
Nvidia | nvidia/nemotron-nano-9b-v2 128K $0 / 1M tokens $0 / 1M tokens
Nvidia
Nvidia | nvidia/nemotron-nano-9b-v2 128K $0 / 1M tokens $0 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by nvidia