NVIDIA: Nemotron Nano 12B 2 VL

Text input Image input Video input Text output Free Option
Author's Description

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...

Key Specifications
Cost
$$$$
Context
131K
Parameters
12B
Released
Oct 28, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Frequency Penalty Top P Min P Response Format Reasoning Temperature Stop Presence Penalty Include Reasoning Max Tokens
Features

This model supports the following features:

Response Format Reasoning
Performance Summary

NVIDIA Nemotron Nano 2 VL, a 12-billion-parameter open multimodal reasoning model, demonstrates exceptional speed, consistently ranking among the fastest models across all evaluated benchmarks. It also offers highly competitive pricing, placing it among the most cost-effective options available. The model exhibits strong reliability with a 91% success rate, indicating consistent and usable responses. In terms of performance across categories, Nemotron Nano 2 VL excels in General Knowledge and Ethics, achieving near-perfect scores of 99.5% and 100% respectively. Its 100% accuracy in Ethics is particularly noteworthy, making it the most accurate model at its price point and among models of similar speed. The model also shows strong capabilities in Reasoning (94.0% accuracy) and Email Classification (98.0% accuracy). However, it exhibits significant weaknesses in Instruction Following, scoring 0.0% accuracy, and performs below average in Mathematics (52.6% accuracy) and Coding (65.6% accuracy). Its ability to acknowledge uncertainty (Hallucinations baseline) is moderate at 80.0%. Designed for video understanding and document intelligence, its hybrid Transformer-Mamba architecture contributes to its high throughput and low latency, supporting its strong speed performance.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.2
Completion $0.6

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
DeepInfra
DeepInfra | nvidia/nemotron-nano-12b-v2-vl 131K $0.2 / 1M tokens $0.6 / 1M tokens
Nebius
Nebius | nvidia/nemotron-nano-12b-v2-vl 131K $0.2 / 1M tokens $0.6 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by nvidia