Mistral: Voxtral Small 24B 2507

Audio input Text input Text output
Author's Description

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio is priced at $100 per million seconds.

Key Specifications
Cost
$$
Context
32K
Parameters
24B
Released
Oct 30, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Stop Tools Frequency Penalty Presence Penalty Top P Response Format Tool Choice Temperature Seed Structured Outputs Max Tokens
Features

This model supports the following features:

Response Format Structured Outputs Tools
Performance Summary

Mistral: Voxtral Small 24B 2507, created by mistralai, is a robust AI model with a 32000 context length, notable for its integration of state-of-the-art audio input capabilities alongside strong text performance. The model consistently ranks among the fastest, achieving the 89th percentile across seven benchmarks, and offers highly competitive pricing, placing in the 81st percentile. Demonstrating exceptional reliability, it boasts a 100% success rate across all benchmarks, indicating minimal technical failures. In terms of performance across categories, Voxtral Small exhibits strong general knowledge (98.5% accuracy) and ethical reasoning (98.0% accuracy), both with competitive costs. Its email classification (97.0% accuracy) is also solid. However, a significant weakness is its performance in hallucination detection, with only 72.0% accuracy, placing it in the 18th percentile. Instruction following (51.0% accuracy) and complex reasoning (64.0% accuracy) are moderate. While its coding accuracy is 80.0%, its duration for several benchmarks, including Hallucinations, Reasoning, and Coding, is notably high, ranking in the 98th percentile for duration. Its key strengths lie in its speed, cost-effectiveness, and reliability, coupled with its unique audio processing capabilities.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.3

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Mistral
Mistral | mistralai/voxtral-small-24b-2507 32K $0.1 / 1M tokens $0.3 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by mistralai