Mistral: Voxtral Small 24B 2507

Audio input Text input Text output
Author's Description

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio is priced at $100 per million seconds.

Key Specifications
Cost
$$
Context
32K
Parameters
24B
Released
Oct 30, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Stop Structured Outputs Response Format Presence Penalty Seed Frequency Penalty Temperature Top P Max Tokens Tool Choice Tools
Features

This model supports the following features:

Response Format Tools Structured Outputs
Performance Summary

Mistral: Voxtral Small 24B 2507, created by mistralai, is a notable AI model with a 32000 context length, distinguished by its integration of state-of-the-art audio input capabilities alongside strong text performance. This model consistently performs among the fastest, ranking in the 89th percentile across seven benchmarks, and offers highly competitive pricing, placing in the 81st percentile. Its reliability is exceptional, demonstrating a 100% success rate across all benchmarks, indicating minimal technical failures. In terms of benchmark performance, Voxtral Small 24B 2507 exhibits a strong grasp of General Knowledge (98.5% accuracy) and Ethics (98.0% accuracy), both with favorable cost and duration metrics. It also performs well in Email Classification (97.0% accuracy). A key strength lies in its audio capabilities, excelling at speech transcription, translation, and audio understanding. However, the model shows a notable weakness in Hallucinations, with only 72.0% accuracy, suggesting it may not always appropriately acknowledge uncertainty. Instruction Following (51.0% accuracy) and Reasoning (64.0% accuracy) also present areas for potential improvement, despite competitive costs. Coding performance is moderate at 80.0% accuracy. Overall, the model is a robust offering, particularly for applications requiring audio processing and general knowledge, while its reliability and speed are significant advantages.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.3

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Mistral
Mistral | mistralai/voxtral-small-24b-2507 32K $0.1 / 1M tokens $0.3 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by mistralai