Mistral: Devstral Medium

Text input · Text output
Author's Description

Devstral Medium is a high-performance code generation and agentic reasoning model developed jointly by Mistral AI and All Hands AI. Positioned as a step up from Devstral Small, it achieves 61.6% on SWE-Bench Verified, placing it ahead of Gemini 2.5 Pro and GPT-4.1 in code-related tasks, at a fraction of the cost. It is designed for generalization across prompt styles and tool use in code agents and frameworks. Devstral Medium is available via API only (not open-weight), and supports enterprise deployment on private infrastructure, with optional fine-tuning capabilities.

Key Specifications
Cost: $$$
Context: 131K
Released: Jul 10, 2025
Speed, Ability, Reliability: see the Performance Summary below
Supported Parameters

This model supports the following parameters:

Tools, Structured Outputs, Tool Choice, Response Format, Stop, Seed, Top P, Max Tokens, Frequency Penalty, Temperature, Presence Penalty
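A minimal request sketch follows, exercising the sampling and length parameters above. It assumes an OpenAI-compatible chat completions endpoint; the base URL, environment variable name, and exact model identifier are assumptions to check against the provider's documentation.

import os
import requests

BASE_URL = "https://api.mistral.ai/v1"            # assumption: provider base URL
MODEL = "mistralai/devstral-medium-2507"           # endpoint name from the table below

payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a linked list."}
    ],
    # Sampling and length controls from the supported-parameter list:
    "temperature": 0.2,
    "top_p": 0.95,
    "max_tokens": 512,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "seed": 42,
    "stop": ["</answer>"],
}

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])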
Features

This model supports the following features:

Tools, Response Format, Structured Outputs
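These features map onto request fields for tool calling and structured outputs. A hedged sketch of both, building on the request shape above; the field names follow the common OpenAI-compatible convention and are assumptions to verify against the provider's docs, and the tool itself is hypothetical.

MODEL = "mistralai/devstral-medium-2507"           # endpoint name from the table below

# Tools: declare a function schema; Tool Choice lets the model decide when to call it.
tool_request = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Fix the failing tests under ./tests."}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "run_tests",                   # hypothetical agent tool, for illustration
            "description": "Run the project's test suite and return failing cases.",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    }],
    "tool_choice": "auto",
}

# Response Format / Structured Outputs: constrain the reply to a JSON object.
json_request = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "List three refactoring suggestions as a JSON array."}],
    "response_format": {"type": "json_object"},
}

Either payload can be POSTed to the same chat completions endpoint shown under Supported Parameters.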
Performance Summary

Mistral's Devstral Medium is a high-performance code generation and agentic reasoning model with strong results across the benchmark suite. It is faster than most models (69th percentile for speed), competitively priced (51st percentile), and exceptionally reliable, completing every benchmark run with a 100% success rate and no technical failures.

The model achieves perfect accuracy on the Hallucinations, General Knowledge, Ethics, and Email Classification benchmarks, and is often the most accurate model at its price point and speed. It scores 88.0% on Coding and 90.0% on Mathematics, where it is also the most accurate among models of similar speed. Instruction Following (65.0%) and Reasoning (64.0%) are its relative weaknesses when set against those perfect scores. Its standout strength remains code-related work: 61.6% on SWE-Bench Verified places it ahead of competitors such as Gemini 2.5 Pro and GPT-4.1.

Model Pricing

Current Pricing

Feature        Price (per 1M tokens)
Prompt         $0.40
Completion     $2.00
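At these rates, per-request cost is simple arithmetic; a quick sketch (the token counts are illustrative only):

PROMPT_RATE = 0.40 / 1_000_000      # USD per prompt token
COMPLETION_RATE = 2.00 / 1_000_000  # USD per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    # USD cost of a single request at the listed per-token rates.
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# Example: a 20,000-token prompt (large code context) with a 1,000-token completion.
print(f"${request_cost(20_000, 1_000):.4f}")  # $0.0100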


Available Endpoints
Provider   Endpoint Name                     Context Length   Pricing (Input)      Pricing (Output)
Mistral    mistralai/devstral-medium-2507    131K             $0.40 / 1M tokens    $2.00 / 1M tokens