Xiaomi: MiMo-V2-Flash

Text input → Text output
Author's Description

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting a hybrid attention architecture. MiMo-V2-Flash supports a hybrid-thinking toggle and a 256K context window, and excels at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the #1 open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. Note: when integrating with agentic tools such as Claude Code, Cline, or Roo Code, **turn off reasoning mode** for the best and fastest performance; the model is deeply optimized for that scenario. Users can control reasoning behaviour with the `enabled` boolean of the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config).
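
For reference, here is a minimal sketch of toggling reasoning per request through OpenRouter's OpenAI-compatible chat completions endpoint, using the `reasoning.enabled` switch described in the linked docs. The model slug is taken from the endpoints table below and the prompt is a placeholder; adjust both for your setup.

```python
# Minimal sketch: call MiMo-V2-Flash via OpenRouter with reasoning disabled,
# as recommended for agentic-tool integrations. Slug and prompt are placeholders.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "xiaomi/mimo-v2-flash-20251210",
        "messages": [{"role": "user", "content": "Summarize this stack trace ..."}],
        # Hybrid-thinking toggle: disable reasoning for faster agentic use,
        # or set enabled=True when deeper deliberation is worth the latency.
        "reasoning": {"enabled": False},
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```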

Key Specifications
Cost: $$
Context: 262K
Parameters: 309B (Rumoured)
Released: Dec 14, 2025
Speed: –
Ability: –
Reliability: –
Supported Parameters

This model supports the following parameters:

Include Reasoning, Temperature, Seed, Tools, Stop, Reasoning, Max Tokens, Presence Penalty, Frequency Penalty, Response Format, Tool Choice, Top P
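
As an illustration of several of these parameters used together (Tools, Tool Choice, Temperature, Max Tokens), here is a hedged tool-calling sketch against the same OpenRouter endpoint. The tool definition, prompt, and values are invented for the example.

```python
# Illustrative sketch: tool calling through OpenRouter's OpenAI-compatible schema.
# The get_file tool is a made-up example, not part of the model or API.
import json
import os
import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_file",
        "description": "Read a file from the workspace",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "xiaomi/mimo-v2-flash-20251210",
        "messages": [{"role": "user", "content": "Open README.md and summarize it."}],
        "tools": tools,
        "tool_choice": "auto",   # let the model decide whether to call the tool
        "temperature": 0.2,      # low randomness for agentic workflows
        "max_tokens": 512,       # cap completion length
    },
    timeout=60,
)
message = response.json()["choices"][0]["message"]
print(json.dumps(message.get("tool_calls", []), indent=2))
```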
Features

This model supports the following features:

Response Format, Tools, Reasoning
Performance Summary

Xiaomi's MiMo-V2-Flash ranks in the 53rd percentile for response speed and in the 78th percentile for price, making it a cost-effective option. Notably, the model exhibits exceptional reliability, with a 100% success rate across all benchmarks and minimal technical failures. Across categories, MiMo-V2-Flash excels in Ethics with a perfect 100% accuracy and shows strong results in General Knowledge (98.5%) and Email Classification (98.0%). Its coding performance is solid at 87.0% accuracy. A notable weakness is the Hallucinations benchmark, where it scores 66.0%, suggesting a tendency to provide an answer rather than acknowledge uncertainty. Instruction Following and Reasoning are moderate at 56.0% and 64.0% accuracy, respectively. The model's hybrid-thinking toggle and its optimization for agentic tools (with reasoning mode off) are key features, offering performance comparable to Claude Sonnet 4.5 at a significantly lower cost.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.10 |
| Completion | $0.30 |
| Input Cache Read | $0.02 |
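
To make the rates concrete, here is a small back-of-the-envelope estimator using the listed prices. The helper function and the token counts in the example call are invented for illustration.

```python
# Cost estimate at the listed rates: $0.10 / 1M prompt tokens,
# $0.30 / 1M completion tokens, $0.02 / 1M cached prompt tokens read.
PROMPT_PRICE = 0.10 / 1_000_000      # USD per prompt token
COMPLETION_PRICE = 0.30 / 1_000_000  # USD per completion token
CACHE_READ_PRICE = 0.02 / 1_000_000  # USD per cached prompt token

def request_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of a single request at the listed rates."""
    uncached = prompt_tokens - cached_tokens
    return (
        uncached * PROMPT_PRICE
        + cached_tokens * CACHE_READ_PRICE
        + completion_tokens * COMPLETION_PRICE
    )

# Example: 40K-token prompt with 30K tokens served from cache, 2K-token completion.
print(f"${request_cost(40_000, 2_000, cached_tokens=30_000):.6f}")  # -> $0.002200
```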

Price History

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Novita | xiaomi/mimo-v2-flash-20251210 | 262K | $0.10 / 1M tokens | $0.30 / 1M tokens |
| Chutes | xiaomi/mimo-v2-flash-20251210 | 262K | $0.17 / 1M tokens | $0.65 / 1M tokens |
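
If you want requests pinned to one of these endpoints, OpenRouter supports provider routing preferences. The sketch below assumes a `provider.order` field and lowercase provider slugs matching the names above; check the provider-routing docs for the exact identifiers.

```python
# Sketch: prefer the Novita endpoint and fall back to Chutes.
# The "provider" routing object and slug spellings are assumptions to verify
# against OpenRouter's provider-routing documentation.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "xiaomi/mimo-v2-flash-20251210",
        "messages": [{"role": "user", "content": "ping"}],
        "provider": {"order": ["novita", "chutes"]},
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```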
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration