Author's Description
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited for complex real-world tasks that span modalities. 256K context window.
Performance Summary
The Xiaomi MiMo-V2-Omni model shows moderate speed, ranking in the 35th percentile across the evaluated benchmarks. Its pricing is at the premium end, in the 14th percentile. A standout feature is its reliability: it achieved a 100% success rate across all evaluated benchmarks, indicating consistent and stable operation.

By category, MiMo-V2-Omni is strong in several areas. It achieves perfect accuracy in both General Knowledge and Ethics, and in General Knowledge it is also the most accurate model at its price point and speed. Its Reasoning and Mathematics scores rank in the 90th and 97th percentiles respectively, reflecting advanced problem-solving and quantitative skills. Coding is also strong at 92% accuracy, and Email Classification is solid at 98% accuracy. The Hallucinations benchmark is respectable at 94% accuracy, but that places the model only in the 47th percentile, suggesting some room for improvement in acknowledging uncertainty. Instruction Following is the lowest score at 69% accuracy; although that still places the model in the 73rd percentile, it points to difficulty with complex multi-step directives.

Overall, the model's key strengths are high accuracy on knowledge-based tasks, complex reasoning, and mathematical problem-solving, coupled with robust reliability. Its primary area for development is handling highly complex, multi-layered instructions.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.4 |
| Completion | $2 |
| Input Cache Read | $0.08 |
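To make the per-token rates above concrete, here is a minimal sketch of how a single request's cost might be estimated. The function name `estimate_cost` and the billing rule (cached prompt tokens charged at the cache-read rate instead of the prompt rate) are assumptions for illustration, not something the pricing table confirms; actual billing may differ.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    "prompt": Decimal("0.40"),
    "completion": Decimal("2.00"),
    "cache_read": Decimal("0.08"),
}


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  cached_prompt_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached prompt tokens are billed at the cache-read rate
    and the remaining prompt tokens at the regular prompt rate.
    """
    if min(prompt_tokens, completion_tokens, cached_prompt_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_prompt_tokens > prompt_tokens:
        raise ValueError("cached tokens cannot exceed prompt tokens")

    uncached = prompt_tokens - cached_prompt_tokens
    total_per_million = (
        uncached * PRICE_PER_MILLION["prompt"]
        + cached_prompt_tokens * PRICE_PER_MILLION["cache_read"]
        + completion_tokens * PRICE_PER_MILLION["completion"]
    )
    # Rates are quoted per 1M tokens, so scale down at the end.
    return total_per_million / 1_000_000


if __name__ == "__main__":
    # 100,000 prompt tokens (20,000 of them cached) and 5,000 completion tokens.
    print(estimate_cost(100_000, 5_000, cached_prompt_tokens=20_000))
```

Under these assumptions, the example request costs $0.032 for uncached prompt tokens, $0.0016 for cached reads, and $0.01 for the completion, about $0.0436 in total. `Decimal` is used rather than floats so that small per-request amounts add up without rounding drift.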
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Xiaomi | xiaomi/mimo-v2-omni-20260318 | 262K | $0.4 / 1M tokens | $2 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|---|
Other Models by Xiaomi
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| Xiaomi: MiMo-V2-Pro | Mar 18, 2026 | ~1T | 1M | Text input, text output | ★ | ★★★★★ | $$$$$ |
| Xiaomi: MiMo-V2-Flash | Dec 14, 2025 | ~309B | 262K | Text input, text output | ★★★★ | ★★★ | $$ |