Author's Description
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, real-time voice/search, and agent loops. OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).
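Since Mercury 2 is OpenAI API compatible, a request can be sketched with the standard chat-completions payload shape. This is a minimal sketch: the base URL below is an assumption (check Inception's official docs), and only the model ID is taken from the endpoint listing on this page.

```python
import json

# Assumed base URL for Inception's OpenAI-compatible endpoint -- verify
# against official documentation before use.
BASE_URL = "https://api.inceptionlabs.ai/v1"

def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body for Mercury 2.

    The model ID matches the endpoint name listed below; response_format
    requests the schema-aligned JSON output the model card advertises.
    """
    return {
        "model": "inception/mercury-2-20260304",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }

body = build_request("Summarize this ticket as JSON.")
print(json.dumps(body, indent=2))
```

The same body could be POSTed to `{BASE_URL}/chat/completions` with any OpenAI-compatible client; nothing here depends on a Mercury-specific SDK.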
Performance Summary
Inception: Mercury 2 (released March 4, 2026) is positioned as an extremely fast reasoning dLLM, generating and refining multiple tokens in parallel. It consistently ranks among the fastest models, placing in the 95th percentile for speed across 8 benchmarks, with competitive pricing at the 51st percentile and a perfect 100% success rate across all benchmarks.

Its strengths are concentrated in specific areas. It achieved perfect accuracy on Hallucinations (Baseline), demonstrating robust uncertainty acknowledgment, and was the most accurate model at its price point and speed. It also performed exceptionally well on Email Classification (99.0% accuracy) and Reasoning (96.0% accuracy), ranking highly in speed for both; these results underscore its proficiency in classification and complex problem-solving.

However, the model shows notable weaknesses in General Knowledge (7.0% accuracy), Coding (60.0% accuracy), and Mathematics (62.0% accuracy), where it lands in the lower percentiles. Instruction Following (52.5% accuracy) also lags, and Ethics (93.0% accuracy), despite its high absolute score, leaves room for improvement.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.25 |
| Completion | $0.75 |
| Input Cache Read | $0.025 |
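The table above implies a simple per-request cost formula: cached prompt tokens bill at the cache-read rate, the rest at the prompt rate, plus completion tokens at the output rate. A minimal sketch, using only the listed prices:

```python
# Mercury 2 prices from the table above, in USD per 1M tokens.
PROMPT_PRICE = 0.25
COMPLETION_PRICE = 0.75
CACHE_READ_PRICE = 0.025

def request_cost(prompt_tokens: int, completion_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Cached prompt tokens bill at the cache-read rate (10x cheaper than
    fresh prompt tokens); everything else bills at the standard rates.
    """
    uncached = prompt_tokens - cached_tokens
    return (uncached * PROMPT_PRICE
            + cached_tokens * CACHE_READ_PRICE
            + completion_tokens * COMPLETION_PRICE) / 1_000_000

# e.g. 10K prompt tokens (8K of them cache hits) + 2K completion tokens
print(round(request_cost(10_000, 2_000, cached_tokens=8_000), 6))  # 0.0022
```

For latency-sensitive agent loops that resend a long shared prefix, the cache-read discount dominates the prompt-side cost.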
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Inception | inception/mercury-2-20260304 | 128K | $0.25 / 1M tokens | $0.75 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|---|
Other Models by inception
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 (Unavailable) | Mar 04, 2026 | — | 128K | Text input, Text output | ★★★★★ | ★★ | $$$ |
| Inception: Mercury | Jun 26, 2025 | — | 128K | Text input, Text output | ★★★★★ | ★★ | $$$$$ |
| Inception: Mercury Coder | Apr 30, 2025 | — | 128K | Text input, Text output | ★★★★★ | ★★ | $$$ |