Author's Description
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, real-time voice/search, and agent loops. OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
Inception: Mercury 2, released on March 4, 2026, is positioned as an extremely fast reasoning LLM, leveraging a novel reasoning diffusion LLM (dLLM) architecture. It consistently performs among the fastest models, ranking in the 95th percentile across eight benchmarks, and offers competitive pricing, placing in the 51st percentile. Notably, Mercury 2 demonstrates exceptional reliability with a 100% success rate across all benchmarks, indicating consistent operational stability. The model exhibits significant strengths in speed-sensitive tasks and reasoning. It achieved 98.0% accuracy in Hallucinations (Baseline) and 96.0% in Reasoning (Baseline), ranking among the most accurate models in its speed class for both, and notably in the top 3 for speed in Reasoning. Its Email Classification accuracy was also strong at 97.0%, again being the most accurate among models of comparable speed. However, Mercury 2 shows notable weaknesses in General Knowledge (8.0% accuracy) and performs below average in Coding (67.0% accuracy) and Mathematics (58.0% accuracy). Instruction Following (53.6% accuracy) and Ethics (94.0% accuracy) also present areas for improvement relative to its overall speed prowess.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.25 |
| Completion | $0.75 |
| Input Cache Read | $0.025 |
Price History
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
|
Inception
|
Inception | inception/mercury-2-20260304 | 128K | $0.25 / 1M tokens | $0.75 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|
Other Models by inception
|
|
Released | Params | Context |
|
Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| Inception: Mercury 2 | Mar 04, 2026 | — | 128K |
Text input
Text output
|
★★★★★ | ★★ | $$$ |
| Inception: Mercury | Jun 26, 2025 | — | 128K |
Text input
Text output
|
★★★★★ | ★★ | $$$$$ |
| Inception: Mercury Coder | Apr 30, 2025 | — | 128K |
Text input
Text output
|
★★★★★ | ★★ | $$$ |