Inception: Mercury 2

Modalities: text input · text output
Author's Description

Mercury 2 is an extremely fast reasoning LLM and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving more than 1,000 tokens/sec on standard GPUs. Mercury 2 is over 5x faster than leading speed-optimized LLMs such as Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, for real-time voice and search, and for agent loops. OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).
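Because the model is OpenAI API compatible, a request can be sketched as an ordinary chat-completions payload. This is a minimal sketch under assumptions: the model slug comes from this page's endpoint listing, and the `reasoning` field is a hypothetical mapping of Mercury 2's "tunable reasoning levels" onto an OpenAI-style request, not a verified API detail.

```python
import json

# Slug from this page's endpoint listing.
MODEL = "inception/mercury-2-20260304"

def build_chat_request(prompt: str, reasoning_effort: str = "low") -> dict:
    """Build an OpenAI-compatible chat.completions payload for Mercury 2."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.2,
        # Hypothetical knob for the tunable reasoning levels; the real
        # parameter name may differ.
        "reasoning": {"effort": reasoning_effort},
    }

print(json.dumps(build_chat_request("Explain diffusion decoding in one line."), indent=2))
```

The payload can then be POSTed to any OpenAI-compatible endpoint with the usual client or `requests` call.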

Key Specifications
Cost: $$$
Context: 128K
Released: Mar 04, 2026
Supported Parameters

This model supports the following parameters:

Tools, Temperature, Max Tokens, Structured Outputs, Tool Choice, Response Format, Reasoning, Include Reasoning, Stop
Features

This model supports the following features:

Tools, Structured Outputs, Response Format, Reasoning
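Structured Outputs and Response Format are usually exercised through an OpenAI-style `response_format` carrying a JSON schema. A minimal sketch follows; the schema, its field names, and the `ticket` label are illustrative assumptions, not part of any documented Mercury 2 surface.

```python
import json

# Illustrative schema for a classification task; the fields are assumptions.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string"},
        "urgent": {"type": "boolean"},
    },
    "required": ["category", "urgent"],
    "additionalProperties": False,
}

def with_structured_output(payload: dict) -> dict:
    """Attach an OpenAI-style json_schema response_format to a request."""
    payload = dict(payload)  # copy so the caller's dict is untouched
    payload["response_format"] = {
        "type": "json_schema",
        "json_schema": {"name": "ticket", "strict": True, "schema": TICKET_SCHEMA},
    }
    return payload

request = with_structured_output({
    "model": "inception/mercury-2-20260304",
    "messages": [{"role": "user", "content": "Classify: 'server is down!'"}],
})
print(json.dumps(request["response_format"], indent=2))
```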
Performance Summary

Inception: Mercury 2, released on March 4, 2026, is positioned as an extremely fast reasoning LLM built on a novel reasoning diffusion LLM (dLLM) architecture. It ranks among the fastest models tested, placing in the 95th percentile for speed across eight benchmarks, with competitive pricing in the 51st percentile. Notably, it demonstrated exceptional reliability, with a 100% success rate across all benchmarks.

The model's strengths lie in speed-sensitive tasks and reasoning. It scored 98.0% on Hallucinations (Baseline) and 96.0% on Reasoning (Baseline), ranking among the most accurate models in its speed class on both, and in the top 3 for speed on Reasoning. Its Email Classification accuracy was also strong at 97.0%, the best among models of comparable speed.

Its clearest weakness is General Knowledge (8.0% accuracy), and it performs below average in Coding (67.0%) and Mathematics (58.0%). Instruction Following (53.6%) and Ethics (94.0%) also leave room for improvement relative to its speed.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.25 |
| Completion | $0.75 |
| Input Cache Read | $0.025 |

Available Endpoints

| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Inception | inception/mercury-2-20260304 | 128K | $0.25 / 1M tokens | $0.75 / 1M tokens |