Inception: Mercury 2

Input: Text | Output: Text
Author's Description

Mercury 2 is an extremely fast reasoning LLM and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving over 1,000 tokens/sec on standard GPUs. It is more than 5x faster than leading speed-optimized LLMs such as Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. It is built for coding workflows where latency compounds, for real-time voice and search, and for agent loops, and it is OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).
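Because the model is OpenAI API compatible, a request can be expressed as a standard chat-completions payload. The sketch below is a minimal illustration, not official usage: the model id is taken from the endpoints table on this page, while the `reasoning` field and its `effort` values are assumptions standing in for the tunable reasoning levels mentioned above — check the provider's documentation for the exact parameter names.

```python
import json

def build_request(prompt: str, effort: str = "medium", max_tokens: int = 1024) -> str:
    """Build an OpenAI-compatible chat-completions request body as JSON."""
    payload = {
        "model": "inception/mercury-2-20260304",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Assumed knob for the tunable reasoning levels described above.
        "reasoning": {"effort": effort},
        # Plain JSON-object mode; schema-aligned output may use a richer form.
        "response_format": {"type": "json_object"},
    }
    return json.dumps(payload)

request_body = build_request("Summarize this diff in one sentence.")
```

The same body would be POSTed to the provider's chat-completions endpoint with an API key, exactly as with any other OpenAI-compatible service.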

Key Specifications
Cost
$$$
Context
128K
Released
Mar 04, 2026
Supported Parameters

This model supports the following parameters:

Tools, Temperature, Max Tokens, Structured Outputs, Tool Choice, Response Format, Reasoning, Include Reasoning, Stop
Features

This model supports the following features:

Tools, Structured Outputs, Response Format, Reasoning
Performance Summary

Inception: Mercury 2, released on March 4, 2026, is positioned as an extremely fast reasoning dLLM that generates and refines multiple tokens in parallel. It consistently ranks among the fastest models, placing in the 95th percentile for speed across 8 benchmarks, and offers competitive pricing at the 51st percentile. Reliability is excellent, with a 100% success rate across all benchmarks.

Its strengths are concentrated in classification and reasoning. The model achieved perfect accuracy on Hallucinations (Baseline), demonstrating robust uncertainty acknowledgment, and was the most accurate model at its price point and speed. It also performed well on Email Classification (99.0% accuracy) and Reasoning (96.0% accuracy), ranking highly in speed on both.

The model's notable weaknesses are General Knowledge (7.0% accuracy), Coding (60.0%), and Mathematics (62.0%), where it falls into the lower percentiles. Instruction Following (52.5%) also leaves room for improvement, as does Ethics (93.0%) despite its high absolute score.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.25 |
| Completion | $0.75 |
| Input Cache Read | $0.025 |
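The rates above combine straightforwardly into a per-request cost estimate. The helper below is a minimal sketch, assuming cached input tokens are billed at the cache-read rate in place of the normal prompt rate; the function and its token counts are illustrative, not part of any official SDK.

```python
# Published per-1M-token rates, converted to dollars per token.
PROMPT_RATE = 0.25 / 1_000_000      # prompt (input) tokens
COMPLETION_RATE = 0.75 / 1_000_000  # completion (output) tokens
CACHE_READ_RATE = 0.025 / 1_000_000 # cached input tokens read

def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimated request cost in dollars.

    Cached tokens are assumed to replace an equal number of prompt
    tokens at the cheaper cache-read rate.
    """
    uncached = prompt_tokens - cached_tokens
    return (uncached * PROMPT_RATE
            + cached_tokens * CACHE_READ_RATE
            + completion_tokens * COMPLETION_RATE)

# Example: a 100K-token prompt, half of it cached, plus a 2K-token reply.
cost = estimate_cost(100_000, 2_000, cached_tokens=50_000)  # ≈ $0.01525
```

At these rates, cache reads cost a tenth of fresh prompt tokens, so long agent loops that replay a shared prefix benefit substantially.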

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Inception | inception/mercury-2-20260304 | 128K | $0.25 / 1M tokens | $0.75 / 1M tokens |
Benchmark Results