Inception: Mercury

Text input Text output
Author's Description

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed optimized models like GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed enables developers to provide responsive user experiences, including with voice agents, search interfaces, and chatbots. Read more in the [blog post] (https://www.inceptionlabs.ai/blog/introducing-mercury) here.

Key Specifications
Cost
$$$$$
Context
128K
Released
Jun 26, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Presence Penalty Stop Max Tokens Frequency Penalty Tool Choice Top P Tools Structured Outputs Temperature Response Format
Features

This model supports the following features:

Response Format Tools Structured Outputs
Performance Summary

Inception: Mercury, the first diffusion large language model (dLLM), demonstrates exceptional speed and competitive pricing. It consistently ranks among the fastest models, achieving Infinityth percentile across 8 benchmarks, and offers among the most competitive pricing, also at the Infinityth percentile across 8 benchmarks. Mercury exhibits strong performance in several key areas. It achieves near-perfect accuracy in Ethics (100%), ranking in the top 3 for speed and being the most accurate model at its price point. In Hallucinations (Baseline), it shows high accuracy (98.0%), placing in the top 3 for speed and being the most accurate among models of comparable speed. General Knowledge (Baseline) also highlights its speed, achieving 97.5% accuracy and ranking #1 in speed. Coding (Baseline) is another strength, with 82.0% accuracy and a #1 speed ranking. However, the model shows significant weaknesses in Mathematics (Baseline) and Reasoning (Baseline), where it scored 0.0% accuracy in both. Email Classification (Baseline) and Instruction Following (Baseline) show moderate performance at 96.0% and 47.0% accuracy respectively, with the former being noted for its speed. Overall, Mercury excels in speed-sensitive applications and ethical considerations, but its current capabilities in complex mathematical and reasoning tasks are limited.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.25
Completion $1

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Inception
Inception | inception/mercury 128K $0.25 / 1M tokens $1 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by inception