Inception: Mercury

Text input Text output
Author's Description

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed optimized models like GPT-4.1 Nano and Claude...

Key Specifications
Cost
$$$$$
Context
128K
Released
Jun 26, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Structured Outputs Response Format Stop Temperature Tool Choice Max Tokens
Features

This model supports the following features:

Structured Outputs Response Format Tools
Performance Summary

Inception: Mercury, the first diffusion large language model (dLLM), demonstrates exceptional speed and competitive pricing, consistently ranking among the fastest and most cost-effective models available. Created on June 26, 2025, with a context length of 128,000, Mercury leverages a breakthrough discrete diffusion approach to achieve 5-10x faster performance than leading speed-optimized models while matching their output quality. The model exhibits strong performance in several key areas. It achieves 98.0% accuracy in Hallucinations (70th percentile), showcasing its ability to appropriately acknowledge uncertainty, and is noted as "Most accurate among models this fast." In General Knowledge, Mercury achieves 97.5% accuracy (46th percentile) and is recognized as the "#1 in speed" and "Speed champion." Its Ethics performance is particularly impressive, reaching 100.0% accuracy and earning accolades for being "Most accurate model at this price point" and "Most accurate among models this fast." Additionally, it performs well in Coding with 82.0% accuracy (44th percentile), again ranking "#1 in speed." However, Mercury shows significant weaknesses in complex cognitive tasks. It scores 0.0% accuracy in both Mathematics and Reasoning, indicating a current limitation in these domains. While its Email Classification (96.0% accuracy, 31st percentile) and Instruction Following (47.0% accuracy, 38th percentile) are moderate, the model's primary strength lies in its speed and efficiency across tasks where it performs adequately, making it ideal for responsive user experiences.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.25
Completion $0.75
Input Cache Read $0.025

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Inception
Inception | inception/mercury 128K $0.25 / 1M tokens $0.75 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by inception