Author's Description
Kimi Linear is a hybrid linear attention architecture that outperforms traditional full attention methods across various contexts, including short, long, and reinforcement learning (RL) scaling regimes. At its core is Kimi Delta Attention (KDA)—a refined version of Gated DeltaNet that introduces a more efficient gating mechanism to optimize the use of finite-state RNN memory. Kimi Linear achieves superior performance and hardware efficiency, especially for long-context tasks. It reduces the need for large KV caches by up to 75% and boosts decoding throughput by up to 6x for contexts as long as 1M tokens.
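As a rough illustration of where a KV cache reduction of up to 75% could come from, the sketch below assumes a hybrid stack in which only one layer in four keeps a full-attention KV cache. That ratio is an assumption for illustration, not something stated in the description above. The layer count, per-token KV width, and the helper name `kv_cache_bytes` are likewise hypothetical. The fixed-size recurrent state of the linear-attention layers is ignored because it does not grow with context length.

```python
def kv_cache_bytes(num_layers, full_attn_every, seq_len, kv_dim_per_token, bytes_per_elem=2):
    """Estimate KV cache size when only every `full_attn_every`-th layer caches keys/values.

    All arguments are illustrative; none are taken from the model's published config.
    """
    full_attn_layers = num_layers // full_attn_every
    return full_attn_layers * seq_len * kv_dim_per_token * bytes_per_elem


# Hypothetical 32-layer model with a 1M-token context.
baseline = kv_cache_bytes(num_layers=32, full_attn_every=1, seq_len=1_000_000, kv_dim_per_token=1024)
hybrid = kv_cache_bytes(num_layers=32, full_attn_every=4, seq_len=1_000_000, kv_dim_per_token=1024)

reduction = 1 - hybrid / baseline
print(f"KV cache reduction: {reduction:.0%}")  # KV cache reduction: 75%
```

Under these assumptions the saving depends only on the share of layers that still cache keys and values, not on the context length or the KV width.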
Performance Summary
MoonshotAI's Kimi Linear 48B A3B Instruct, built on the Kimi Delta Attention architecture, is strongest in hardware efficiency for long-context tasks. It ranks in the 77th percentile for speed and the 64th percentile for pricing, and its 99% success rate indicates consistent operational stability.

Benchmark results are mixed. The model's best result is Email Classification, at 98.0% accuracy (65th percentile). Its architectural benefits show in KV cache needs reduced by up to 75% and decoding throughput up to 6x higher for 1M-token contexts. It struggles, however, on tasks requiring deep understanding and complex reasoning: General Knowledge (28.0% accuracy), Ethics (46.0% accuracy), and Hallucinations (68.0% accuracy, indicating a tendency to hallucinate). Mathematics and Reasoning scores are moderate, and Instruction Following and Coding also leave room for improvement.

Overall, Kimi Linear is a highly reliable and efficient model for specific classification tasks and long-context processing, but its general cognitive abilities require further development.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.3 |
| Completion | $0.6 |
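The per-token prices above translate into a request cost as follows. This is a minimal sketch: the function name `estimate_cost_usd` and the token counts in the example are hypothetical, and only the two prices come from the table.

```python
# Prices from the table above, in USD per 1M tokens.
PROMPT_PRICE_PER_M = 0.30
COMPLETION_PRICE_PER_M = 0.60


def estimate_cost_usd(prompt_tokens, completion_tokens):
    """Return the estimated cost in USD of one request at the listed prices."""
    total = prompt_tokens * PROMPT_PRICE_PER_M + completion_tokens * COMPLETION_PRICE_PER_M
    return total / 1_000_000


# Hypothetical long-context request: 200K prompt tokens, 2K completion tokens.
cost = estimate_cost_usd(200_000, 2_000)
print(f"${cost:.4f}")  # $0.0612
```

At these prices a long prompt dominates the bill: the 200K prompt tokens account for $0.06 of the total, the completion for about a tenth of a cent.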
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Parasail | moonshotai/kimi-linear-48b-a3b-instruct-20251029 | 1M | $0.3 / 1M tokens | $0.6 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|---|
Other Models by moonshotai
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| MoonshotAI: Kimi K2 Thinking | Nov 06, 2025 | ~1T | 262K | Text input, Text output | ★ | ★★★★★ | $$$$$ |
| MoonshotAI: Kimi K2 0905 | Sep 04, 2025 | ~32B | 262K | Text input, Text output | ★★ | ★★★ | $$$$ |
| MoonshotAI: Kimi K2 0905 (exacto) | Sep 04, 2025 | ~1T | 262K | Text input, Text output | — | — | $$$$$ |
| MoonshotAI: Kimi K2 0711 | Jul 11, 2025 | ~1T | 131K | Text input, Text output | ★★★★ | ★★★★★ | $$ |
| MoonshotAI: Kimi Dev 72B | Jun 16, 2025 | 72B | 131K | Text input, Text output | ★ | ★★ | $$$$ |
| MoonshotAI: Kimi VL A3B Thinking (unavailable) | Apr 10, 2025 | 3B | 131K | Image input, Text input, Text output | ★ | ★ | $$$ |