Author's Description
DeepSeek R1 Distill Qwen 14B is a distilled large language model based on [Qwen 2.5 14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. Benchmark results include:

- AIME 2024 pass@1: 69.7
- MATH-500 pass@1: 93.9
- CodeForces rating: 1481

Fine-tuning on DeepSeek R1's outputs enables performance competitive with larger frontier models.
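The pass@1 figures above report the fraction of problems solved on the first sampled attempt. As a minimal sketch of how such scores are typically computed, here is the standard unbiased pass@k estimator (of which pass@1 is the k = 1 special case); the sample counts are illustrative, not from this model's evaluation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    (drawn without replacement from n total, of which c are correct)
    solves the task."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws: success guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k = 1 this reduces to the plain success fraction c / n.
print(pass_at_k(10, 7, 1))  # 0.7
```

A benchmark score like "AIME 2024 pass@1: 69.7" is this quantity, averaged over all problems and expressed as a percentage.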
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
DeepSeek R1 Distill Qwen 14B, created on January 29, 2025, demonstrates strong overall performance for a distilled model. It consistently ranks among the fastest models across 7 speed benchmarks, and offers competitive pricing, ranking in the 59th percentile across 6 benchmarks. The model exhibits high reliability with a 91% success rate, indicating consistent and usable responses.

On specific benchmarks, the model is strongest in Coding, with 93.0% accuracy (91st percentile), and shows strong Reasoning, with 86.0% accuracy (85th percentile). Email Classification accuracy is also high at 93.0%, though this places it only in the 26th percentile. Ethics performance is respectable at 87.5% accuracy but falls in the 23rd percentile. A notable weakness appears in one Instruction Following run, which scored 0.0% accuracy, though another run of the same benchmark reached 44.0%. General Knowledge is also a relative weakness, with 77.5% accuracy placing it in the 24th percentile.

Despite some lower percentile rankings, the raw accuracy scores for Ethics, Email Classification, and General Knowledge remain solid. Fine-tuning on DeepSeek R1's outputs lets the model approach larger frontier models, particularly on AIME and MATH and in CodeForces rating.
Model Pricing
Current Pricing
| Feature    | Price (per 1M tokens) |
|------------|-----------------------|
| Prompt     | $0.15                 |
| Completion | $0.15                 |
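At $0.15 per 1M tokens for both prompt and completion, the cost of a request is a simple linear function of its token counts. A minimal sketch (rates hardcoded from the pricing table; actual billing may vary by provider):

```python
PROMPT_PRICE = 0.15 / 1_000_000      # USD per prompt token
COMPLETION_PRICE = 0.15 / 1_000_000  # USD per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single request at the listed rates."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# Example: an 8K-token prompt with a 2K-token completion
print(f"${request_cost(8_000, 2_000):.6f}")  # $0.001500
```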
Price History
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|----------|---------------|----------------|-----------------|------------------|
| Novita   | deepseek/deepseek-r1-distill-qwen-14b | 64K  | $0.15 / 1M tokens | $0.15 / 1M tokens |
| GMICloud | deepseek/deepseek-r1-distill-qwen-14b | 131K | $0.15 / 1M tokens | $0.15 / 1M tokens |
| Together | deepseek/deepseek-r1-distill-qwen-14b | 131K | $1.60 / 1M tokens | $1.60 / 1M tokens |
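All providers above serve the model under the same endpoint name. As a sketch, assuming an OpenAI-compatible chat completions API (the base URL below is a placeholder, not a real gateway), a request payload can be built like this:

```python
import json

# Placeholder URL: substitute your provider's actual endpoint.
BASE_URL = "https://example-gateway.invalid/v1/chat/completions"
MODEL_ID = "deepseek/deepseek-r1-distill-qwen-14b"  # shared by the providers listed above

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat-completions payload for this model."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Solve: if x + 3 = 10, what is x?")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the provider's endpoint with your API key in the `Authorization` header; exact authentication details depend on the provider.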
Benchmark Results
| Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
|-----------|----------|-----------|------|------------|----------|------|----------|
Other Models by deepseek
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|-------|----------|--------|---------|------------|-------|---------|------|
| DeepSeek: DeepSeek V3.1 | Aug 21, 2025 | ~671B | 131K | Text → Text | ★★ | ★★★★★ | $$$ |
| DeepSeek: DeepSeek V3.1 Base | Aug 20, 2025 | ~671B | 163K | Text → Text | ★★ | ★ | $$ |
| DeepSeek: R1 Distill Qwen 7B (Unavailable) | May 30, 2025 | 7B | 131K | Text → Text | ★ | ★ | $$$$ |
| DeepSeek: DeepSeek R1 0528 Qwen3 8B | May 29, 2025 | 8B | 131K | Text → Text | ★★★ | ★★★ | $$ |
| DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K | Text → Text | ★★★ | ★★★ | $$$ |
| DeepSeek: DeepSeek Prover V2 | Apr 30, 2025 | ~671B | 131K | Text → Text | ★★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 Base (Unavailable) | Mar 29, 2025 | ~671B | 163K | Text → Text | ★ | ★ | $$$ |
| DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K | Text → Text | ★★★★ | ★★★★★ | $$ |
| DeepSeek: R1 Distill Llama 8B | Feb 07, 2025 | 8B | 32K | Text → Text | ★ | ★★ | $$ |
| DeepSeek: R1 Distill Qwen 1.5B (Unavailable) | Jan 31, 2025 | 1.5B | 131K | Text → Text | ★★★ | ★ | $$$ |
| DeepSeek: R1 Distill Qwen 32B | Jan 29, 2025 | 32B | 131K | Text → Text | ★ | ★★★★★ | $$$ |
| DeepSeek: R1 Distill Llama 70B | Jan 23, 2025 | 70B | 131K | Text → Text | ★★★ | ★★★★★ | $$ |
| DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K | Text → Text | ★★★ | ★★★★★ | $$$ |
| DeepSeek: DeepSeek V3 | Dec 26, 2024 | — | 163K | Text → Text | ★★★ | ★★★★★ | $$$ |