Author's Description
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

Other benchmark results include:

- AIME 2024 pass@1: 72.6
- MATH-500 pass@1: 94.3
- CodeForces Rating: 1691

The model is fine-tuned on DeepSeek R1's outputs, which gives it performance competitive with larger frontier models.
Performance Summary
DeepSeek R1 Distill Qwen 32B, a distilled model based on Qwen 2.5 32B and fine-tuned with DeepSeek R1 outputs, performs well on reliability and on several cognitive benchmarks. It has a 99% success rate, indicating consistent and stable operation. Speed is its weakest operational metric: it ranks in the 13th percentile, meaning longer response times than most peers. On cost, it falls in the 60th percentile, which is generally competitive.

In the benchmark results, the model is strongest in Coding (89th percentile accuracy) and Reasoning (85th percentile accuracy), with high scores in complex problem-solving and programming knowledge. Mathematics is also robust at the 80th percentile, and General Knowledge is above average at the 62nd percentile. Instruction Following and Ethics are moderate, at the 52nd and 36th percentiles respectively.

The key weakness is Hallucinations, where 60.0% accuracy places the model in the 11th percentile, indicating a tendency to generate information rather than acknowledge uncertainty.

Overall, DeepSeek R1 Distill Qwen 32B excels at tasks requiring logical deduction and structured problem-solving, while its speed and handling of uncertainty leave room for improvement.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.27 |
Completion | $0.27 |
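As a rough illustration of what the per-token rates above mean for a single request, the sketch below estimates cost from token counts. The function name `estimate_cost` and the example token counts are hypothetical; only the two $0.27 per 1M token rates come from the table, and actual billing may differ by provider.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens, taken from the Current Pricing table.
PROMPT_PRICE_PER_M = Decimal("0.27")
COMPLETION_PRICE_PER_M = Decimal("0.27")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper for illustration; not part of any provider API.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    prompt_cost = PROMPT_PRICE_PER_M * prompt_tokens / TOKENS_PER_M
    completion_cost = COMPLETION_PRICE_PER_M * completion_tokens / TOKENS_PER_M
    return prompt_cost + completion_cost


# Example: a 2,000-token prompt with an 8,000-token completion.
print(estimate_cost(2_000, 8_000))
```

Reasoning-style models tend to produce long completions, so in practice the completion term is likely to dominate the total.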
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
DeepInfra | deepseek/deepseek-r1-distill-qwen-32b | 131K | $0.27 / 1M tokens | $0.27 / 1M tokens |
Novita | deepseek/deepseek-r1-distill-qwen-32b | 64K | $0.27 / 1M tokens | $0.27 / 1M tokens |
GMICloud | deepseek/deepseek-r1-distill-qwen-32b | 131K | $0.27 / 1M tokens | $0.27 / 1M tokens |
Cloudflare | deepseek/deepseek-r1-distill-qwen-32b | 80K | $0.50 / 1M tokens | $4.88 / 1M tokens |
Nineteen | deepseek/deepseek-r1-distill-qwen-32b | 16K | $0.27 / 1M tokens | $0.27 / 1M tokens |
NextBit | deepseek/deepseek-r1-distill-qwen-32b | 32K | $0.29 / 1M tokens | $0.29 / 1M tokens |
Novita | deepseek/deepseek-r1-distill-qwen-32b | 64K | $0.30 / 1M tokens | $0.30 / 1M tokens |
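Because the listed endpoints differ in both context length and price, one way to compare them is to filter by the context a workload needs and then rank by a blended price. The sketch below is a minimal illustration under assumptions: the `Endpoint` class, the `cheapest_endpoint` function, and the `output_share` weighting are hypothetical, and only the provider names, context lengths, and prices come from the Available Endpoints listing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    provider: str
    context_k: int       # context length in thousands of tokens, as listed
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens


# Values copied from the Available Endpoints listing (Novita appears twice).
ENDPOINTS = [
    Endpoint("DeepInfra", 131, 0.27, 0.27),
    Endpoint("Novita", 64, 0.27, 0.27),
    Endpoint("GMICloud", 131, 0.27, 0.27),
    Endpoint("Cloudflare", 80, 0.50, 4.88),
    Endpoint("Nineteen", 16, 0.27, 0.27),
    Endpoint("NextBit", 32, 0.29, 0.29),
    Endpoint("Novita", 64, 0.30, 0.30),
]


def cheapest_endpoint(min_context_k: int,
                      output_share: float = 0.5) -> Optional[Endpoint]:
    """Return the lowest blended-price endpoint with enough context.

    output_share is the assumed fraction of tokens that are output tokens.
    Ties are resolved in favor of the earlier entry in ENDPOINTS.
    """
    candidates = [e for e in ENDPOINTS if e.context_k >= min_context_k]
    if not candidates:
        return None

    def blended(e: Endpoint) -> float:
        return (1 - output_share) * e.input_price + output_share * e.output_price

    return min(candidates, key=blended)


best = cheapest_endpoint(min_context_k=100)
print(best.provider if best else "no endpoint has enough context")
```

Price is only one axis: this ranking ignores latency, throughput, and availability, which the listing does not report per endpoint.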
Benchmark Results
Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
---|---|---|---|---|---|---|---|---|
Other Models by DeepSeek
Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
DeepSeek: DeepSeek V3.2 Exp | Sep 29, 2025 | n/a | 131K | Text input, Text output | ★★★ | ★★★★★ | $$$ |
DeepSeek: DeepSeek V3.1 Terminus | Sep 22, 2025 | ~671B | 131K | Text input, Text output | ★★★★ | ★★★★★ | $$$$ |
DeepSeek: DeepSeek V3.1 | Aug 21, 2025 | ~671B | 131K | Text input, Text output | ★★ | ★★★★ | $$$ |
DeepSeek: DeepSeek V3.1 Base (Unavailable) | Aug 20, 2025 | ~671B | 163K | Text input, Text output | ★ | ★ | $$ |
DeepSeek: R1 Distill Qwen 7B (Unavailable) | May 30, 2025 | 7B | 131K | Text input, Text output | ★ | ★ | $$$$ |
DeepSeek: DeepSeek R1 0528 Qwen3 8B | May 29, 2025 | 8B | 131K | Text input, Text output | ★★★ | ★★★ | $$ |
DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K | Text input, Text output | ★★★ | ★★★ | $$$ |
DeepSeek: DeepSeek Prover V2 | Apr 30, 2025 | ~671B | 131K | Text input, Text output | ★★ | ★★★★ | $$$$ |
DeepSeek: DeepSeek V3 Base (Unavailable) | Mar 29, 2025 | ~671B | 163K | Text input, Text output | ★ | ★ | $$$ |
DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K | Text input, Text output | ★★★★ | ★★★★★ | $$ |
DeepSeek: R1 Distill Llama 8B (Unavailable) | Feb 07, 2025 | 8B | 32K | Text input, Text output | ★ | ★★ | $$ |
DeepSeek: R1 Distill Qwen 1.5B (Unavailable) | Jan 31, 2025 | 1.5B | 131K | Text input, Text output | ★★★ | ★ | $$$ |
DeepSeek: R1 Distill Qwen 14B | Jan 29, 2025 | 14B | 32K | Text input, Text output | ★ | ★★ | $$$ |
DeepSeek: R1 Distill Llama 70B | Jan 23, 2025 | 70B | 131K | Text input, Text output | ★★★ | ★★★★★ | $$ |
DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K | Text input, Text output | ★★★ | ★★★★ | $$$ |
DeepSeek: DeepSeek V3 | Dec 26, 2024 | n/a | 163K | Text input, Text output | ★★★ | ★★★★ | $$$ |