Author's Description
DeepSeek R1 Distill Qwen 1.5B is a distilled large language model based on [Qwen 2.5 Math 1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It's a very small and efficient model which outperforms [GPT 4o 0513](/openai/gpt-4o-2024-05-13) on Math Benchmarks.

Other benchmark results include:

- AIME 2024 pass@1: 28.9
- AIME 2024 cons@64: 52.7
- MATH-500 pass@1: 83.9

The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.
Performance Summary
DeepSeek R1 Distill Qwen 1.5B is a distilled model based on Qwen 2.5 Math 1.5B and fine-tuned on DeepSeek R1 outputs. Its speed is moderate, ranking in the 39th percentile across five benchmarks, and its pricing is mid-range, in the 52nd percentile.

Despite its small size, the model performs strongly on math benchmarks, notably outperforming GPT-4o 0513. Reported results are AIME 2024 pass@1 at 28.9, AIME 2024 cons@64 at 52.7, and MATH-500 pass@1 at 83.9, which points to a clear strength in mathematical reasoning and problem-solving.

Its performance on general benchmarks is considerably lower:

- Instruction Following: 12.1% accuracy (24th percentile)
- Coding: 21.0% accuracy (17th percentile)
- General Knowledge: 45.1% accuracy (16th percentile)
- Email Classification: 29.0% accuracy (5th percentile)
- Ethics: 34.0% accuracy (13th percentile)

These results indicate a notable weakness in broader tasks and general-domain understanding. The model's primary strength is its specialized mathematical capability, achieved with remarkable efficiency for its size.
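The pass@1 and cons@64 figures above measure different things, which is why the two AIME 2024 numbers differ so much. The following is a minimal sketch of how such per-problem scores are commonly computed, assuming pass@1 is the fraction of sampled answers that are correct and cons@k is a majority vote over k sampled answers. The function names and the sample data are hypothetical and are not taken from the benchmark harness used for the numbers on this page.

```python
from collections import Counter

def pass_at_1(samples: list[str], reference: str) -> float:
    """Fraction of sampled answers that match the reference (assumed definition)."""
    return sum(answer == reference for answer in samples) / len(samples)

def cons_at_k(samples: list[str], reference: str) -> float:
    """1.0 if the most frequent sampled answer matches the reference, else 0.0."""
    majority_answer, _count = Counter(samples).most_common(1)[0]
    return 1.0 if majority_answer == reference else 0.0

# Hypothetical answers sampled for one problem whose correct answer is "42".
samples = ["42", "17", "42", "8", "42"]
print(pass_at_1(samples, "42"))  # share of individual samples that are correct
print(cons_at_k(samples, "42"))  # whether the majority answer is correct
```

A benchmark score would then be the average of these per-problem values. Majority voting can recover a correct answer even when many individual samples are wrong, so a cons@64 score above the pass@1 score on the same benchmark is expected.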
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.18 |
| Completion | $0.18 |
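Because both rates are quoted per 1M tokens, the cost of a single request is a simple proportion. Here is a small sketch of that arithmetic using the prices listed above. The function name and the token counts are illustrative assumptions, and real billing may round or count tokens differently.

```python
# Prices from the table above, in USD per 1M tokens.
PROMPT_PRICE_PER_M = 0.18
COMPLETION_PRICE_PER_M = 0.18

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = (prompt_tokens * PROMPT_PRICE_PER_M
             + completion_tokens * COMPLETION_PRICE_PER_M)
    return total / 1_000_000

# Example: a 2,000-token prompt with an 8,000-token reasoning-heavy completion.
print(f"${estimate_cost(2_000, 8_000):.4f}")
```

Reasoning-distilled models tend to produce long completions, so the completion side usually dominates the estimate even though both rates are the same here.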
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Together | deepseek/deepseek-r1-distill-qwen-1.5b | 131K | $0.18 / 1M tokens | $0.18 / 1M tokens |
Other Models by deepseek
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 Speciale | Dec 01, 2025 | — | 131K | Text input, text output | ★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3.2 | Dec 01, 2025 | — | 131K | Text input, text output | — | — | $$$ |
| DeepSeek: DeepSeek V3.2 Exp | Sep 29, 2025 | — | 131K | Text input, text output | ★★★ | ★★★★★ | $$$ |
| DeepSeek: DeepSeek V3.1 Terminus | Sep 22, 2025 | ~671B | 131K | Text input, text output | ★★★★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3.1 Terminus (exacto) | Sep 22, 2025 | ~671B | 131K | Text input, text output | — | — | $$$ |
| DeepSeek: DeepSeek V3.1 | Aug 21, 2025 | ~671B | 131K | Text input, text output | ★★ | ★★★★ | $$$ |
| DeepSeek: DeepSeek V3.1 Base (unavailable) | Aug 20, 2025 | ~671B | 163K | Text input, text output | ★ | ★ | $$ |
| DeepSeek: R1 Distill Qwen 7B (unavailable) | May 30, 2025 | 7B | 131K | Text input, text output | ★ | ★ | $$$$ |
| DeepSeek: DeepSeek R1 0528 Qwen3 8B | May 29, 2025 | 8B | 131K | Text input, text output | ★★★ | ★★★ | $$ |
| DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K | Text input, text output | ★★★ | ★★★ | $$$ |
| DeepSeek: DeepSeek Prover V2 | Apr 30, 2025 | ~671B | 131K | Text input, text output | ★★ | ★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 Base (unavailable) | Mar 29, 2025 | ~671B | 163K | Text input, text output | ★ | ★ | $$$ |
| DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K | Text input, text output | ★★★★ | ★★★★★ | $$ |
| DeepSeek: R1 Distill Llama 8B (unavailable) | Feb 07, 2025 | 8B | 32K | Text input, text output | ★ | ★★ | $$ |
| DeepSeek: R1 Distill Qwen 32B | Jan 29, 2025 | 32B | 131K | Text input, text output | ★ | ★★★★ | $$$ |
| DeepSeek: R1 Distill Qwen 14B | Jan 29, 2025 | 14B | 32K | Text input, text output | ★ | ★★ | $$$ |
| DeepSeek: R1 Distill Llama 70B | Jan 23, 2025 | 70B | 131K | Text input, text output | ★★★ | ★★★★★ | $$ |
| DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K | Text input, text output | ★★★ | ★★★★ | $$$ |
| DeepSeek: DeepSeek V3 | Dec 26, 2024 | — | 163K | Text input, text output | ★★★ | ★★★★ | $$$ |