DeepSeek: R1 Distill Qwen 32B

Modalities: text input → text output
Author's Description

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

Other benchmark results include:

- AIME 2024 pass@1: 72.6
- MATH-500 pass@1: 94.3
- CodeForces rating: 1691

Distillation from DeepSeek R1's outputs gives the model performance competitive with much larger frontier models.

Key Specifications
| Specification | Value |
|---|---|
| Cost | $$$ |
| Context | 131K tokens |
| Parameters | 32B |
| Released | Jan 29, 2025 |
Supported Parameters

This model supports the following parameters:

Stop, Top P, Seed, Min P, Frequency Penalty, Response Format, Max Tokens, Reasoning, Presence Penalty, Include Reasoning, Temperature
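As a rough illustration, here is a minimal sketch of a chat completion request exercising several of these parameters. It assumes an OpenAI-compatible gateway such as OpenRouter's `/api/v1/chat/completions` route; `min_p` and `include_reasoning` are gateway-specific extensions rather than standard OpenAI parameters, and exact support may vary by provider.

```python
import os
import requests

# Hedged sketch: assumes an OpenAI-compatible API (e.g. OpenRouter).
# Parameter names mirror the supported-parameter list above.
resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "deepseek/deepseek-r1-distill-qwen-32b",
        "messages": [{"role": "user", "content": "What is 17 * 23?"}],
        "temperature": 0.6,          # sampling temperature
        "top_p": 0.95,               # nucleus sampling
        "min_p": 0.05,               # gateway extension (assumption)
        "max_tokens": 1024,          # completion cap
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "seed": 42,                  # best-effort determinism
        "stop": ["</answer>"],       # example stop sequence
        "include_reasoning": True,   # request reasoning tokens (OpenRouter-style)
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```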
Features

This model supports the following features:

Response Format, Reasoning
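The Response Format feature corresponds to the `response_format` request parameter (e.g. `{"type": "json_object"}` in OpenAI-style APIs). To show how the Reasoning feature surfaces in a response, the fragment below continues the request sketch above. The `reasoning` field on the returned message is an assumption based on OpenRouter's response schema; other deployments instead inline `<think>…</think>` tags in the content, so both paths are handled.

```python
# Continues the request sketch above; `resp` is the HTTP response.
message = resp.json()["choices"][0]["message"]

# OpenRouter-style: reasoning returned as a separate field (assumption).
reasoning = message.get("reasoning")

# Fallback: some deployments inline the chain of thought in <think> tags.
content = message["content"]
if reasoning is None and "<think>" in content:
    reasoning, _, content = content.partition("</think>")
    reasoning = reasoning.replace("<think>", "").strip()

print("reasoning:", reasoning)
print("answer:", content.strip())
```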
Performance Summary

DeepSeek R1 Distill Qwen 32B, a distilled model based on Qwen 2.5 32B and fine-tuned on DeepSeek R1 outputs, shows a compelling performance profile. Response times are long, ranking in the 13th percentile for speed, but pricing is cost-effective (63rd percentile). Reliability is a standout: the model sits in the 100th percentile, delivering consistent, usable responses with minimal technical failures.

Benchmark performance is strongest in problem-solving domains. The model scored 93.0% on the Coding (Baseline) benchmark (94th percentile) and 94.0% on Reasoning (Baseline) (89th percentile), making it the most accurate model at its price point in both categories. General Knowledge (Baseline) was also strong at 98.5% accuracy (68th percentile). Results were more moderate for Ethics (Baseline) at 97.5% (39th percentile) and Email Classification (Baseline) at 96.0% (42nd percentile). Instruction Following (Baseline) was the relative weakness, at 51.5% accuracy (55th percentile) with very long run durations.

Overall, the model excels at complex problem-solving and programming tasks, pairing high accuracy with high reliability, though speed remains its main limitation.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.075 |
| Completion | $0.15 |
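Given these rates, per-request cost is simple arithmetic: token count divided by one million, times the per-million price. A small sketch using the default pricing above (the token counts in the example are made-up illustrative values):

```python
PROMPT_PRICE = 0.075 / 1_000_000      # $ per prompt token (from the table)
COMPLETION_PRICE = 0.15 / 1_000_000   # $ per completion token (from the table)

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one request at the default endpoint pricing."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# Example: a 4,000-token prompt with a 1,500-token completion
# (reasoning models tend to produce long completions).
print(f"${request_cost(4_000, 1_500):.6f}")  # -> $0.000525
```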


Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| DeepInfra | deepseek/deepseek-r1-distill-qwen-32b | 131K | $0.075 / 1M tokens | $0.15 / 1M tokens |
| Novita | deepseek/deepseek-r1-distill-qwen-32b | 64K | $0.30 / 1M tokens | $0.30 / 1M tokens |
| GMICloud | deepseek/deepseek-r1-distill-qwen-32b | 131K | $0.075 / 1M tokens | $0.15 / 1M tokens |
| Cloudflare | deepseek/deepseek-r1-distill-qwen-32b | 80K | $0.50 / 1M tokens | $4.88 / 1M tokens |
| Nineteen | deepseek/deepseek-r1-distill-qwen-32b | 16K | $0.075 / 1M tokens | $0.15 / 1M tokens |
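Since endpoints differ in both price and context window, a common pattern is to pick the cheapest provider that can actually fit the request. The sketch below hard-codes the table above (context windows rounded to the listed "K" figures); the selection logic is illustrative, not a documented routing API.

```python
# (provider, context window in tokens, $/1M input, $/1M output)
# taken from the endpoint table above.
ENDPOINTS = [
    ("DeepInfra",  131_000, 0.075, 0.15),
    ("Novita",      64_000, 0.30,  0.30),
    ("GMICloud",   131_000, 0.075, 0.15),
    ("Cloudflare",  80_000, 0.50,  4.88),
    ("Nineteen",    16_000, 0.075, 0.15),
]

def cheapest_endpoint(prompt_tokens: int, completion_tokens: int):
    """Cheapest provider whose context window fits prompt + completion."""
    needed = prompt_tokens + completion_tokens
    candidates = [
        (p_in * prompt_tokens / 1e6 + p_out * completion_tokens / 1e6, name)
        for name, ctx, p_in, p_out in ENDPOINTS
        if ctx >= needed
    ]
    if not candidates:
        raise ValueError("request exceeds every endpoint's context window")
    cost, name = min(candidates)
    return name, cost

# Example: a 50K-token prompt rules out Nineteen (16K context).
print(cheapest_endpoint(50_000, 2_000))  # -> ('DeepInfra', ~0.00405)
```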