Author's Description
DeepSeek-R1-Distill-Qwen-7B is a 7-billion-parameter dense language model distilled from DeepSeek-R1, using reasoning data generated by DeepSeek's larger, reinforcement-learning-trained models. The distillation transfers advanced reasoning, math, and code capabilities into a smaller, more efficient architecture based on Qwen2.5-Math-7B. The model performs strongly on mathematical benchmarks (92.8% pass@1 on MATH-500), coding tasks (Codeforces rating 1189), and general reasoning (49.1% pass@1 on GPQA Diamond), achieving accuracy competitive with much larger models at lower inference cost.
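The distillation described here is supervised fine-tuning on reasoning traces sampled from the larger teacher (hard-label distillation), not logit matching. A minimal sketch of that training objective, using NumPy with toy shapes in place of a real training framework (the function name and dimensions are illustrative, not from DeepSeek's code):

```python
import numpy as np

def sft_distill_loss(student_logits, teacher_token_ids):
    """Cross-entropy of the student's next-token predictions against
    tokens taken from teacher-generated reasoning traces, i.e. ordinary
    supervised fine-tuning on teacher data (hard-label distillation)."""
    # log-softmax over the vocabulary axis, numerically stabilized
    z = student_logits - student_logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    # log-probability the student assigns to each teacher token
    nll = -log_probs[np.arange(len(teacher_token_ids)), teacher_token_ids]
    return nll.mean()

# toy example: 4 token positions, vocabulary of 8 tokens
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))
teacher_tokens = np.array([1, 5, 2, 7])
loss = sft_distill_loss(logits, teacher_tokens)
```

Minimizing this loss pushes the student to reproduce the teacher's reasoning traces token by token, which is how the math and code behavior transfers without running RL on the 7B model itself.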
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
DeepSeek-R1-Distill-Qwen-7B stands out for operational efficiency, consistently ranking among the fastest models and offering highly competitive pricing across benchmarks. Distilled from DeepSeek-R1, this 7-billion-parameter model uses advanced reasoning data to strike a strong balance between capability and cost.

On accuracy, the picture is mixed. The model's published results are impressive: 92.8% pass@1 on MATH-500, a Codeforces rating of 1189, and 49.1% pass@1 on GPQA Diamond. Against the baseline benchmarks here, its clearest strength is Coding (Baseline), where it reaches 66.0% accuracy (32nd percentile) at efficient cost and duration. It scored 0.0%, however, on the Ethics, Email Classification, Reasoning, and General Knowledge baselines.

This gap suggests that the model has strong specialized capabilities but underperforms on these particular baseline evaluations of ethical understanding, classification, general reasoning, and knowledge recall. Further investigation into how those baseline tests are constructed, relative to the model's described strengths, would be worthwhile.
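The pass@1 figures cited above are commonly computed with the unbiased pass@k estimator from the HumanEval evaluation methodology: sample n completions per problem, count the c correct ones, and estimate the chance that at least one of k randomly chosen samples passes. A small sketch (the sampling counts are illustrative):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: of n sampled completions, c were
    correct. Returns the probability that at least one of k randomly
    chosen completions is correct: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # too few failures to fill k slots: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 16 samples per problem, 12 correct: pass@1 reduces to c/n
p1 = pass_at_k(16, 12, 1)  # 0.75
```

With k=1 the estimator is simply the fraction of correct samples, which is why pass@1 scores like 92.8% can be read as per-sample accuracy averaged over the benchmark.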
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.10 |
| Completion | $0.20 |
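At these rates, request cost is linear in token counts. A quick sketch of the arithmetic (the example token counts are made up):

```python
# Rates from the pricing table above, in dollars per 1M tokens
PROMPT_PER_M = 0.10
COMPLETION_PER_M = 0.20

def request_cost(prompt_tokens, completion_tokens):
    """Dollar cost of a single request at the listed rates."""
    return (prompt_tokens * PROMPT_PER_M
            + completion_tokens * COMPLETION_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 4,000-token reasoning-heavy completion
cost = request_cost(2_000, 4_000)  # ≈ $0.001
```

Note that reasoning-style models tend to emit long chains of thought, so completion tokens, billed at twice the prompt rate, usually dominate the bill.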
Price History
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| GMICloud | deepseek/deepseek-r1-distill-qwen-7b | 131K | $0.10 / 1M tokens | $0.20 / 1M tokens |
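Endpoints like this are typically exposed through an OpenAI-compatible chat-completions API, addressed by the model slug in the table. A sketch of building such a request payload; the base URL is a placeholder (substitute the provider's actual endpoint), and only the payload construction is shown:

```python
import json

# Hypothetical base URL -- replace with the provider's real endpoint.
BASE_URL = "https://api.example.com/v1/chat/completions"

def build_request(prompt, max_tokens=1024):
    """Build an OpenAI-compatible chat-completions payload for the
    model slug listed in the endpoints table."""
    return {
        "model": "deepseek/deepseek-r1-distill-qwen-7b",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = json.dumps(build_request("Prove that sqrt(2) is irrational."))
```

The payload would be POSTed to the endpoint with an `Authorization: Bearer <key>` header; exact authentication and extra parameters depend on the provider.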
Benchmark Results
| Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|
Other Models by deepseek
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| DeepSeek: Deepseek R1 0528 Qwen3 8B | May 29, 2025 | 8B | 131K | Text input, text output | ★ | ★★★★★ | $$$ |
| DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K | Text input, text output | ★ | ★★★★★ | $$$$$ |
| DeepSeek: DeepSeek Prover V2 | Apr 30, 2025 | ~671B | 131K | Text input, text output | ★★★★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K | Text input, text output | ★★★ | ★★★★★ | $$$ |
| DeepSeek: R1 Distill Llama 8B | Feb 07, 2025 | 8B | 32K | Text input, text output | ★ | ★★★ | $$ |
| DeepSeek: R1 Distill Qwen 1.5B | Jan 31, 2025 | 1.5B | 131K | Text input, text output | ★★★ | ★ | $$$ |
| DeepSeek: R1 Distill Qwen 32B | Jan 29, 2025 | 32B | 131K | Text input, text output | ★ | ★★★★★ | $$$ |
| DeepSeek: R1 Distill Qwen 14B | Jan 29, 2025 | 14B | 64K | Text input, text output | ★ | ★★★ | $$$ |
| DeepSeek: R1 Distill Llama 70B | Jan 23, 2025 | 70B | 131K | Text input, text output | ★ | ★★★★★ | $$$$ |
| DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K | Text input, text output | ★★ | ★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 | Dec 26, 2024 | — | 163K | Text input, text output | ★★★ | ★★★★ | $$$ |