DeepSeek: DeepSeek R1 0528 Qwen3 8B

Name: DeepSeek: DeepSeek R1 0528 Qwen3 8B
Brand: deepseek
Availability: OutOfStock
Rating: 2.5 (8 reviews)

Back

Text input Text output Unavailable

Author's Description

DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1 that taps more compute and smarter post-training tricks, pushing its reasoning and inference to the brink of flagship models like O3 and Gemini 2.5 Pro. It now tops math, programming, and logic leaderboards, showcasing a step-change in depth-of-thought. The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought into an 8 B-parameter form, beating standard Qwen3 8B by +10 pp and tying the 235 B “thinking” giant on AIME 2024.

Key Specifications

Cost

Context

131K

Parameters

Released

May 29, 2025

Speed

★★★

Ability

★★

Reliability

★★★★

Hugging Face

Supported Parameters

This model supports the following parameters:

Stop Include Reasoning Max Tokens Logit Bias Seed Reasoning Top P Min P Frequency Penalty Presence Penalty Temperature

Features

This model supports the following features:

Reasoning

Performance Summary

DeepSeek: Deepseek R1 0528 Qwen3 8B demonstrates moderate speed performance, ranking in the 25th percentile across benchmarks. It offers cost-effective solutions, placing in the 61st percentile for pricing. A standout feature is its exceptional reliability, achieving a 100% success rate with no technical failures. The model exhibits strong performance in several key areas. It achieved perfect accuracy in General Knowledge, notably being the most accurate model at its price point and among models of similar speed. Its Reasoning capabilities are also impressive, scoring 96.0% accuracy and ranking in the 89th percentile. Coding performance is solid at 89.0% accuracy. However, the model shows weaknesses in Hallucinations, with an 80.0% accuracy (24th percentile), indicating a tendency to provide answers rather than acknowledge uncertainty. Instruction Following also presents a mixed picture, with scores of 41.0% and 50.0% across two benchmarks, suggesting room for improvement in handling complex multi-step directives. Mathematics performance is respectable at 86.7%, though its duration for this benchmark is notably high.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.06
Completion	$0.09

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
Parasail	Parasail \| deepseek/deepseek-r1-0528-qwen3-8b	131K	$0.06 / 1M tokens	$0.09 / 1M tokens
Novita	Novita \| deepseek/deepseek-r1-0528-qwen3-8b	128K	$0.06 / 1M tokens	$0.09 / 1M tokens
Nineteen	Nineteen \| deepseek/deepseek-r1-0528-qwen3-8b	32K	$0.06 / 1M tokens	$0.09 / 1M tokens
Chutes	Chutes \| deepseek/deepseek-r1-0528-qwen3-8b	131K	$0.06 / 1M tokens	$0.09 / 1M tokens
Chutes	Chutes \| deepseek/deepseek-r1-0528-qwen3-8b	32K	$0.06 / 1M tokens	$0.09 / 1M tokens
Novita	Novita \| deepseek/deepseek-r1-0528-qwen3-8b	128K	$0.06 / 1M tokens	$0.09 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by deepseek

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
DeepSeek: DeepSeek V4 Pro Unavailable	Apr 23, 2026	~1.6T	1M	Text input Text output	—	—	$$$$$
DeepSeek: DeepSeek V4 Pro	Apr 23, 2026	~1.6T	1M	Text input Text output	★★	★★★★★	$$$
DeepSeek: DeepSeek V4 Flash Unavailable	Apr 23, 2026	~284B	1M	Text input Text output	—	—	$$
DeepSeek: DeepSeek V4 Flash	Apr 23, 2026	~284B	1M	Text input Text output	★★	★★★★★	$$$
DeepSeek: DeepSeek V3.2 Speciale Unavailable	Dec 01, 2025	—	131K	Text input Text output	★★	★★★★★	$$$$
DeepSeek: DeepSeek V3.2	Dec 01, 2025	—	131K	Text input Text output	—	—	$$$
DeepSeek: DeepSeek V3.2 Exp	Sep 29, 2025	—	131K	Text input Text output	★★★	★★★★	$$$
DeepSeek: DeepSeek V3.1 Terminus	Sep 22, 2025	~671B	131K	Text input Text output	★★★★	★★★★★	$$$
DeepSeek: DeepSeek V3.1 Terminus (exacto) Unavailable	Sep 22, 2025	~671B	131K	Text input Text output	—	—	$$$
DeepSeek: DeepSeek V3.1	Aug 21, 2025	~671B	131K	Text input Text output	★★	★★★★	$$$
DeepSeek: DeepSeek V3.1 Base Unavailable	Aug 20, 2025	~671B	163K	Text input Text output	★★	★	$$
DeepSeek: R1 Distill Qwen 7B Unavailable	May 30, 2025	7B	131K	Text input Text output	★	★	$$$
DeepSeek: R1 0528	May 28, 2025	~671B	128K	Text input Text output	★★★	★★★	$$$
DeepSeek: DeepSeek Prover V2 Unavailable	Apr 30, 2025	~671B	131K	Text input Text output	★★★	★★★★	$$$$
DeepSeek: DeepSeek V3 Base Unavailable	Mar 29, 2025	~671B	163K	Text input Text output	★	★	$$$
DeepSeek: DeepSeek V3 0324	Mar 24, 2025	~685B	163K	Text input Text output	★★★★	★★★★★	$$
DeepSeek: R1 Distill Llama 8B Unavailable	Feb 07, 2025	8B	32K	Text input Text output	★	★★	$$
DeepSeek: R1 Distill Qwen 1.5B Unavailable	Jan 31, 2025	5B	131K	Text input Text output	★★★	★	$$$
DeepSeek: R1 Distill Qwen 32B Unavailable	Jan 29, 2025	32B	131K	Text input Text output	★	★★★★	$$$
DeepSeek: R1 Distill Qwen 14B Unavailable	Jan 29, 2025	14B	32K	Text input Text output	★	★★	$$$
DeepSeek: R1 Distill Llama 70B	Jan 23, 2025	70B	131K	Text input Text output	★★★	★★★★	$$
DeepSeek: R1	Jan 20, 2025	~671B	128K	Text input Text output	★★★★	★★★★	$$$
DeepSeek: DeepSeek V3	Dec 26, 2024	—	163K	Text input Text output	★★★★	★★★★	$$$