DeepSeek: Deepseek R1 0528 Qwen3 8B

Text input Text output Free Option
Author's Description

DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1 that taps more compute and smarter post-training tricks, pushing its reasoning and inference to the brink of flagship models like O3 and Gemini 2.5 Pro. It now tops math, programming, and logic leaderboards, showcasing a step-change in depth-of-thought. The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought into an 8 B-parameter form, beating standard Qwen3 8B by +10 pp and tying the 235 B “thinking” giant on AIME 2024.

Key Specifications
Cost
$$
Context
131K
Parameters
8B
Released
May 29, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Stop Top P Min P Seed Frequency Penalty Max Tokens Reasoning Presence Penalty Include Reasoning Logit Bias Temperature
Features

This model supports the following features:

Reasoning
Performance Summary

DeepSeek: Deepseek R1 0528 Qwen3 8B demonstrates moderate speed performance, ranking in the 31st percentile, indicating it is not among the fastest models available. However, it offers generally cost-effective solutions, placing in the 63rd percentile for price competitiveness. A standout feature is its exceptional reliability, achieving a perfect 100% success rate across all benchmarks, signifying consistent and dependable operation. In terms of benchmark performance, the model exhibits a mixed profile. It shows strong capabilities in Reasoning (98% accuracy, 96th percentile), where it is noted as the most accurate model at its price point. It also achieves perfect accuracy in General Knowledge (100%), ranking among the top 3 and being the most accurate among models of comparable speed and price. Coding performance is solid at 89% accuracy (75th percentile). However, its Instruction Following capabilities are less consistent, with accuracies of 50% and 41%, and notably high durations, indicating a potential area for improvement in processing complex instructions efficiently. Ethics and Email Classification show moderate accuracy at 98% and 95% respectively, though their percentile rankings are lower. Overall, its key strengths lie in complex reasoning and general knowledge, while speed and nuanced instruction following present opportunities for enhancement.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.05
Completion $0.1

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Parasail
Parasail | deepseek/deepseek-r1-0528-qwen3-8b 131K $0.05 / 1M tokens $0.1 / 1M tokens
Novita
Novita | deepseek/deepseek-r1-0528-qwen3-8b 128K $0.06 / 1M tokens $0.09 / 1M tokens
Nineteen
Nineteen | deepseek/deepseek-r1-0528-qwen3-8b 32K $0.017 / 1M tokens $0.0682 / 1M tokens
Chutes
Chutes | deepseek/deepseek-r1-0528-qwen3-8b 131K $0.017 / 1M tokens $0.0682 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by deepseek