Author's Description
DeepSeek R1 Distill Llama 8B is a distilled large language model based on [Llama-3.1-8B-Instruct](/meta-llama/llama-3.1-8b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including: - AIME 2024 pass@1: 50.4 - MATH-500 pass@1: 89.1 - CodeForces Rating: 1205 The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models. Hugging Face: - [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) - [DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) |
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
DeepSeek R1 Distill Llama 8B, created on February 7, 2025, is a distilled large language model based on Llama-3.1-8B-Instruct, leveraging outputs from DeepSeek R1. This model consistently offers among the most competitive pricing, ranking in the 81st percentile across benchmarks. However, it tends to have longer response times, placing it in the 9th percentile for speed. The model demonstrates strong reliability, with an 88th percentile ranking, indicating few technical issues and consistent provision of usable responses. In terms of performance across categories, the model shows a mixed profile. It achieves strong accuracy in Coding (81.0%, 54th percentile) and Reasoning (66.0%, 61st percentile), suggesting a solid capability in these areas. Its performance in Ethics (74.0% accuracy, 21st percentile), Instruction Following (25.3% accuracy, 25th percentile), Email Classification (92.0% accuracy, 22nd percentile), and General Knowledge (72.8% accuracy, 24th percentile) is notably lower, indicating areas for potential improvement. Despite these lower percentile rankings, the model's high accuracy in Email Classification (92.0%) is a notable strength, even if its relative ranking is lower. Its key strengths lie in its cost-effectiveness and reliability, coupled with competitive performance in coding and reasoning tasks. The primary weakness is its slower processing speed and lower relative accuracy in instruction following, ethics, and general knowledge benchmarks.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.04 |
Completion | $0.04 |
Price History
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
Novita
|
Novita | deepseek/deepseek-r1-distill-llama-8b | 32K | $0.04 / 1M tokens | $0.04 / 1M tokens |
Benchmark Results
Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
---|
Other Models by deepseek
|
Released | Params | Context |
|
Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
DeepSeek: DeepSeek V3.1 | Aug 21, 2025 | ~671B | 131K |
Text input
Text output
|
★★ | ★★★★★ | $$$ |
DeepSeek: DeepSeek V3.1 Base | Aug 20, 2025 | ~671B | 163K |
Text input
Text output
|
★★ | ★ | $$ |
DeepSeek: R1 Distill Qwen 7B Unavailable | May 30, 2025 | 7B | 131K |
Text input
Text output
|
★ | ★ | $$$$ |
DeepSeek: Deepseek R1 0528 Qwen3 8B | May 29, 2025 | 8B | 131K |
Text input
Text output
|
★★★ | ★★★ | $$ |
DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K |
Text input
Text output
|
★★★ | ★★★ | $$$ |
DeepSeek: DeepSeek Prover V2 | Apr 30, 2025 | ~671B | 131K |
Text input
Text output
|
★★ | ★★★★★ | $$$$ |
DeepSeek: DeepSeek V3 Base Unavailable | Mar 29, 2025 | ~671B | 163K |
Text input
Text output
|
★ | ★ | $$$ |
DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K |
Text input
Text output
|
★★★★ | ★★★★★ | $$ |
DeepSeek: R1 Distill Qwen 1.5B Unavailable | Jan 31, 2025 | 5B | 131K |
Text input
Text output
|
★★★ | ★ | $$$ |
DeepSeek: R1 Distill Qwen 32B | Jan 29, 2025 | 32B | 131K |
Text input
Text output
|
★ | ★★★★★ | $$$ |
DeepSeek: R1 Distill Qwen 14B | Jan 29, 2025 | 14B | 64K |
Text input
Text output
|
★ | ★★ | $$$ |
DeepSeek: R1 Distill Llama 70B | Jan 23, 2025 | 70B | 131K |
Text input
Text output
|
★★★ | ★★★★★ | $$ |
DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K |
Text input
Text output
|
★★★ | ★★★★★ | $$$ |
DeepSeek: DeepSeek V3 | Dec 26, 2024 | — | 163K |
Text input
Text output
|
★★★ | ★★★★★ | $$$ |