Author's Description
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). The distillation process yields high performance across multiple benchmarks, including:

- AIME 2024 pass@1: 70.0
- MATH-500 pass@1: 94.5
- CodeForces Rating: 1633

Fine-tuning on DeepSeek R1's outputs gives the model performance competitive with much larger frontier models.
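The model is served under the slug `deepseek/deepseek-r1-distill-llama-70b` (see the endpoints table below). A minimal sketch of building a request body for an OpenAI-compatible chat-completions endpoint follows; the base URL, auth header, and exact API shape depend on the provider you choose, so treat this as an illustration rather than any one provider's documented API:

```python
import json

# Model slug as listed in the endpoints table below.
MODEL_SLUG = "deepseek/deepseek-r1-distill-llama-70b"

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Return a JSON-serializable chat-completions payload.

    The payload shape follows the common OpenAI-compatible convention
    (a `model` string plus a `messages` list of role/content pairs);
    individual providers may accept additional parameters.
    """
    return {
        "model": MODEL_SLUG,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Prove that the square root of 2 is irrational.")
print(json.dumps(payload, indent=2))
```

You would POST this body (JSON-encoded, with your provider's API key in the `Authorization` header) to the provider's chat-completions URL.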
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
DeepSeek R1 Distill Llama 70B demonstrates strong performance across a range of benchmarks, particularly in specialized areas. While its speed ranks among the slower models (6th percentile across benchmarks), its pricing is competitive (47th percentile).

The model shows exceptional capability in General Knowledge and Reasoning, achieving 99.8% and 84.0% accuracy respectively, placing it in the 90th and 84th percentiles for these categories. Its Ethics performance is also robust at 99.0% accuracy. In Code Generation it achieves a commendable 87.0% accuracy, supported by impressive external results such as a 1633 CodeForces rating and 94.5% on MATH-500. Its 70.0% pass@1 on AIME 2024 further highlights its advanced problem-solving skills.

Its notable weakness is speed, consistently ranking in the lower percentiles for duration across all benchmarks. Despite this, competitive pricing and high accuracy in complex domains, especially those requiring deep understanding and logical inference, position DeepSeek R1 Distill Llama 70B as a powerful and cost-effective choice for tasks demanding high-quality output, particularly technical and knowledge-intensive applications.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.05 |
| Completion | $0.05 |
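At a flat $0.05 per million tokens for both prompt and completion, the cost of a request can be estimated directly from token counts. A minimal sketch (the token counts themselves would come from the provider's usage metadata):

```python
# Flat per-million-token rates from the pricing table above.
PROMPT_PRICE_PER_M = 0.05
COMPLETION_PRICE_PER_M = 0.05

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from token counts."""
    return (prompt_tokens * PROMPT_PRICE_PER_M
            + completion_tokens * COMPLETION_PRICE_PER_M) / 1_000_000

# Example: 2,000 prompt tokens and 8,000 completion tokens
# (reasoning models emit long chains of thought, so completion
# tokens often dominate the bill).
cost = estimate_cost(2_000, 8_000)
print(f"${cost:.6f}")  # → $0.000500
```

Because prompt and completion rates are identical here, the estimate reduces to total tokens × $0.05 / 1M; the two rates are kept separate so the sketch carries over to models with asymmetric pricing.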
Price History
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| DeepInfra | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| InferenceNet | deepseek/deepseek-r1-distill-llama-70b | 128K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Lambda | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Phala | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| GMICloud | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Nebius | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| SambaNova | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Groq | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Novita | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Together | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Cerebras | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.05 / 1M tokens | $0.05 / 1M tokens |
| Chutes | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.05 / 1M tokens | $0.05 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|
Other Models by deepseek
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| DeepSeek: R1 Distill Qwen 7B | May 30, 2025 | 7B | 131K | Text input / Text output | ★ | ★ | $$$$ |
| DeepSeek: Deepseek R1 0528 Qwen3 8B | May 29, 2025 | 8B | 131K | Text input / Text output | ★ | ★★★★★ | $$$ |
| DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K | Text input / Text output | ★ | ★★★★★ | $$$$$ |
| DeepSeek: DeepSeek Prover V2 | Apr 30, 2025 | ~671B | 131K | Text input / Text output | ★★★★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K | Text input / Text output | ★★★ | ★★★★★ | $$$ |
| DeepSeek: R1 Distill Llama 8B | Feb 07, 2025 | 8B | 32K | Text input / Text output | ★ | ★★★ | $$ |
| DeepSeek: R1 Distill Qwen 1.5B | Jan 31, 2025 | 1.5B | 131K | Text input / Text output | ★★★ | ★ | $$$ |
| DeepSeek: R1 Distill Qwen 32B | Jan 29, 2025 | 32B | 131K | Text input / Text output | ★ | ★★★★★ | $$$ |
| DeepSeek: R1 Distill Qwen 14B | Jan 29, 2025 | 14B | 64K | Text input / Text output | ★ | ★★★ | $$$ |
| DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K | Text input / Text output | ★★ | ★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 | Dec 26, 2024 | — | 163K | Text input / Text output | ★★★ | ★★★★ | $$$ |