Author's Description
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across...
Performance Summary
DeepSeek R1 Distill Llama 70B, released on January 23, 2025, is a distilled large language model based on Llama-3.3-70B-Instruct and trained on outputs from DeepSeek R1, with a context length of 131,072 tokens. The model shows moderate speed (20th percentile across benchmarks) and competitive pricing (51st percentile). Its reliability is exceptional: a 97% success rate across 9 benchmarks, indicating minimal technical failures.

In terms of accuracy, the model is strong in General Knowledge (99.8%, 80th percentile) and Reasoning (84.0%, 66th percentile), and it achieved perfect accuracy in one Instruction Following run, demonstrating precise adherence to complex directives. Coding performance is solid at 87.0% (55th percentile), and Ethics remains high at 99.0% (54th percentile). A key strength is its performance on specialized benchmarks: 94.5% pass@1 on MATH-500, 70.0% pass@1 on AIME 2024, and a CodeForces rating of 1633, highlighting its mathematical and algorithmic ability.

Its main weaknesses are Hallucinations (Baseline) at 90.0% accuracy, which, while decent, leaves room for improvement in acknowledging uncertainty, and Mathematics (Baseline) at 79.0% (34th percentile), where it is not a top performer.
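The pass@1 figures above (MATH-500, AIME 2024) are typically computed with the unbiased pass@k estimator introduced in the Codex paper; the exact evaluation harness used here is not specified, so this is a general sketch of that estimator, not the benchmark's own code:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem, given n sampled
    solutions of which c are correct: 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples: every size-k draw contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k = 1 this reduces to the plain per-problem success rate c / n;
# the benchmark score is the mean of this estimate over all problems.
print(pass_at_k(10, 7, 1))
```

For k = 1 with a single sample per problem, the estimator is simply the fraction of problems solved on the first attempt.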
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.70 |
| Completion | $0.80 |
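At these rates, the cost of a request is a straightforward per-token calculation; a minimal sketch, assuming the listed prices of $0.70 per 1M prompt tokens and $0.80 per 1M completion tokens:

```python
# Listed rates for DeepSeek R1 Distill Llama 70B, expressed per token.
PROMPT_RATE = 0.70 / 1_000_000
COMPLETION_RATE = 0.80 / 1_000_000

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the dollar cost of a single request at the listed rates."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. a 10,000-token prompt with a 2,000-token completion:
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0086
```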
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| DeepInfra | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| InferenceNet | deepseek/deepseek-r1-distill-llama-70b | 128K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Lambda | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Phala | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| GMICloud | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Nebius | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| SambaNova | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Groq | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Novita | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Together | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Cerebras | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Chutes | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $0.80 / 1M tokens |
| Novita | deepseek/deepseek-r1-distill-llama-70b | 8K | $0.80 / 1M tokens | $0.80 / 1M tokens |
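Every endpoint above serves the same model slug, so a single request shape works regardless of which provider handles it. A minimal sketch of building such a request body, assuming an OpenAI-compatible chat-completions gateway (the URL shown is illustrative, and sending the request is left out):

```python
import json

# Assumed gateway endpoint; actual routing to a provider is handled server-side.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build the JSON body for a single-turn chat completion
    against the deepseek/deepseek-r1-distill-llama-70b slug."""
    return {
        "model": "deepseek/deepseek-r1-distill-llama-70b",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_request("Prove that the square root of 2 is irrational.")
print(json.dumps(body, indent=2))
```

Note that context length varies by provider (8K–131K), so prompt plus `max_tokens` must fit within the limit of whichever endpoint serves the request.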
Other Models by deepseek
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| DeepSeek: DeepSeek V3.2 Speciale | Dec 01, 2025 | — | 131K | Text input / Text output | ★★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3.2 | Dec 01, 2025 | — | 131K | Text input / Text output | — | — | $$$ |
| DeepSeek: DeepSeek V3.2 Exp | Sep 29, 2025 | — | 131K | Text input / Text output | ★★★ | ★★★★ | $$$ |
| DeepSeek: DeepSeek V3.1 Terminus | Sep 22, 2025 | ~671B | 131K | Text input / Text output | ★★★★ | ★★★★★ | $$$$ |
| DeepSeek: DeepSeek V3.1 Terminus (exacto) (Unavailable) | Sep 22, 2025 | ~671B | 131K | Text input / Text output | — | — | $$$ |
| DeepSeek: DeepSeek V3.1 | Aug 21, 2025 | ~671B | 131K | Text input / Text output | ★★ | ★★★★ | $$$ |
| DeepSeek: DeepSeek V3.1 Base (Unavailable) | Aug 20, 2025 | ~671B | 163K | Text input / Text output | ★★ | ★ | $$ |
| DeepSeek: R1 Distill Qwen 7B (Unavailable) | May 30, 2025 | 7B | 131K | Text input / Text output | ★ | ★ | $$$ |
| DeepSeek: DeepSeek R1 0528 Qwen3 8B (Unavailable) | May 29, 2025 | 8B | 131K | Text input / Text output | ★★★ | ★★★ | $$ |
| DeepSeek: R1 0528 | May 28, 2025 | ~671B | 128K | Text input / Text output | ★★★ | ★★★ | $$$ |
| DeepSeek: DeepSeek Prover V2 (Unavailable) | Apr 30, 2025 | ~671B | 131K | Text input / Text output | ★★★ | ★★★★ | $$$$ |
| DeepSeek: DeepSeek V3 Base (Unavailable) | Mar 29, 2025 | ~671B | 163K | Text input / Text output | ★ | ★ | $$$ |
| DeepSeek: DeepSeek V3 0324 | Mar 24, 2025 | ~685B | 163K | Text input / Text output | ★★★★ | ★★★★★ | $$ |
| DeepSeek: R1 Distill Llama 8B (Unavailable) | Feb 07, 2025 | 8B | 32K | Text input / Text output | ★ | ★★ | $$ |
| DeepSeek: R1 Distill Qwen 1.5B (Unavailable) | Jan 31, 2025 | 1.5B | 131K | Text input / Text output | ★★★ | ★ | $$$ |
| DeepSeek: R1 Distill Qwen 32B | Jan 29, 2025 | 32B | 131K | Text input / Text output | ★ | ★★★★ | $$$ |
| DeepSeek: R1 Distill Qwen 14B (Unavailable) | Jan 29, 2025 | 14B | 32K | Text input / Text output | ★ | ★★ | $$$ |
| DeepSeek: R1 | Jan 20, 2025 | ~671B | 128K | Text input / Text output | ★★★★ | ★★★★ | $$$ |
| DeepSeek: DeepSeek V3 | Dec 26, 2024 | — | 163K | Text input / Text output | ★★★★ | ★★★★ | $$$ |