DeepSeek: R1 Distill Llama 70B

Text input · Text output · Free option available
Author's Description

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), fine-tuned on outputs from [DeepSeek R1](/deepseek/deepseek-r1). The distillation yields high performance across multiple benchmarks, including:

- AIME 2024 pass@1: 70.0
- MATH-500 pass@1: 94.5
- CodeForces rating: 1633

Fine-tuning on DeepSeek R1's outputs enables performance competitive with larger frontier models.
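The pass@1 figures above are conventionally computed with the unbiased pass@k estimator introduced alongside HumanEval (Chen et al., 2021); the minimal sketch below assumes that convention applies to these benchmarks as well.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).
    n = total samples generated per problem, c = samples that pass, k = budget."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 the estimator reduces to the raw success rate c / n:
print(pass_at_k(n=100, c=70, k=1))  # 0.7, i.e. a 70.0 pass@1
```
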

Key Specifications

| Spec | Value |
|------------|--------------|
| Cost | $$ |
| Context | 131K tokens |
| Parameters | 70B |
| Released | Jan 23, 2025 |

Supported Parameters

This model supports the following parameters:

Include Reasoning, Response Format, Stop, Max Tokens, Top P, Frequency Penalty, Reasoning, Min P, Seed, Temperature, Presence Penalty
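
As an illustration, here is a minimal sketch of passing these parameters through an OpenAI-compatible client. The base URL, model slug handling, and the `min_p`/`include_reasoning` pass-through are assumptions modeled on OpenRouter-style gateways, not documented behavior of this page.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # assumed OpenRouter-style gateway
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="deepseek/deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=1024,
    temperature=0.6,
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    seed=42,                    # for more repeatable sampling
    stop=["</answer>"],         # illustrative stop sequence
    # Parameters without native OpenAI SDK fields go through extra_body:
    extra_body={"min_p": 0.01, "include_reasoning": True},
)
print(resp.choices[0].message.content)
```
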
Features

This model supports the following features:

Reasoning, Response Format
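
Reusing the client from the sketch above, the two features can be combined: JSON-mode output via `response_format`, with the reasoning trace requested alongside it. The `reasoning` attribute on the message follows common gateway conventions and is an assumption, not documented behavior.

```python
resp = client.chat.completions.create(
    model="deepseek/deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": 'Reply with JSON: {"answer": <int>} for 17 * 23.'}],
    response_format={"type": "json_object"},   # structured output
    extra_body={"include_reasoning": True},    # ask for the reasoning trace
)
print(resp.choices[0].message.content)                      # the JSON answer
print(getattr(resp.choices[0].message, "reasoning", None))  # reasoning trace, if exposed
```
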
Performance Summary

DeepSeek R1 Distill Llama 70B shows a strong overall profile, anchored by reliability and specific academic benchmarks. Its speed ranking sits in the 18th percentile, indicating generally longer response times, while its pricing is competitive at the 47th percentile. Reliability is a standout: a 97% success rate across benchmarks points to consistently usable outputs.

The model excels in specialized areas, scoring 70.0 pass@1 on AIME 2024 and 94.5 pass@1 on MATH-500, alongside a CodeForces rating of 1633. Benchmark accuracy is high in General Knowledge (99.8%) and Reasoning (84.0%), with solid results in Ethics (99.0%) and Coding (87.0%), and a perfect 100.0% on one Instruction Following benchmark, achieved with a fast completion time. Weaker spots are Hallucinations (90.0% accuracy) and Mathematics (79.0%, in the lower half of models). The model's primary weakness remains speed, with duration rankings in the lower percentiles across most benchmarks.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|------------|-----------------------|
| Prompt | $0.50 |
| Completion | $1.00 |
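
As a worked example of the list price above, per-request cost is linear in token counts:

```python
# List price: $0.50 per 1M prompt tokens, $1.00 per 1M completion tokens.
PROMPT_PRICE = 0.50 / 1_000_000      # $ per prompt token
COMPLETION_PRICE = 1.00 / 1_000_000  # $ per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# e.g. a 2,000-token prompt with an 8,000-token reasoning-heavy reply:
print(f"${request_cost(2_000, 8_000):.4f}")  # $0.0090
```
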


Available Endpoints

| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|--------------|----------------------------------------|------|--------------------|--------------------|
| DeepInfra | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.50 / 1M tokens | $1.00 / 1M tokens |
| InferenceNet | deepseek/deepseek-r1-distill-llama-70b | 128K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Lambda | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Phala | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| GMICloud | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Nebius | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| SambaNova | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.70 / 1M tokens | $1.40 / 1M tokens |
| Groq | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Novita | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Together | deepseek/deepseek-r1-distill-llama-70b | 131K | $2.00 / 1M tokens | $2.00 / 1M tokens |
| Cerebras | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Chutes | deepseek/deepseek-r1-distill-llama-70b | 131K | $0.03 / 1M tokens | $0.13 / 1M tokens |
| Novita | deepseek/deepseek-r1-distill-llama-70b | 32K | $0.80 / 1M tokens | $0.80 / 1M tokens |
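
Since endpoint pricing varies by more than an order of magnitude, a routing preference can be worth setting. The sketch below reuses the client from earlier and assumes an OpenRouter-style `provider` preference object; the field shape is an assumption, not documented behavior of this page.

```python
resp = client.chat.completions.create(
    model="deepseek/deepseek-r1-distill-llama-70b",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "provider": {
            "order": ["Groq", "Lambda"],  # try low-cost endpoints from the table first
            "allow_fallbacks": True,      # fall back to other providers if unavailable
        }
    },
)
print(resp.choices[0].message.content)
```
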