Meta: Llama 3.3 70B Instruct

Text input · Text output · Free option available
Author's Description

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

Key Specifications

- Cost: $
- Context: 131K
- Parameters: 70B
- Released: Dec 06, 2024
Supported Parameters

This model supports the following parameters:

Stop, Presence Penalty, Tool Choice, Top P, Temperature, Seed, Min P, Tools, Response Format, Frequency Penalty, Max Tokens
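These parameters map onto a standard OpenAI-compatible chat completions request. The sketch below is illustrative rather than taken from this page: the base URL, API key variable, and passing Min P via `extra_body` are assumptions about how a typical gateway exposes these fields.

```python
# Minimal sketch, assuming an OpenAI-compatible gateway for this model.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # assumed gateway URL; substitute your provider's
    api_key=os.environ["API_KEY"],            # assumed environment variable name
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Summarize Llama 3.3 in one sentence."}],
    # Sampling and control parameters listed above:
    temperature=0.7,
    top_p=0.9,
    max_tokens=256,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    stop=["\n\nUser:"],
    seed=42,
    # Min P is not part of the standard OpenAI schema; some gateways accept it
    # as an extra request field (assumption):
    extra_body={"min_p": 0.05},
)
print(response.choices[0].message.content)
```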
Features

This model supports the following features:

Tools, Response Format
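Both features follow the usual OpenAI-style request shape. The sketch below reuses the assumed `client` from the previous example; the `get_weather` tool is a hypothetical placeholder introduced purely for illustration.

```python
# 1) Tools: declare a hypothetical function and let the model decide whether to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
tool_resp = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
    tool_choice="auto",
)
print(tool_resp.choices[0].message.tool_calls)

# 2) Response Format: request a JSON object instead of free text.
json_resp = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "List three EU capitals as a JSON object."}],
    response_format={"type": "json_object"},
)
print(json_resp.choices[0].message.content)
```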
Performance Summary

The Meta Llama 3.3 70B Instruct model demonstrates strong overall performance, particularly in cost-efficiency and reliability. Its pricing is consistently among the most competitive, ranking in the 85th percentile across benchmarks, and its reliability is exceptionally high, with a 92% success rate of consistent, usable responses. Its speed is above average (60th percentile), though durations vary across individual benchmarks.

In terms of specific capabilities, Llama 3.3 70B Instruct is notably strong at Instruction Following, reaching perfect accuracy on one benchmark and high accuracy on another while often ranking among the fastest models for this task. It also achieves perfect accuracy in Email Classification at low cost. Ethics and General Knowledge are further strengths, at 99% and 98% accuracy respectively. Reasoning is moderate at 58% accuracy, and Coding is a notable weakness at 37% accuracy, placing the model in the lower percentiles for that category. Overall, the model is well suited to multilingual dialogue and to tasks requiring precise instruction adherence and classification, while coding proficiency remains an area for improvement.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.038 |
| Completion | $0.12 |
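As a quick worked example of these base rates (not part of the original listing), per-request cost is simply prompt tokens times the prompt rate plus completion tokens times the completion rate:

```python
# Back-of-envelope cost estimate at the listed base rates
# ($0.038 per 1M prompt tokens, $0.12 per 1M completion tokens).
PROMPT_RATE = 0.038 / 1_000_000      # USD per prompt token
COMPLETION_RATE = 0.12 / 1_000_000   # USD per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. 2,000 prompt tokens + 500 completion tokens ≈ $0.000136
print(f"${request_cost(2_000, 500):.6f}")
```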

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| DeepInfra | meta-llama/llama-3.3-70b-instruct | 131K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Kluster | meta-llama/llama-3.3-70b-instruct | 131K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Lambda | meta-llama/llama-3.3-70b-instruct | 131K | $0.12 / 1M tokens | $0.30 / 1M tokens |
| Phala | meta-llama/llama-3.3-70b-instruct | 131K | $0.10 / 1M tokens | $0.25 / 1M tokens |
| Novita | meta-llama/llama-3.3-70b-instruct | 131K | $0.13 / 1M tokens | $0.39 / 1M tokens |
| Crusoe | meta-llama/llama-3.3-70b-instruct | 131K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Nebius | meta-llama/llama-3.3-70b-instruct | 131K | $0.13 / 1M tokens | $0.40 / 1M tokens |
| DeepInfra | meta-llama/llama-3.3-70b-instruct | 131K | $0.23 / 1M tokens | $0.40 / 1M tokens |
| Parasail | meta-llama/llama-3.3-70b-instruct | 131K | $0.15 / 1M tokens | $0.50 / 1M tokens |
| NextBit | meta-llama/llama-3.3-70b-instruct | 32K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Cloudflare | meta-llama/llama-3.3-70b-instruct | 24K | $0.29 / 1M tokens | $2.25 / 1M tokens |
| Cent-ML | meta-llama/llama-3.3-70b-instruct | 131K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| InoCloud | meta-llama/llama-3.3-70b-instruct | 131K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Hyperbolic | meta-llama/llama-3.3-70b-instruct | 131K | $0.40 / 1M tokens | $0.40 / 1M tokens |
| Atoma | meta-llama/llama-3.3-70b-instruct | 104K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Groq | meta-llama/llama-3.3-70b-instruct | 131K | $0.59 / 1M tokens | $0.79 / 1M tokens |
| Friendli | meta-llama/llama-3.3-70b-instruct | 131K | $0.60 / 1M tokens | $0.60 / 1M tokens |
| SambaNova | meta-llama/llama-3.3-70b-instruct | 131K | $0.60 / 1M tokens | $1.20 / 1M tokens |
| Google | meta-llama/llama-3.3-70b-instruct | 128K | $0.72 / 1M tokens | $0.72 / 1M tokens |
| Cerebras | meta-llama/llama-3.3-70b-instruct | 131K | $0.85 / 1M tokens | $1.20 / 1M tokens |
| Together | meta-llama/llama-3.3-70b-instruct | 131K | $0.88 / 1M tokens | $0.88 / 1M tokens |
| Fireworks | meta-llama/llama-3.3-70b-instruct | 131K | $0.90 / 1M tokens | $0.90 / 1M tokens |
| InferenceNet | meta-llama/llama-3.3-70b-instruct | 128K | $0.038 / 1M tokens | $0.12 / 1M tokens |
| Crusoe | meta-llama/llama-3.3-70b-instruct | 131K | $0.039 / 1M tokens | $0.12 / 1M tokens |
| GMICloud | meta-llama/llama-3.3-70b-instruct | 131K | $0.25 / 1M tokens | $0.75 / 1M tokens |