Meta: Llama 3.1 405B Instruct

Text input · Text output · Free option available
Author's Description

The highly anticipated 400B class of Llama 3 is here! Clocking in at 128K context with impressive eval scores, the Meta AI team continues to push the frontier of open-source LLMs. Meta's latest model class (Llama 3.1) launched in a variety of sizes and flavors. This 405B instruct-tuned version is optimized for high-quality dialogue use cases. In evaluations, it has demonstrated performance competitive with leading closed-source models, including GPT-4o and Claude 3.5 Sonnet. To read more about the model release, [click here](https://ai.meta.com/blog/meta-llama-3-1/). Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/).

Key Specifications
- Cost: $$$$
- Context: 32K
- Parameters: 405B
- Released: Jul 22, 2024
Supported Parameters

This model supports the following parameters:

Stop, Presence Penalty, Tool Choice, Top P, Temperature, Seed, Min P, Tools, Response Format, Frequency Penalty, Max Tokens
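For concreteness, here is a minimal sketch of passing these parameters through an OpenAI-compatible chat completions client. The gateway base URL, the environment-variable name, and routing Min P through `extra_body` are assumptions for illustration, not details from this listing.

```python
# Minimal sketch: exercising the supported sampling parameters against an
# OpenAI-compatible endpoint. base_url and the env var name are assumptions.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumed OpenAI-compatible gateway
    api_key=os.environ["OPENROUTER_API_KEY"],  # hypothetical env var name
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.1-405b-instruct",
    messages=[{"role": "user",
               "content": "Summarize the Llama 3.1 release in two sentences."}],
    temperature=0.7,           # Temperature
    top_p=0.9,                 # Top P
    frequency_penalty=0.1,     # Frequency Penalty
    presence_penalty=0.1,      # Presence Penalty
    seed=42,                   # Seed (best-effort determinism)
    stop=["\n\n"],             # Stop sequences
    max_tokens=256,            # Max Tokens
    extra_body={"min_p": 0.05},  # Min P is non-standard; assumed pass-through
)
print(response.choices[0].message.content)
```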
Features

This model supports the following features:

Tools, Response Format
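These two features map onto the `tools` and `response_format` fields of an OpenAI-style request. The sketch below shows one hypothetical tool definition and a JSON-mode request; the `get_weather` function and its schema are illustrative, not part of this listing.

```python
# Sketch of both supported features against an OpenAI-compatible endpoint;
# base_url and key handling are the same assumptions as in the sketch above.
import os

from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1",
                api_key=os.environ["OPENROUTER_API_KEY"])

# Tools: declare a function schema; the model may answer with tool_calls.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="meta-llama/llama-3.1-405b-instruct",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",  # Tool Choice: let the model decide whether to call
)
print(resp.choices[0].message.tool_calls)

# Response Format: request a JSON object instead of free-form prose.
resp = client.chat.completions.create(
    model="meta-llama/llama-3.1-405b-instruct",
    messages=[{"role": "user",
               "content": "List three Llama 3.1 sizes as a JSON object."}],
    response_format={"type": "json_object"},
)
print(resp.choices[0].message.content)
```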
Performance Summary

Meta's Llama 3.1 405B Instruct, released on July 22, 2024, demonstrates strong overall performance, particularly in speed and reliability. It consistently ranks among the fastest models across seven benchmarks, indicating top-tier processing efficiency. Pricing is competitive, falling in the 45th percentile across six benchmarks and offering a good balance of cost-effectiveness. Reliability is a significant strength: a 96% success rate across seven benchmarks signifies minimal technical failures and consistent response delivery.

Across benchmark categories, Llama 3.1 405B Instruct shows notable strengths in Classification and Ethics, achieving perfect 100% accuracy on both the Email Classification and Ethics (Baseline) benchmarks; these results earned it "Most accurate model at this price point" and "Most accurate among models this fast" accolades in those categories. Instruction Following (Baseline) accuracy varied significantly (0% in one instance, 60% in another), suggesting inconsistency or sensitivity to specific test cases. Reasoning (66% accuracy) and General Knowledge (89.5% accuracy) performance are solid, while Coding (Baseline) is a relative weakness at 69% accuracy, placing it in the 34th percentile. Overall, the model is well suited to high-quality dialogue use cases, leveraging its strong reliability and impressive performance in classification and ethical reasoning.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.80 |
| Completion | $0.80 |
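At a flat $0.80 per million tokens for both prompt and completion, per-request cost is simple arithmetic. A quick sketch, with made-up token counts:

```python
# Back-of-the-envelope cost at the listed $0.80 / 1M tokens for both
# prompt and completion. Token counts are illustrative examples.
PROMPT_PRICE = 0.80 / 1_000_000      # dollars per prompt token
COMPLETION_PRICE = 0.80 / 1_000_000  # dollars per completion token

prompt_tokens, completion_tokens = 2_000, 500  # example request
cost = prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE
print(f"${cost:.6f}")  # -> $0.002000
```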


Available Endpoints

| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| DeepInfra | meta-llama/llama-3.1-405b-instruct | 32K | $0.80 / 1M tokens | $0.80 / 1M tokens |
| Lambda | meta-llama/llama-3.1-405b-instruct | 131K | $0.80 / 1M tokens | $0.80 / 1M tokens |
| Nebius | meta-llama/llama-3.1-405b-instruct | 131K | $1.00 / 1M tokens | $3.00 / 1M tokens |
| Fireworks | meta-llama/llama-3.1-405b-instruct | 131K | $3.00 / 1M tokens | $3.00 / 1M tokens |
| Together | meta-llama/llama-3.1-405b-instruct | 130K | $3.50 / 1M tokens | $3.50 / 1M tokens |
| Hyperbolic | meta-llama/llama-3.1-405b-instruct | 131K | $4.00 / 1M tokens | $4.00 / 1M tokens |
| SambaNova | meta-llama/llama-3.1-405b-instruct | 16K | $0.80 / 1M tokens | $0.80 / 1M tokens |
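Because input and output prices differ at some providers (Nebius, for example, charges $1.00 in and $3.00 out), the cheapest endpoint depends on your prompt-to-completion ratio. A small sketch comparing the listed prices under an assumed 80/20 input/output mix:

```python
# Compare the listed per-1M-token endpoint prices under an assumed workload
# of 80% prompt tokens and 20% completion tokens. Prices are copied from the
# table above (dollars per 1M tokens); the mix is an assumption.
ENDPOINTS = {
    "DeepInfra":  (0.80, 0.80),
    "Lambda":     (0.80, 0.80),
    "Nebius":     (1.00, 3.00),
    "Fireworks":  (3.00, 3.00),
    "Together":   (3.50, 3.50),
    "Hyperbolic": (4.00, 4.00),
    "SambaNova":  (0.80, 0.80),
}

def blended_price(input_price: float, output_price: float,
                  input_share: float = 0.8) -> float:
    """Effective price per 1M tokens for a given input/output mix."""
    return input_share * input_price + (1 - input_share) * output_price

for name, (p_in, p_out) in sorted(ENDPOINTS.items(),
                                  key=lambda kv: blended_price(*kv[1])):
    print(f"{name:10s} ${blended_price(p_in, p_out):.2f} / 1M tokens")
```

Note that price alone should not drive routing: the cheapest endpoints here also include the shortest context windows (DeepInfra at 32K and SambaNova at 16K, versus 130-131K elsewhere).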
Benchmark Results
| Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|