Meta: Llama 4 Scout

Input: text, image. Output: text.
Author's Description

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual text and code output across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout uses a 16-expert MoE architecture and offers a context length of up to 10 million tokens, with a training corpus of roughly 40 trillion tokens. Built for high efficiency and local or commercial deployment, Llama 4 Scout uses early fusion to integrate modalities and is instruction-tuned for multilingual chat, captioning, and image-understanding tasks. It is released under the Llama 4 Community License, has a knowledge cutoff of August 2024, and launched publicly on April 5, 2025.
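As a rough illustration of the multimodal input described above, the sketch below sends a text-plus-image prompt through an OpenAI-compatible chat-completions client. The base URL, API key, and image URL are placeholders (this page does not prescribe a specific API); adjust them for whichever provider you use.

```python
# Minimal sketch of a text + image request, assuming an OpenAI-compatible
# chat-completions endpoint. base_url, api_key, and the image URL are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-provider.example/v1",  # placeholder: provider-specific URL
    api_key="YOUR_API_KEY",                        # placeholder
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is shown in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```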

Key Specifications
Cost: $$
Context: 327K
Parameters: 17B active (109B total)
Released: Apr 05, 2025
Supported Parameters

This model supports the following parameters:

Tool Choice, Response Format, Seed, Top P, Temperature, Top Logprobs, Tools, Logit Bias, Logprobs, Stop, Min P, Max Tokens, Frequency Penalty, Presence Penalty
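The sketch below shows how most of these parameters map onto an OpenAI-style request (client configured as in the earlier example). This is an assumption about a typical serving API, not a documented interface for this listing; exact support and the passthrough mechanism for non-standard fields such as Min P vary by provider.

```python
# Sketch exercising several of the listed sampling/control parameters on an
# OpenAI-compatible endpoint. "min_p" is passed via extra_body because it is
# not part of the standard client signature (provider-specific passthrough).
response = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    messages=[{"role": "user", "content": "Summarize MoE routing in two sentences."}],
    temperature=0.7,             # Temperature
    top_p=0.9,                   # Top P
    max_tokens=256,              # Max Tokens
    seed=42,                     # Seed
    stop=["\n\n"],               # Stop
    frequency_penalty=0.1,       # Frequency Penalty
    presence_penalty=0.0,        # Presence Penalty
    logprobs=True,               # Logprobs
    top_logprobs=5,              # Top Logprobs
    logit_bias={},               # Logit Bias (token-id -> bias value)
    extra_body={"min_p": 0.05},  # Min P (not in the standard OpenAI schema)
)
print(response.choices[0].message.content)
```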
Features

This model supports the following features:

Tools, Response Format
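A minimal sketch of both features follows: tool calling plus a JSON response format, again assuming an OpenAI-compatible endpoint. The get_weather tool is purely illustrative and not part of the model or any provider API.

```python
# Sketch of the two listed features: Tools (function calling) and Response Format.
# The get_weather tool definition is hypothetical, for illustration only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    messages=[{"role": "user", "content": "What's the weather in Lisbon? Reply in JSON."}],
    tools=tools,                              # Tools
    tool_choice="auto",                       # Tool Choice
    response_format={"type": "json_object"},  # Response Format
)
print(response.choices[0].message)
```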
Performance Summary

Meta's Llama 4 Scout 17B Instruct (16E) demonstrates a balanced performance profile, with particular strengths in efficiency and reliability. It is faster than average, ranking in the 60th percentile for speed, and offers highly competitive pricing, placing in the 76th percentile for cost-effectiveness. The model is also reliable, with a 93% success rate indicating consistent, usable responses.

In terms of specific capabilities, Llama 4 Scout is strongest in Email Classification (99% accuracy), General Knowledge (97%), and Ethics (98%). Its multimodal, multilingual design, accepting text and image input and producing text and code output, positions it well for diverse applications.

The model struggles with tasks requiring deep mathematical understanding (39% accuracy) and complex instruction following (38.7%). Hallucination is another notable weakness: it acknowledges uncertainty correctly only 68% of the time. Despite these gaps, its solid coding (79.5%) and reasoning (58%) scores, combined with its efficiency and reliability, make it a robust option for assistant-style interaction and visual reasoning.

Model Pricing

Current Pricing

Feature      Price (per 1M tokens)
Prompt       $0.08
Completion   $0.30
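At these base rates, per-request cost is a simple linear function of token counts. The helper below is a back-of-the-envelope sketch using the listed prompt and completion prices; it is not an official billing calculator.

```python
# Rough cost estimate at the listed base prices:
# $0.08 per 1M prompt tokens, $0.30 per 1M completion tokens.
PROMPT_PRICE_PER_M = 0.08
COMPLETION_PRICE_PER_M = 0.30

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of a single request at the listed base rates."""
    return (prompt_tokens * PROMPT_PRICE_PER_M
            + completion_tokens * COMPLETION_PRICE_PER_M) / 1_000_000

# Example: a 20k-token prompt with a 1k-token answer.
print(f"${request_cost(20_000, 1_000):.6f}")  # -> $0.001900
```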

Price History

Available Endpoints
Provider     Endpoint Name                                Context Length   Pricing (Input)     Pricing (Output)
Lambda       meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.08 / 1M tokens   $0.30 / 1M tokens
DeepInfra    meta-llama/llama-4-scout-17b-16e-instruct    327K             $0.08 / 1M tokens   $0.30 / 1M tokens
Kluster      meta-llama/llama-4-scout-17b-16e-instruct    131K             $0.08 / 1M tokens   $0.30 / 1M tokens
GMICloud     meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.08 / 1M tokens   $0.50 / 1M tokens
Parasail     meta-llama/llama-4-scout-17b-16e-instruct    158K             $0.08 / 1M tokens   $0.30 / 1M tokens
Cent-ML      meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.08 / 1M tokens   $0.30 / 1M tokens
Novita       meta-llama/llama-4-scout-17b-16e-instruct    131K             $0.10 / 1M tokens   $0.50 / 1M tokens
Groq         meta-llama/llama-4-scout-17b-16e-instruct    131K             $0.11 / 1M tokens   $0.34 / 1M tokens
BaseTen      meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.08 / 1M tokens   $0.30 / 1M tokens
Fireworks    meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.15 / 1M tokens   $0.60 / 1M tokens
Together     meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.18 / 1M tokens   $0.59 / 1M tokens
Google       meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.25 / 1M tokens   $0.70 / 1M tokens
SambaNova    meta-llama/llama-4-scout-17b-16e-instruct    8K               $0.08 / 1M tokens   $0.30 / 1M tokens
Cerebras     meta-llama/llama-4-scout-17b-16e-instruct    32K              $0.65 / 1M tokens   $0.85 / 1M tokens
BaseTen      meta-llama/llama-4-scout-17b-16e-instruct    1M               $0.13 / 1M tokens   $0.50 / 1M tokens
Friendli     meta-llama/llama-4-scout-17b-16e-instruct    131K             $0.10 / 1M tokens   $0.60 / 1M tokens
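Because every endpoint serves the same model, the practical differentiators are context length and price. The sketch below compares the per-request cost of a hypothetical workload across a few of the listed providers, using the prices from the table above; the token counts are arbitrary examples.

```python
# Compare per-request cost across a subset of the listed endpoints.
# Prices are (input, output) in USD per 1M tokens, taken from the table above.
providers = {
    "DeepInfra": (0.08, 0.30),
    "Groq":      (0.11, 0.34),
    "Fireworks": (0.15, 0.60),
    "Cerebras":  (0.65, 0.85),
}

prompt_tokens, completion_tokens = 50_000, 2_000  # hypothetical workload

for name, (p_in, p_out) in sorted(providers.items(), key=lambda kv: kv[1][0]):
    cost = (prompt_tokens * p_in + completion_tokens * p_out) / 1_000_000
    print(f"{name:10s} ${cost:.4f} per request")
```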
Benchmark Results