Meta: Llama 3.2 90B Vision Instruct

Name: Meta: Llama 3.2 90B Vision Instruct
Brand: meta-llama
Availability: OutOfStock
Rating: 2.0 (7 reviews)

Back

Image input Text input Text output Unavailable

Author's Description

The Llama 90B Vision model is a top-tier, 90-billion-parameter multimodal model designed for the most challenging visual reasoning and language tasks. It offers unparalleled accuracy in image captioning, visual question answering, and advanced image-text comprehension. Pre-trained on vast multimodal datasets and fine-tuned with human feedback, the Llama 90B Vision is engineered to handle the most demanding image-based AI tasks. This model is perfect for industries requiring cutting-edge multimodal AI capabilities, particularly those dealing with complex, real-time visual and textual analysis. Click here for the [original model card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD_VISION.md). Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Key Specifications

Cost

$$$$

Context

131K

Parameters

90B

Released

Sep 24, 2024

Speed

★★★

Ability

★★

Reliability

★

Hugging Face

Supported Parameters

This model supports the following parameters:

Stop Max Tokens Logit Bias Top P Min P Frequency Penalty Presence Penalty Temperature

Performance Summary

Meta's Llama 3.2 90B Vision Instruct model demonstrates a strong overall performance profile, particularly excelling in reliability with an 84% success rate, indicating consistent and usable responses. In terms of speed, it generally performs in the top tier, ranking in the 63rd percentile, while offering competitive pricing at the 41st percentile. The model exhibits notable strengths in classification tasks, achieving 99.0% accuracy in Email Classification, placing it in the 80th percentile for that benchmark. It also performs well in Ethics (99.0% accuracy) and General Knowledge (97.5% accuracy), suggesting a robust understanding across diverse informational domains. However, its performance in Instruction Following (51.0% accuracy) and Reasoning (56.0% accuracy) is more moderate, indicating areas where further refinement could enhance its capabilities. A significant weakness is observed in the Coding benchmark, where it achieved only 6.0% accuracy, placing it in the 12th percentile. This suggests the model is not well-suited for programming-related tasks. Despite this, its high reliability and strong performance in visual and textual comprehension tasks make it a valuable asset for industries requiring advanced multimodal AI.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.35
Completion	$0.4

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
Together	Together \| meta-llama/llama-3.2-90b-vision-instruct	131K	$0.35 / 1M tokens	$0.4 / 1M tokens
DeepInfra	DeepInfra \| meta-llama/llama-3.2-90b-vision-instruct	32K	$0.35 / 1M tokens	$0.4 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by meta-llama

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
Meta: Llama Guard 4 12B	Apr 29, 2025	12B	163K	Image input Text input Text output	—	★	$$
Meta: Llama 4 Maverick	Apr 05, 2025	17B	1M	Image input Text input Text output	★★★★★	★★★	$$
Meta: Llama 4 Scout	Apr 05, 2025	17B	327K	Image input Text input Text output	★★★★	★★	$$
Llama Guard 3 8B Unavailable	Feb 12, 2025	8B	131K	Text input Text output	★★	★	$
Meta: Llama 3.3 70B Instruct	Dec 06, 2024	70B	131K	Text input Text output	★★★★	★★★★	$
Meta: Llama 3.2 1B Instruct	Sep 24, 2024	1B	131K	Text input Text output	★★	★	$
Meta: Llama 3.2 3B Instruct	Sep 24, 2024	3B	131K	Text input Text output	★★★★	★	$
Meta: Llama 3.2 11B Vision Instruct	Sep 24, 2024	11B	128K	Image input Text input Text output	★★★	★	$$
Meta: Llama 3.1 405B (base) Unavailable	Aug 01, 2024	405B	32K	Text input Text output	★	★	$$$
Meta: Llama 3.1 70B Instruct	Jul 22, 2024	70B	131K	Text input Text output	★★★★	★★	$$
Meta: Llama 3.1 405B Instruct Unavailable	Jul 22, 2024	405B	32K	Text input Text output	★★★★	★★	$$$
Meta: Llama 3.1 8B Instruct	Jul 22, 2024	8B	131K	Text input Text output	★★★	★★	$
Meta: LlamaGuard 2 8B Unavailable	May 12, 2024	8B	8K	Text input Text output	★★★★	★	$$
Meta: Llama 3 8B Instruct	Apr 17, 2024	8B	8K	Text input Text output	★★★★	★★	$
Meta: Llama 3 70B Instruct Unavailable	Apr 17, 2024	70B	8K	Text input Text output	★★★★	★★	$$
Meta: Llama 2 70B Chat Unavailable	Jun 19, 2023	70B	4K	Text input Text output	—	—	$$$$