Meta: Llama Guard 4 12B

Text input Image input Text output
Author's Description

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM—generating text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated. Llama Guard 4 was aligned to safeguard against the standardized MLCommons hazards taxonomy and designed to support multimodal Llama 4 capabilities. Specifically, it combines features from previous Llama Guard models, providing content moderation for English and multiple supported languages, along with enhanced capabilities to handle mixed text-and-image prompts, including multiple images. Additionally, Llama Guard 4 is integrated into the Llama Moderations API, extending robust safety classification to text and images.

Key Specifications
Cost
$$
Context
163K
Parameters
12B
Released
Apr 29, 2025
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Response Format Stop Seed Min P Top P Max Tokens Frequency Penalty Temperature Presence Penalty
Features

This model supports the following features:

Response Format
Performance Summary

Meta's Llama Guard 4 12B, a multimodal model derived from Llama 4 Scout and fine-tuned for content safety classification, demonstrates exceptional performance in operational efficiency. It consistently ranks among the fastest models, achieving an Infinityth percentile across five benchmarks, indicating unparalleled speed. Similarly, its pricing is highly competitive, also securing an Infinityth percentile across five benchmarks. However, the model's performance on the provided benchmark results for General Knowledge, Ethics, Email Classification, Instruction Following, and Coding is uniformly 0.0% accuracy. This suggests that while Llama Guard 4 excels in speed and cost-effectiveness, its current iteration, as evaluated by these specific benchmarks, does not demonstrate proficiency in general cognitive tasks, ethical reasoning, classification, instruction adherence, or coding. Its core strength lies in its intended purpose as a content safety classifier, leveraging multimodal capabilities for English and multiple languages, including handling mixed text-and-image prompts. The benchmark results likely reflect its specialized design for safety classification rather than general-purpose AI tasks. Its integration into the Llama Moderations API further underscores its role as a dedicated safety tool.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.18
Completion $0.18

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
DeepInfra
DeepInfra | meta-llama/llama-guard-4-12b 163K $0.18 / 1M tokens $0.18 / 1M tokens
Together
Together | meta-llama/llama-guard-4-12b 1M $0.2 / 1M tokens $0.2 / 1M tokens
Groq
Groq | meta-llama/llama-guard-4-12b 131K $0.2 / 1M tokens $0.2 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by meta-llama