Meta: Llama Guard 4 12B

Text input Image input Text output
Author's Description

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM—generating text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated. Llama Guard 4 was aligned to safeguard against the standardized MLCommons hazards taxonomy and designed to support multimodal Llama 4 capabilities. Specifically, it combines features from previous Llama Guard models, providing content moderation for English and multiple supported languages, along with enhanced capabilities to handle mixed text-and-image prompts, including multiple images. Additionally, Llama Guard 4 is integrated into the Llama Moderations API, extending robust safety classification to text and images.

Key Specifications
Cost
$
Context
163K
Parameters
12B
Released
Apr 29, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Stop Presence Penalty Top P Temperature Seed Min P Response Format Frequency Penalty Max Tokens
Features

This model supports the following features:

Response Format
Performance Summary

Meta's Llama Guard 4 12B, a Llama 4 Scout-derived model fine-tuned for content safety classification, demonstrates exceptional speed and cost efficiency. It consistently ranks among the fastest models and offers highly competitive pricing, placing it in the Infinityth percentile across six benchmarks for both speed and price. This model is specifically designed for content moderation, classifying LLM inputs and responses against the MLCommons hazards taxonomy, and supports multimodal capabilities including text and multiple images across English and other languages. While its core function is safety classification, the provided benchmark results, which appear to test general-purpose LLM capabilities, show a mixed performance. Llama Guard 4 achieved a notable 72.0% accuracy in Reasoning, indicating a strong ability in complex problem-solving, and did so with excellent cost and duration efficiency within that category. However, it scored 0.0% accuracy across Coding, Instruction Following, Email Classification, Ethics, and General Knowledge benchmarks. This suggests that while highly optimized for its specialized safety classification task, it is not intended or optimized for general-purpose generative or analytical tasks typically associated with large language models. Its integration into the Llama Moderations API further solidifies its role as a robust safety tool.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.18
Completion $0.18

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
DeepInfra
DeepInfra | meta-llama/llama-guard-4-12b 163K $0.18 / 1M tokens $0.18 / 1M tokens
Together
Together | meta-llama/llama-guard-4-12b 1M $0.2 / 1M tokens $0.2 / 1M tokens
Groq
Groq | meta-llama/llama-guard-4-12b 131K $0.2 / 1M tokens $0.2 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by meta-llama