OpenAI: gpt-oss-safeguard-20b

Text input Text output
Author's Description

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard [user guide](https://cookbook.openai.com/articles/gpt-oss-safeguard-guide).

Key Specifications
Cost
$$$
Context
131K
Parameters
20B
Released
Oct 29, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Include Reasoning Top P Tool Choice Seed Response Format Max Tokens Stop Reasoning Tools Temperature
Features

This model supports the following features:

Reasoning Tools Response Format
Performance Summary

The OpenAI gpt-oss-safeguard-20b model, a 21B-parameter Mixture-of-Experts (MoE) model designed for safety reasoning, demonstrates strong overall performance. It consistently ranks among the fastest models, placing in the 84th percentile across eight benchmarks, and offers competitive pricing, typically falling within the 63rd percentile. Notably, its reliability is exceptional, boasting a 99% success rate across all benchmarks, indicating minimal technical failures. In terms of specific performance, gpt-oss-safeguard-20b exhibits key strengths in several areas. It achieves high accuracy in Coding (95%, 96th percentile), Email Classification (99%, 88th percentile), and Reasoning (94%, 85th percentile), with the latter being particularly impressive as it is the most accurate among models of comparable speed. Instruction Following is also strong at 69% accuracy (80th percentile). While its General Knowledge is solid at 97% accuracy, its percentile ranking (50th) suggests a competitive landscape. A notable weakness appears in its handling of Hallucinations, where it achieved 86% accuracy (31st percentile), indicating room for improvement in acknowledging uncertainty. Ethics performance is respectable at 98% accuracy, though its percentile (43rd) is moderate. Mathematics accuracy is 89% (63rd percentile). Overall, the model excels in practical application-oriented tasks and complex reasoning, making it well-suited for its intended safety and content moderation roles.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.075
Completion $0.3
Input Cache Read $0.037

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Groq
Groq | openai/gpt-oss-safeguard-20b 131K $0.075 / 1M tokens $0.3 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai