OpenAI: gpt-oss-20b

Text input Text output Free Option
Author's Description

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.

Key Specifications
Cost
$$
Context
131K
Parameters
20B
Released
Aug 05, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Include Reasoning Tool Choice Stop Logit Bias Structured Outputs Response Format Presence Penalty Reasoning Temperature Max Tokens Tools Top P Frequency Penalty
Features

This model supports the following features:

Tools Reasoning Response Format Structured Outputs
Performance Summary

OpenAI's gpt-oss-20b demonstrates exceptional speed, consistently ranking among the fastest models, and offers highly competitive pricing. This open-weight, 21B parameter model, utilizing a Mixture-of-Experts (MoE) architecture, is optimized for lower-latency inference and deployability on consumer hardware. The model exhibits strong performance in several key areas. It achieved perfect accuracy in two instances of Keyword Topic Relevance Classification, showcasing excellent semantic understanding and conceptual relationship recognition, with one instance also being the fastest recorded. Its General Knowledge and Coding capabilities are impressive, scoring 99.0% (70th percentile) and 92.0% (81th percentile) respectively. The model also performs well in Ethics (99.0% accuracy, 62nd percentile), Instruction Following (66.0% accuracy, 74th percentile), and Reasoning (89.8% accuracy, 80th percentile). However, gpt-oss-20b shows significant weaknesses in certain Keyword Topic Relevance Classification tasks, with multiple 0.0% accuracy scores, indicating inconsistency or specific challenges in those particular test sets. Its Mathematics performance is moderate at 66.7% accuracy (31st percentile) and notably slow, ranking in the 2nd percentile for duration. Hallucinations are also a concern, with 70.0% accuracy (17th percentile), suggesting it sometimes fails to acknowledge uncertainty.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.07
Completion $0.3

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Fireworks
Fireworks | openai/gpt-oss-20b 131K $0.07 / 1M tokens $0.3 / 1M tokens
Groq
Groq | openai/gpt-oss-20b 131K $0.075 / 1M tokens $0.3 / 1M tokens
Novita
Novita | openai/gpt-oss-20b 131K $0.03 / 1M tokens $0.14 / 1M tokens
Nebius
Nebius | openai/gpt-oss-20b 131K $0.05 / 1M tokens $0.2 / 1M tokens
DeepInfra
DeepInfra | openai/gpt-oss-20b 131K $0.03 / 1M tokens $0.14 / 1M tokens
NCompass
NCompass | openai/gpt-oss-20b 131K $0.04 / 1M tokens $0.15 / 1M tokens
Phala
Phala | openai/gpt-oss-20b 131K $0.04 / 1M tokens $0.15 / 1M tokens
Together
Together | openai/gpt-oss-20b 131K $0.05 / 1M tokens $0.2 / 1M tokens
WandB
WandB | openai/gpt-oss-20b 131K $0.05 / 1M tokens $0.2 / 1M tokens
Hyperbolic
Hyperbolic | openai/gpt-oss-20b 131K $0.04 / 1M tokens $0.04 / 1M tokens
NextBit
NextBit | openai/gpt-oss-20b 131K $0.1 / 1M tokens $0.45 / 1M tokens
InferenceNet
InferenceNet | openai/gpt-oss-20b 100K $0.03 / 1M tokens $0.14 / 1M tokens
Google
Google | openai/gpt-oss-20b 131K $0.07 / 1M tokens $0.25 / 1M tokens
SiliconFlow
SiliconFlow | openai/gpt-oss-20b 131K $0.04 / 1M tokens $0.18 / 1M tokens
Novita
Novita | openai/gpt-oss-20b 131K $0.04 / 1M tokens $0.15 / 1M tokens
Parasail
Parasail | openai/gpt-oss-20b 131K $0.04 / 1M tokens $0.2 / 1M tokens
Amazon Bedrock
Amazon Bedrock | openai/gpt-oss-20b 131K $0.07 / 1M tokens $0.15 / 1M tokens
Clarifai
Clarifai | openai/gpt-oss-20b 131K $0.045 / 1M tokens $0.18 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai