OpenAI: gpt-oss-20b

Modalities: text input, text output. A free option is available.
Author's Description

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.
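For orientation, below is a minimal sketch of querying the model through an OpenAI-compatible endpoint with the openai Python SDK. The base URL, the API-key environment variable, and the reasoning_effort argument (used here to stand in for the reasoning level configuration mentioned above) are assumptions that may differ by provider.

```python
# Minimal sketch: querying gpt-oss-20b through an OpenAI-compatible endpoint.
# The base_url, environment variable name, and reasoning_effort support are
# assumptions -- check your provider's documentation before relying on them.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumed provider endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed env var name
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "Summarize the MoE architecture in two sentences."}],
    reasoning_effort="low",  # assumed mapping to the model's reasoning level configuration
)
print(response.choices[0].message.content)
```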

Key Specifications
Cost: $$
Context: 131K
Parameters: 20B
Released: Aug 05, 2025
Supported Parameters

This model supports the following parameters:

Tools, Include Reasoning, Top Logprobs, Tool Choice, Max Tokens, Reasoning, Top P, Stop, Logprobs, Frequency Penalty, Temperature, Response Format, Structured Outputs, Presence Penalty, Logit Bias
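Most of these map directly onto standard OpenAI-style request fields. The sketch below is a hedged illustration of passing several of them through the openai Python SDK; the base URL and API-key variable are assumptions, and individual providers may ignore parameters even when the model advertises them.

```python
# Hedged sketch of several of the supported sampling parameters listed above.
# Endpoint and key are assumptions; the logit_bias token id is illustrative.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumed
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "List three uses of MoE models."}],
    max_tokens=256,
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.0,
    stop=["\n\n"],
    logprobs=True,
    top_logprobs=3,
    logit_bias={"1234": -10},  # hypothetical token id, for illustration only
)
print(response.choices[0].message.content)
```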
Features

This model supports the following features:

Structured Outputs, Reasoning, Tools, Response Format
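As an illustration of the tool-calling feature, the following sketch registers a single hypothetical function and lets the model decide whether to call it. The endpoint, key variable, and the get_weather tool are assumptions rather than part of the model's published interface; structured outputs would be requested analogously via the response_format field.

```python
# Hedged sketch of tool calling with OpenAI-style request fields.
# The get_weather function schema is hypothetical, for illustration only.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumed
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model chose to call the tool, the arguments arrive as a JSON string.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```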
Performance Summary

OpenAI's gpt-oss-20b, an open-weight 21B-parameter MoE model, consistently ranks among the fastest models and offers highly competitive pricing. Its architecture, with 3.6B active parameters per forward pass, is optimized for lower-latency inference and deployability on consumer or single-GPU hardware.

The model performs strongly in several key areas. It achieves excellent accuracy in Ethics (99.0%), General Knowledge (99.0%), and Coding (92.0%), placing it in the 62nd, 71st, and 82nd percentiles respectively. Its Reasoning score is also notable at 89.8% accuracy (80th percentile).

However, the model exhibits significant variability in Keyword Topic Relevance Classification, ranging from 0.0% to 100.0% accuracy across runs of that benchmark, which suggests inconsistency or sensitivity to specific test conditions in this category. A notable weakness is Mathematics, where it scores 66.7% accuracy (32nd percentile) with a very long duration of over 9 million milliseconds (roughly 2.5 hours), indicating inefficiency on complex mathematical problems. Hallucination control is moderate at 70.0% accuracy (17th percentile), suggesting room for improvement in acknowledging uncertainty.

The model's support for reasoning level configuration, fine-tuning, and agentic capabilities such as function calling and tool use further enhances its versatility.

Model Pricing

Current Pricing

Feature    | Price (per 1M tokens)
Prompt     | $0.07
Completion | $0.30
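At the base rates above, per-request cost is simple arithmetic. The following sketch converts the per-1M-token prices into a per-request estimate; the token counts in the example are illustrative.

```python
# Quick cost estimate at the listed base rates ($0.07 / 1M prompt tokens,
# $0.30 / 1M completion tokens). Token counts below are illustrative.
PROMPT_RATE = 0.07 / 1_000_000      # USD per prompt token
COMPLETION_RATE = 0.30 / 1_000_000  # USD per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.6f}")  # -> $0.000290
```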

Available Endpoints
Provider     | Endpoint Name      | Context Length | Pricing (Input, per 1M tokens) | Pricing (Output, per 1M tokens)
Fireworks    | openai/gpt-oss-20b | 131K           | $0.07                          | $0.30
Groq         | openai/gpt-oss-20b | 131K           | $0.10                          | $0.50
Novita       | openai/gpt-oss-20b | 131K           | $0.05                          | $0.20
Nebius       | openai/gpt-oss-20b | 131K           | $0.05                          | $0.20
DeepInfra    | openai/gpt-oss-20b | 131K           | $0.04                          | $0.15
NCompass     | openai/gpt-oss-20b | 131K           | $0.04                          | $0.15
Phala        | openai/gpt-oss-20b | 131K           | $0.03                          | $0.15
Together     | openai/gpt-oss-20b | 131K           | $0.05                          | $0.20
WandB        | openai/gpt-oss-20b | 131K           | $0.05                          | $0.20
Hyperbolic   | openai/gpt-oss-20b | 131K           | $0.10                          | $0.10
NextBit      | openai/gpt-oss-20b | 131K           | $0.05                          | $0.20
InferenceNet | openai/gpt-oss-20b | 131K           | $0.03                          | $0.15
Google       | openai/gpt-oss-20b | 131K           | $0.075                         | $0.30
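Since every endpoint exposes the same 131K context, price is the main differentiator. The sketch below ranks the listed endpoints by a blended per-1M-token price under an assumed 3:1 prompt-to-completion token ratio; both the ratio and the hard-coded price table are illustrative and should be re-checked against current listings.

```python
# Rank the listed endpoints by blended price, assuming (for illustration) a
# 3:1 prompt-to-completion token ratio. Prices are USD per 1M tokens.
ENDPOINTS = {
    "Fireworks": (0.07, 0.30), "Groq": (0.10, 0.50), "Novita": (0.05, 0.20),
    "Nebius": (0.05, 0.20), "DeepInfra": (0.04, 0.15), "NCompass": (0.04, 0.15),
    "Phala": (0.03, 0.15), "Together": (0.05, 0.20), "WandB": (0.05, 0.20),
    "Hyperbolic": (0.10, 0.10), "NextBit": (0.05, 0.20),
    "InferenceNet": (0.03, 0.15), "Google": (0.075, 0.30),
}

def blended(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Weighted average price per 1M tokens for a given prompt:completion ratio."""
    return (ratio * input_price + output_price) / (ratio + 1)

for name, (inp, out) in sorted(ENDPOINTS.items(), key=lambda kv: blended(*kv[1])):
    print(f"{name:<14} ${blended(inp, out):.4f} / 1M tokens (blended)")
```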
Benchmark Results
Benchmark | Category | Reasoning Strategy | Free | Executions | Accuracy | Cost | Duration