OpenAI: GPT Audio

Text input Audio input Text output Audio output
Author's Description

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced at $32 per million input tokens and $64 per million output tokens.

Key Specifications
Cost
$$$$$
Context
128K
Released
Jan 19, 2026
Supported Parameters

This model supports the following parameters:

Response Format Presence Penalty Logprobs Top P Frequency Penalty Max Tokens Top Logprobs Structured Outputs Logit Bias Seed Stop Temperature
Features

This model supports the following features:

Structured Outputs Response Format
Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $2.5
Completion $10

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-audio 128K $2.5 / 1M tokens $10 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai