OpenAI: GPT Audio

Audio input Text input Audio output Text output
Author's Description

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced at $32 per million input tokens and $64 per million output tokens.

Key Specifications
Context
128K
Released
Jan 19, 2026
Supported Parameters

This model supports the following parameters:

Logprobs Seed Temperature Stop Max Tokens Structured Outputs Presence Penalty Top Logprobs Response Format Frequency Penalty Logit Bias Top P
Features

This model supports the following features:

Response Format Structured Outputs
Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $2.5
Completion $10

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-audio 128K $2.5 / 1M tokens $10 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai