OpenAI: GPT-4o Audio

Audio input Text input Text output
Author's Description

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input audio tokens.

Key Specifications
Cost
$$$$$
Context
128K
Parameters
200B (Rumoured)
Released
Aug 14, 2025
Supported Parameters

This model supports the following parameters:

Top Logprobs Logprobs Logit Bias Stop Seed Top P Response Format Structured Outputs Frequency Penalty Tool Choice Max Tokens Tools Presence Penalty Temperature
Features

This model supports the following features:

Response Format Tools Structured Outputs
Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $2.5
Completion $10

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-4o-audio-preview 128K $2.5 / 1M tokens $10 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by openai