OpenAI: GPT-5.1-Codex-Max

Image input Text input Text output
Author's Description

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained on agentic workflows spanning software engineering, mathematics, and research. GPT-5.1-Codex-Max delivers faster performance, improved reasoning, and higher token efficiency across the development lifecycle.

Key Specifications
Cost
$$$$$
Context
400K
Released
Dec 04, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Reasoning Top Logprobs Structured Outputs Seed Logit Bias Stop Max Tokens Response Format Logprobs Frequency Penalty Presence Penalty Tools Include Reasoning Tool Choice
Features

This model supports the following features:

Tools Response Format Structured Outputs Reasoning
Performance Summary

GPT-5.1-Codex-Max, OpenAI’s latest agentic coding model, demonstrates exceptional reliability with a 100% success rate across all benchmarks, indicating consistent operational stability. Its speed performance is moderate, ranking in the 25th percentile, while its pricing is positioned at premium levels, falling into the 6th percentile. The model exhibits strong performance in several key areas. It achieved perfect accuracy in General Knowledge and Ethics, with the latter also being the most accurate and fastest among models at its price point. Instruction Following is a significant strength, with 85% accuracy, placing it in the 95th percentile. Coding performance is also robust at 95% accuracy (93rd percentile), and Mathematics shows solid results at 93% accuracy (73rd percentile). Reasoning capabilities are fair at 80% accuracy. A notable weakness is its performance in Hallucinations, where it achieved only 2.0% accuracy, indicating a tendency to provide definitive answers to fictional concepts rather than acknowledging uncertainty. Email Classification also shows room for improvement at 97% accuracy, placing it in the 42nd percentile. Overall, GPT-5.1-Codex-Max excels in complex reasoning and knowledge-based tasks, particularly within its intended domain of software development, but users should be mindful of its hallucination tendencies.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $1.25
Completion $10
Input Cache Read $0.125

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-5.1-codex-max-20251204 400K $1.25 / 1M tokens $10 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai