OpenAI: GPT-5.3-Codex

Image input Text input Text output
Author's Description

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2. It achieves state-of-the-art results on SWE-Bench Pro and strong performance on Terminal-Bench 2.0 and OSWorld-Verified, reflecting improved multi-language coding, terminal proficiency, and real-world computer-use skills. The model is optimized for long-running, tool-using workflows and supports interactive steering during execution, making it suitable for complex development tasks, debugging, deployment, and iterative product work. Beyond coding, GPT-5.3-Codex performs strongly on structured knowledge-work benchmarks such as GDPval, supporting tasks like document drafting, spreadsheet analysis, slide creation, and operational research across domains. It is trained with enhanced cybersecurity awareness, including vulnerability identification capabilities, and deployed with additional safeguards for high-risk use cases. Compared to prior Codex models, it is more token-efficient and approximately 25% faster, targeting professional end-to-end workflows that span reasoning, execution, and computer interaction.

Key Specifications
Cost
$$$$$
Context
400K
Released
Feb 24, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Response Format Structured Outputs Tool Choice Seed Max Tokens Reasoning Include Reasoning
Features

This model supports the following features:

Reasoning Structured Outputs Tools Response Format
Performance Summary

GPT-5.3-Codex, OpenAI's advanced agentic coding model, demonstrates strong overall performance, particularly in specialized domains. It typically performs among the fastest models, ranking in the 61st percentile for speed, though it tends to be positioned at premium pricing levels (15th percentile). The model exhibits exceptional reliability, achieving a 100% success rate across all benchmarks, indicating minimal technical failures. Its core strength lies in coding and knowledge-based tasks. It achieved a remarkable 96.0% accuracy in the Coding (Baseline) benchmark, placing it in the 97th percentile, and perfect 100.0% accuracy in General Knowledge, where it was noted as the most accurate model at its price point and among models of similar speed. The model also shows strong reasoning capabilities (96.0% accuracy, 86th percentile) and robust instruction following (88.0% accuracy, 95th percentile). A notable strength is its low hallucination rate, with 98.0% accuracy, suggesting a strong ability to acknowledge uncertainty. While its Email Classification accuracy (97.0%) is solid, its percentile ranking (41st) suggests some room for improvement compared to peers. Its Mathematics performance (93.0% accuracy, 71st percentile) is also commendable. Overall, GPT-5.3-Codex excels in complex, multi-faceted tasks requiring deep understanding and precise execution, aligning with its description as an agentic coding model optimized for long-running workflows and professional knowledge work.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $1.75
Completion $14
Input Cache Read $0.175
Web Search $10000

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-5.3-codex-20260224 400K $1.75 / 1M tokens $14 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai