OpenAI: GPT-4.1

Image input File input Text input Text output
Author's Description

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Key Specifications
Cost
$$$$$
Context
1M
Parameters
1.8T (Rumoured)
Released
Apr 14, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Structured Outputs Response Format Seed Max Tokens Tool Choice Tools
Features

This model supports the following features:

Response Format Tools Structured Outputs
Performance Summary

OpenAI's GPT-4.1, created April 14, 2025, is a flagship large language model designed for advanced instruction following, real-world software engineering, and long-context reasoning, featuring a 1 million token context window. It performs among the fastest models, ranking in the 77th percentile for speed across 8 benchmarks, and offers moderate pricing, placing in the 26th percentile. Notably, GPT-4.1 demonstrates exceptional reliability with a 100% success rate across all benchmarks, indicating minimal technical failures. The model exhibits outstanding performance in Mathematics (96.0% accuracy, 99th percentile) and Ethics (100.0% accuracy), where it achieves perfect scores and is recognized as the most accurate model at its price point and speed. It also shows strong capabilities in General Knowledge (99.5% accuracy) and Coding (91.0% accuracy, 75th percentile). While its Hallucinations accuracy is 88.0% (32nd percentile), suggesting room for improvement in acknowledging uncertainty, its Instruction Following (76.0% accuracy, 88th percentile) and Reasoning (82.0% accuracy, 73rd percentile) capabilities are robust. Its key strengths lie in its high accuracy in complex problem-solving domains like mathematics and ethics, coupled with its exceptional reliability and long-context reasoning, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $2
Completion $8
Input Cache Read $0.5
Web Search $10000

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-4.1-2025-04-14 1M $2 / 1M tokens $8 / 1M tokens
Azure
Azure | openai/gpt-4.1-2025-04-14 1M $2 / 1M tokens $8 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai