OpenAI: GPT-4.1

Image input File input Text input Text output
Author's Description

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Key Specifications
Cost
$$$$$
Context
1M
Parameters
1.8T (Rumoured)
Released
Apr 14, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Logit Bias Tool Choice Seed Top P Top Logprobs Temperature Response Format Logprobs Max Tokens Presence Penalty Structured Outputs Tools Frequency Penalty Stop
Features

This model supports the following features:

Structured Outputs Response Format Tools
Performance Summary

GPT-4.1, released by OpenAI on April 14, 2025, is a flagship large language model designed for advanced instruction following, real-world software engineering, and long-context reasoning. It performs among the fastest models, typically ranking in the top tier for speed (74th percentile). While not the cheapest, it offers moderate pricing (27th percentile). A standout feature is its exceptional reliability, demonstrating minimal technical failures and achieving a perfect 100th percentile. The model excels across various benchmarks. In coding, it achieved 91.0% accuracy, making it the most accurate among models of comparable speed. Its instruction following capabilities are strong, with 76.0% accuracy, placing it in the 92nd percentile. GPT-4.1 also demonstrated perfect accuracy (100.0%) in the Ethics benchmark, notably being the most accurate model at its price point and among models of similar speed. General knowledge is another strength, with 99.5% accuracy. While its Email Classification accuracy (98.0%) is solid, it falls within the 60th percentile for that specific benchmark. Its 1 million token context window and high recall in large document contexts make it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $2
Completion $8
Input Cache Read $0.5

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-4.1-2025-04-14 1M $2 / 1M tokens $8 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by openai