OpenAI: GPT-4.1

File input Text input Image input Text output
Author's Description

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Key Specifications
Cost
$$$$$
Context
1M
Parameters
1.8T (Rumoured)
Released
Apr 14, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tool Choice Top P Logit Bias Temperature Logprobs Presence Penalty Stop Response Format Structured Outputs Tools Max Tokens Frequency Penalty Top Logprobs Seed
Features

This model supports the following features:

Response Format Tools Structured Outputs
Performance Summary

OpenAI's GPT-4.1, created on April 14, 2025, is a flagship large language model designed for advanced instruction following, real-world software engineering, and long-context reasoning, supporting a 1 million token context window. It performs among the fastest models, typically ranking in the top tier for speed (75th percentile across 7 benchmarks). The model offers moderate pricing, positioned at the 26th percentile across 7 benchmarks. Notably, GPT-4.1 demonstrates exceptional reliability with a perfect 100% success rate across all evaluated benchmarks, indicating minimal technical failures. GPT-4.1 exhibits strong performance across various categories. It achieves perfect accuracy in Ethics (100.0%), making it the most accurate model at its price point and among models of similar speed. It also excels in General Knowledge (99.5% accuracy) and Coding (91.0% accuracy), outperforming GPT-4o and GPT-4.5 in the latter with 54.6% SWE-bench Verified. Instruction Following (76.0% accuracy) and Reasoning (82.0% accuracy) are also significant strengths. While its Hallucinations Baseline accuracy is 88.0%, this places it in the 33rd percentile, suggesting room for improvement in acknowledging uncertainty. Its high recall in large document contexts and precise code diffs make it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $2
Completion $8
Input Cache Read $0.5
Web Search $10000

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
OpenAI
OpenAI | openai/gpt-4.1-2025-04-14 1M $2 / 1M tokens $8 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by openai