Anthropic: Claude Opus 4

File input Text input Image input Text output
Author's Description

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in software engineering, achieving leading results on SWE-bench (72.5%) and Terminal-bench (43.2%). Opus 4 supports extended, agentic workflows, handling thousands of task steps continuously for hours without degradation. Read more at the [blog post here](https://www.anthropic.com/news/claude-4)

Key Specifications
Cost
+$$$$$
Context
200K
Released
May 22, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Include Reasoning Stop Tool Choice Temperature Tools Reasoning Max Tokens
Features

This model supports the following features:

Tools Reasoning
Performance Summary

Claude Opus 4, released on May 22, 2025, is positioned as a leading AI model, particularly in coding and complex agent workflows. While its speed performance is moderate, ranking in the 29th percentile, it demonstrates exceptional reliability, achieving a perfect 100th percentile, indicating minimal technical failures and consistent evaluable responses. However, its pricing is at a premium level, placing it in the 4th percentile for cost competitiveness. The model excels across several benchmark categories. It achieved perfect 100% accuracy in both Ethics (Baseline) and General Knowledge (Baseline), often being the most accurate and fastest among models at its price point in these areas. Its coding capabilities are strong, with 94% accuracy in the Coding (Baseline) benchmark, aligning with its claim as a top coding model. Reasoning (Baseline) also shows high proficiency at 98% accuracy. Instruction Following (Baseline) is solid at 71% accuracy, and Email Classification (Baseline) is competent at 97%. Opus 4's key strengths lie in its high accuracy across diverse cognitive tasks, particularly in ethics, general knowledge, coding, and complex reasoning, coupled with its outstanding reliability. Its primary weakness is its premium pricing, which may limit accessibility for some users.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $15
Completion $75
Input Cache Read $1.5
Input Cache Write $18.8

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Anthropic
Anthropic | anthropic/claude-4-opus-20250522 200K $15 / 1M tokens $75 / 1M tokens
Amazon Bedrock
Amazon Bedrock | anthropic/claude-4-opus-20250522 200K $15 / 1M tokens $75 / 1M tokens
Google
Google | anthropic/claude-4-opus-20250522 200K $15 / 1M tokens $75 / 1M tokens
Google
Google | anthropic/claude-4-opus-20250522 200K $15 / 1M tokens $75 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by anthropic