Anthropic: Claude 3.5 Sonnet

Text input · Image input · Text output (file input unavailable)
Author's Description

New Claude 3.5 Sonnet delivers better-than-Opus capabilities and faster-than-Sonnet speeds at the same Sonnet prices. Sonnet is particularly good at:

- Coding: scores ~49% on SWE-bench Verified, higher than the previous best score, without any fancy prompt scaffolding
- Data science: augments human data science expertise; navigates unstructured data while using multiple tools for insights
- Visual processing: excels at interpreting charts, graphs, and images, accurately transcribing text to derive insights beyond the text alone
- Agentic tasks: exceptional tool use, making it great at agentic tasks (i.e., complex, multi-step problem-solving tasks that require engaging with other systems)

#multimodal

Key Specifications
Cost
$$$$$
Context
200K
Parameters
400B (Rumoured)
Released
Oct 21, 2024
Supported Parameters

This model supports the following parameters:

Tools, Top P, Temperature, Stop, Tool Choice, Max Tokens
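A minimal sketch of how these supported parameters might appear together in a request body. Field names follow the Anthropic Messages API; the model slug is taken from the endpoint table on this page, and the `get_weather` tool is a hypothetical example for illustration.

```python
# Sketch: a request body exercising the parameters listed above.
# The get_weather tool is hypothetical, included only to show the
# Tools / Tool Choice fields.
request_body = {
    "model": "anthropic/claude-3.5-sonnet",
    "max_tokens": 1024,                     # Max Tokens
    "temperature": 0.7,                     # Temperature
    "top_p": 0.9,                           # Top P
    "stop_sequences": ["\n\nHuman:"],       # Stop
    "tools": [                              # Tools
        {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "input_schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    "tool_choice": {"type": "auto"},        # Tool Choice
    "messages": [
        {"role": "user", "content": "What's the weather in Oslo?"}
    ],
}
```

In practice, `temperature` and `top_p` are usually tuned one at a time rather than together.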
Features

This model supports the following features:

Tools
Performance Summary

Anthropic's Claude 3.5 Sonnet, released on October 21, 2024, demonstrates a strong overall performance profile, particularly excelling in reliability with a perfect 100% success rate across all benchmarks. This indicates exceptional stability and consistent response delivery. The model's response times rank in the 52nd percentile for speed, placing it among the faster models available, but its pricing sits at premium levels, in the 15th percentile for cost-effectiveness.

Sonnet shows remarkable accuracy in critical areas, achieving perfect scores on the Hallucinations, General Knowledge, and Ethics benchmarks, and it is often the most accurate model at its price point and speed. It also performs very well in Instruction Following (85th percentile accuracy) and Email Classification (98% accuracy). Its Mathematics (89% accuracy) and Reasoning (76% accuracy) scores are solid but closer to the average for tested models.

The Coding benchmark, despite Anthropic's description of strong coding capabilities, shows 82% accuracy, placing the model in the 42nd percentile. Its multimodal capabilities, including visual processing and agentic tasks, remain key strengths, enabling it to navigate unstructured data and perform complex, multi-step problem solving.

Model Pricing

Current Pricing

Feature              Price (per 1M tokens)
Prompt               $6.00
Completion           $30.00
Input Cache Read     $0.60
Input Cache Write    $7.50
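The per-1M-token prices above can be turned into a per-request cost estimate. A small sketch (token counts in the example are illustrative, not from this page):

```python
# Sketch: estimating one request's cost from the per-1M-token prices
# listed in the pricing table above.
PRICES_PER_1M = {
    "prompt": 6.00,       # input tokens
    "completion": 30.00,  # output tokens
    "cache_read": 0.60,   # input cache read
    "cache_write": 7.50,  # input cache write
}

def estimate_cost(prompt_tokens=0, completion_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Return the estimated USD cost of a single request."""
    usage = {
        "prompt": prompt_tokens,
        "completion": completion_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    return sum(PRICES_PER_1M[k] * n / 1_000_000 for k, n in usage.items())

# Example: 100k prompt tokens + 10k completion tokens ≈ $0.90
cost = estimate_cost(prompt_tokens=100_000, completion_tokens=10_000)
```

Note the asymmetry: cache reads are 10x cheaper than regular prompt tokens, while cache writes cost 25% more, so caching pays off when a prompt prefix is reused more than once.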


Available Endpoints
Provider         Endpoint Name                 Context Length   Pricing (Input)   Pricing (Output)
Anthropic        anthropic/claude-3.5-sonnet   200K             $6 / 1M tokens    $30 / 1M tokens
Google           anthropic/claude-3.5-sonnet   200K             $6 / 1M tokens    $30 / 1M tokens
Amazon Bedrock   anthropic/claude-3.5-sonnet   200K             $6 / 1M tokens    $30 / 1M tokens
Benchmark Results