Qwen2.5 72B Instruct

Text input · Text output · Free option
Author's Description

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2:

- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to diverse system prompts, enhancing role-play implementation and condition-setting for chatbots.
- Long-context support up to 128K tokens, with generation of up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

Usage of this model is subject to the [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
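
These capabilities are typically exercised through an OpenAI-compatible chat completions API. The sketch below is illustrative only: the base URL and API key are placeholders, and the qwen/qwen-2.5-72b-instruct slug follows the endpoint names listed under Available Endpoints further down.

```python
from openai import OpenAI

# Hypothetical OpenAI-compatible gateway; substitute the provider you actually use.
client = OpenAI(base_url="https://example-gateway/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen/qwen-2.5-72b-instruct",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize the differences between Qwen2 and Qwen2.5 in three bullet points."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```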

Key Specifications
Cost: $$$
Context: 32K tokens
Parameters: 72B
Released: Sep 18, 2024
Supported Parameters

This model supports the following parameters:

Stop, Presence Penalty, Tool Choice, Top P, Temperature, Seed, Min P, Tools, Response Format, Frequency Penalty, Max Tokens
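
As a rough illustration, the request below passes most of these parameters through the OpenAI Python client. min_p is not a first-class argument of that client, so it is forwarded via extra_body; whether a particular provider honours it is an assumption, and the base URL is a placeholder.

```python
from openai import OpenAI

client = OpenAI(base_url="https://example-gateway/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen/qwen-2.5-72b-instruct",
    messages=[{"role": "user", "content": "Write a haiku about long-context models."}],
    temperature=0.7,             # sampling temperature
    top_p=0.9,                   # nucleus sampling
    frequency_penalty=0.2,       # discourage token repetition
    presence_penalty=0.1,        # encourage new topics
    max_tokens=128,              # cap completion length
    seed=42,                     # best-effort reproducibility
    stop=["\n\n"],               # stop sequence
    extra_body={"min_p": 0.05},  # non-standard parameter, forwarded as-is
)
print(response.choices[0].message.content)
```
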
Features

This model supports the following features:

Tools, Response Format
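
A minimal sketch of both features with the OpenAI Python client is shown below. The get_weather tool is hypothetical, the base URL is a placeholder, and error handling is omitted; response_format is shown commented out as the alternative structured-output path.

```python
import json
from openai import OpenAI

client = OpenAI(base_url="https://example-gateway/v1", api_key="YOUR_API_KEY")

# Hypothetical tool definition for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="qwen/qwen-2.5-72b-instruct",
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
    tool_choice="auto",
    # Alternatively, request structured output without tools:
    # response_format={"type": "json_object"},
)

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```
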
Performance Summary

Qwen2.5 72B Instruct, released on September 18, 2024, demonstrates strong overall performance. It consistently ranks among the fastest models across seven benchmarks, and it typically provides cost-effective solutions, ranking in the 65th percentile for cost across six benchmarks. The model exhibits exceptional reliability, with a 99% success rate across seven benchmarks, indicating minimal technical failures.

Analysis of benchmark results reveals several strengths. Qwen2.5 72B achieved perfect accuracy in the Ethics (Baseline) benchmark, also standing out as the most accurate model at its price point and among models of comparable speed. It demonstrated very high accuracy in General Knowledge (99.5%) and strong performance in Email Classification (98.0%). The model shows improved capabilities in coding (85.0% accuracy) and mathematics, aligning with its description of specialized expert models in these domains.

While one Instruction Following benchmark showed 0.0% accuracy, another achieved a respectable 67.0%, suggesting variability or a specific challenge in the former. Its performance in Reasoning (67.3%) indicates solid analytical capabilities. The model's described improvements in instruction following, long text generation, and structured data understanding are partially reflected in its benchmark results, though the 0% instruction following score warrants further investigation.

Model Pricing

Current Pricing

Feature | Price (per 1M tokens)
Prompt | $0.12
Completion | $0.39
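
For a back-of-the-envelope estimate at these list prices, a request's cost is its prompt and completion token counts scaled by the per-million rates. The token counts in the snippet below are made-up examples.

```python
# List prices above, expressed per token.
PROMPT_PRICE = 0.12 / 1_000_000      # $ per prompt token
COMPLETION_PRICE = 0.39 / 1_000_000  # $ per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed rates."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# Example: a 4,000-token prompt with a 1,000-token completion.
print(f"${request_cost(4_000, 1_000):.6f}")  # ≈ $0.000870
```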

Price History

Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output)
DeepInfra | qwen/qwen-2.5-72b-instruct | 32K | $0.12 / 1M tokens | $0.39 / 1M tokens
Nebius | qwen/qwen-2.5-72b-instruct | 131K | $0.13 / 1M tokens | $0.40 / 1M tokens
Novita | qwen/qwen-2.5-72b-instruct | 32K | $0.38 / 1M tokens | $0.40 / 1M tokens
Hyperbolic | qwen/qwen-2.5-72b-instruct | 131K | $0.40 / 1M tokens | $0.40 / 1M tokens
Fireworks | qwen/qwen-2.5-72b-instruct | 32K | $0.0666 / 1M tokens | $0.267 / 1M tokens
Together | qwen/qwen-2.5-72b-instruct | 131K | $1.20 / 1M tokens | $1.20 / 1M tokens
Chutes | qwen/qwen-2.5-72b-instruct | 32K | $0.0666 / 1M tokens | $0.267 / 1M tokens
NextBit | qwen/qwen-2.5-72b-instruct | 65K | $0.0666 / 1M tokens | $0.267 / 1M tokens
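
Because input and output prices differ per endpoint, the cheapest choice depends on the prompt/completion mix of a workload. The snippet below ranks the listed endpoints for a hypothetical mix; prices are copied from the table above and the token volumes are assumptions.

```python
# Input/output prices in $ per 1M tokens, from the endpoint table above.
ENDPOINTS = {
    "DeepInfra":  (0.12,   0.39),
    "Nebius":     (0.13,   0.40),
    "Novita":     (0.38,   0.40),
    "Hyperbolic": (0.40,   0.40),
    "Fireworks":  (0.0666, 0.267),
    "Together":   (1.20,   1.20),
    "Chutes":     (0.0666, 0.267),
    "NextBit":    (0.0666, 0.267),
}

def workload_cost(prompt_m: float, completion_m: float) -> list[tuple[str, float]]:
    """Cost in USD for prompt_m / completion_m million tokens, cheapest endpoint first."""
    costs = {
        name: in_price * prompt_m + out_price * completion_m
        for name, (in_price, out_price) in ENDPOINTS.items()
    }
    return sorted(costs.items(), key=lambda kv: kv[1])

# Example workload: 50M prompt tokens, 10M completion tokens.
for name, cost in workload_cost(50, 10):
    print(f"{name:10s} ${cost:,.2f}")
```
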
Benchmark Results