Qwen2.5 72B Instruct

Modalities: text input → text output. A free option is available.
Author's Description

Qwen2.5 72B Instruct is part of Qwen2.5, the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2:

- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. It is also more resilient to diverse system prompts, which enhances role-play implementation and condition-setting for chatbots.
- Long-context support up to 128K tokens, with generation of up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

Usage of this model is subject to the [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
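The structured-output claim above can be exercised through any OpenAI-compatible chat endpoint. A minimal sketch, assuming the `qwen/qwen-2.5-72b-instruct` slug from the endpoint listing below and the common `response_format` request field; substitute your provider's base URL and client:

```python
# Sketch: building a JSON-output request for Qwen2.5 72B Instruct via an
# OpenAI-compatible chat API. The model slug and the "json_object"
# response_format convention are assumptions based on this listing.
def build_json_request(prompt: str) -> dict:
    return {
        "model": "qwen/qwen-2.5-72b-instruct",
        "messages": [{"role": "user", "content": prompt}],
        # Ask the model to emit a single well-formed JSON object.
        "response_format": {"type": "json_object"},
        "max_tokens": 512,
    }

payload = build_json_request(
    "List three EU capitals as a JSON array under the key 'capitals'."
)
```

The payload would then be POSTed to the provider's `/chat/completions` route with your API key.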

Key Specifications
| Spec | Value |
| --- | --- |
| Cost | $$ |
| Context | 32K |
| Parameters | 72B |
| Released | Sep 18, 2024 |
Supported Parameters

This model supports the following parameters:

Tools, Tool Choice, Response Format, Stop, Seed, Min P, Top P, Max Tokens, Frequency Penalty, Temperature, Presence Penalty
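These parameters map directly onto the usual sampling knobs of an OpenAI-style request body. A hedged sketch using several of them (the values are illustrative, not recommendations):

```python
# Illustrative request body using the sampling parameters this model
# accepts. Field names follow the OpenAI-compatible convention; the
# values here are arbitrary examples.
request = {
    "model": "qwen/qwen-2.5-72b-instruct",
    "messages": [{"role": "user", "content": "Summarise Qwen2.5 in one sentence."}],
    "temperature": 0.7,        # sampling randomness
    "top_p": 0.9,              # nucleus-sampling cutoff
    "min_p": 0.05,             # minimum relative token probability
    "max_tokens": 256,         # cap on generated tokens (model can emit up to 8K)
    "frequency_penalty": 0.1,  # discourage verbatim repetition
    "presence_penalty": 0.0,   # discourage reusing topics already mentioned
    "stop": ["\n\n"],          # stop sequence(s)
    "seed": 42,                # best-effort reproducibility where supported
}
```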
Features

This model supports the following features:

Tools, Response Format
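Tool support follows the standard `tools` / `tool_choice` schema on OpenAI-compatible endpoints. A minimal sketch, where the `get_weather` function is a hypothetical example tool, not part of any real API:

```python
# Sketch of a tool-calling request. The "tools"/"tool_choice" shape is
# the OpenAI-compatible convention; get_weather is a hypothetical tool
# defined purely for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

tool_request = {
    "model": "qwen/qwen-2.5-72b-instruct",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

If the model elects to call the tool, the response carries a `tool_calls` entry whose arguments you execute locally before sending the result back in a follow-up message.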
Performance Summary

Qwen2.5 72B Instruct, released on September 18, 2024, performs strongly across several key metrics. It consistently ranks among the fastest models available and is competitively priced. Reliability is outstanding, with a 99% success rate indicating minimal technical failures.

In capability terms, the model scores 100% on the Hallucinations (Baseline) and Ethics (Baseline) benchmarks, reflecting a strong ability to acknowledge uncertainty and to adhere to ethical principles. General Knowledge is robust at 99.5% accuracy, Coding reaches 85.0%, and Reasoning 74.0%. Instruction following is a notable strength: the model achieves 67.0% accuracy on the more complex instruction-following benchmark, a clear improvement over its predecessor. It also offers strong multilingual support and long-context handling up to 128K tokens. Mathematics, at 83.8%, is solid but not top-tier relative to the model's other scores.

The 0.0% accuracy reported for the initial Instruction Following (Baseline) benchmark appears to be an anomaly or a misinterpretation of the test, since the subsequent, more detailed instruction-following benchmark shows a strong 67.0%.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
| --- | --- |
| Prompt | $0.12 |
| Completion | $0.39 |
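At these rates, per-request cost is simple arithmetic over token counts. A sketch using the listed default prices ($0.12 prompt, $0.39 completion per 1M tokens):

```python
# Estimate the USD cost of one request at per-1M-token rates.
# Defaults are the listed prices for this model's default endpoint.
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_rate: float = 0.12,
                  completion_rate: float = 0.39) -> float:
    """Return the USD cost for a single request."""
    return (prompt_tokens * prompt_rate
            + completion_tokens * completion_rate) / 1_000_000

# e.g. a 10K-token prompt with a 2K-token completion:
cost = estimate_cost(10_000, 2_000)
```

Note that completion tokens cost more than three times as much as prompt tokens here, so output length dominates the bill for generation-heavy workloads.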

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
| --- | --- | --- | --- | --- |
| DeepInfra | qwen/qwen-2.5-72b-instruct | 32K | $0.12 / 1M tokens | $0.39 / 1M tokens |
| Nebius | qwen/qwen-2.5-72b-instruct | 131K | $0.13 / 1M tokens | $0.40 / 1M tokens |
| Novita | qwen/qwen-2.5-72b-instruct | 32K | $0.38 / 1M tokens | $0.40 / 1M tokens |
| Hyperbolic | qwen/qwen-2.5-72b-instruct | 131K | $0.40 / 1M tokens | $0.40 / 1M tokens |
| Fireworks | qwen/qwen-2.5-72b-instruct | 32K | $0.07 / 1M tokens | $0.26 / 1M tokens |
| Together | qwen/qwen-2.5-72b-instruct | 131K | $1.20 / 1M tokens | $1.20 / 1M tokens |
| Chutes | qwen/qwen-2.5-72b-instruct | 32K | $0.07 / 1M tokens | $0.26 / 1M tokens |
| NextBit | qwen/qwen-2.5-72b-instruct | 65K | $0.07 / 1M tokens | $0.26 / 1M tokens |
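Since providers differ on both price and context length, endpoint choice is a small optimisation problem. A sketch that picks the cheapest endpoint by blended price, assuming an illustrative 3:1 input-to-output token ratio (the ratio is an assumption, not something the listing specifies):

```python
# Endpoint data transcribed from the table above:
# (provider, context tokens, input $/1M, output $/1M).
endpoints = [
    ("DeepInfra",   32_000, 0.12, 0.39),
    ("Nebius",     131_000, 0.13, 0.40),
    ("Novita",      32_000, 0.38, 0.40),
    ("Hyperbolic", 131_000, 0.40, 0.40),
    ("Fireworks",   32_000, 0.07, 0.26),
    ("Together",   131_000, 1.20, 1.20),
    ("Chutes",      32_000, 0.07, 0.26),
    ("NextBit",     65_000, 0.07, 0.26),
]

def blended_price(ep, input_share=0.75):
    """Weighted $/1M price, assuming input_share of tokens are input."""
    _, _, inp, out = ep
    return input_share * inp + (1 - input_share) * out

# Fireworks, Chutes, and NextBit tie on price; min() keeps the first
# in list order. Filter on context length first for long-context jobs.
cheapest = min(endpoints, key=blended_price)
long_ctx = [ep for ep in endpoints if ep[1] >= 100_000]
```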