Author's Description
Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements: - SoTA understanding of images of various resolution & ratio: Qwen2.5-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding videos of 20min+: Qwen2.5-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. - Agent that can operate your mobiles, robots, etc.: with the abilities of complex reasoning and decision making, Qwen2.5-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. - Multilingual Support: to serve global users, besides English and Chinese, Qwen2.5-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. For more details, see this [blog post](https://qwenlm.github.io/blog/qwen2-vl/) and [GitHub repo](https://github.com/QwenLM/Qwen2-VL). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Key Specifications
Supported Parameters
This model supports the following parameters:
Performance Summary
Qwen2.5-VL 7B Instruct demonstrates exceptional speed, consistently ranking among the fastest models across various benchmarks. It also offers competitive pricing, typically providing cost-effective solutions. The model exhibits strong reliability with an 86% success rate, indicating consistent delivery of usable responses. In terms of performance across categories, Qwen2.5-VL 7B Instruct achieves perfect accuracy in Ethics, highlighting its robust moral reasoning capabilities. It also performs well in General Knowledge (91.8% accuracy) and Email Classification (92.0% accuracy). A significant strength lies in its multimodal capabilities, as described, including state-of-the-art image understanding across resolutions and ratios, and the ability to comprehend videos over 20 minutes. Its multilingual support for text within images is also a notable advantage for global applications. However, the model shows significant weaknesses in Mathematics, scoring 0.0% accuracy, suggesting a current limitation in complex mathematical problem-solving. Its performance in Reasoning (42.0% accuracy) and Instruction Following (51.5% accuracy) is moderate, indicating areas for potential improvement. While its Hallucinations accuracy is 86.0%, this places it in the 29th percentile, suggesting room for improvement in acknowledging uncertainty for fictional concepts.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.2 |
Completion | $0.2 |
Price History
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
Hyperbolic
|
Hyperbolic | qwen/qwen-2-vl-7b-instruct | 32K | $0.2 / 1M tokens | $0.2 / 1M tokens |
InferenceNet
|
InferenceNet | qwen/qwen-2-vl-7b-instruct | 128K | $0.2 / 1M tokens | $0.2 / 1M tokens |
Kluster
|
Kluster | qwen/qwen-2-vl-7b-instruct | 32K | $0.2 / 1M tokens | $0.2 / 1M tokens |
Benchmark Results
Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
---|
Other Models by qwen
|
Released | Params | Context |
|
Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
Qwen: Qwen3 VL 235B A22B Thinking | Sep 23, 2025 | 235B | 131K |
Text input
Image input
Text output
|
★ | ★ | $$$$$ |
Qwen: Qwen3 VL 235B A22B Instruct | Sep 23, 2025 | 235B | 131K |
Text input
Image input
Text output
|
★★★ | ★★★★★ | $$$ |
Qwen: Qwen3 Max | Sep 23, 2025 | — | 256K |
Text input
Text output
|
★★★★ | ★★★★★ | $$$$ |
Qwen: Qwen3 Coder Plus | Sep 23, 2025 | ~480B | 128K |
Text input
Text output
|
★★★★ | ★★★★ | $$$$ |
Qwen: Qwen3 Coder Flash | Sep 17, 2025 | — | 128K |
Text input
Text output
|
★★★★ | ★★★ | $$$ |
Qwen: Qwen3 Next 80B A3B Thinking | Sep 11, 2025 | 80B | 262K |
Text input
Text output
|
★ | ★★★★ | $$$$$ |
Qwen: Qwen3 Next 80B A3B Instruct | Sep 11, 2025 | 80B | 262K |
Text input
Text output
|
★★★★ | ★★★★★ | $$$$ |
Qwen: Qwen Plus 0728 | Sep 08, 2025 | ~20B | 1M |
Text input
Text output
|
★★★★★ | ★★★ | $$$ |
Qwen: Qwen3 30B A3B Thinking 2507 | Aug 28, 2025 | 30B | 262K |
Text input
Text output
|
★★ | ★★★ | $$$$ |
Qwen: Qwen3 Coder 30B A3B Instruct | Jul 31, 2025 | 30B | 262K |
Text input
Text output
|
★★★★ | ★★★ | $$ |
Qwen: Qwen3 30B A3B Instruct 2507 | Jul 29, 2025 | 30B | 131K |
Text input
Text output
|
★★★★ | ★★★★ | $$$ |
Qwen: Qwen3 235B A22B Thinking 2507 | Jul 25, 2025 | 235B | 131K |
Text input
Text output
|
★ | ★★★★ | $$$$$ |
Qwen: Qwen3 Coder 480B A35B | Jul 22, 2025 | 480B | 1M |
Text input
Text output
|
★ | ★★★ | $$$ |
Qwen: Qwen3 235B A22B Instruct 2507 | Jul 21, 2025 | 235B | 262K |
Text input
Text output
|
★★ | ★★★ | $$$ |
Qwen: Qwen3 30B A3B | Apr 28, 2025 | 30B | 40K |
Text input
Text output
|
★ | ★★★★★ | $$$$ |
Qwen: Qwen3 8B | Apr 28, 2025 | 8B | 128K |
Text input
Text output
|
★ | ★★★ | $$$ |
Qwen: Qwen3 14B | Apr 28, 2025 | 14B | 40K |
Text input
Text output
|
★★ | ★★★★ | $$$ |
Qwen: Qwen3 32B | Apr 28, 2025 | 32B | 40K |
Text input
Text output
|
★ | ★★★★★ | $$$ |
Qwen: Qwen3 235B A22B | Apr 28, 2025 | 235B | 40K |
Text input
Text output
|
★ | ★★★★ | $$$$ |
Qwen: Qwen2.5 VL 32B Instruct | Mar 24, 2025 | 32B | 128K |
Text input
Image input
Text output
|
★ | ★★★ | $$$ |
Qwen: QwQ 32B | Mar 05, 2025 | 32B | 131K |
Text input
Text output
|
★ | ★★★ | $$$ |
Qwen: Qwen VL Plus | Feb 04, 2025 | — | 7K |
Text input
Image input
Text output
|
★★★★ | ★★ | $$$ |
Qwen: Qwen VL Max | Feb 01, 2025 | — | 7K |
Text input
Image input
Text output
|
★★★ | ★★★ | $$$$ |
Qwen: Qwen-Turbo | Feb 01, 2025 | — | 1M |
Text input
Text output
|
★★★★★ | ★★★★ | $$ |
Qwen: Qwen2.5 VL 72B Instruct | Feb 01, 2025 | 72B | 32K |
Text input
Image input
Text output
|
★★★★ | ★★★★ | $$ |
Qwen: Qwen-Plus | Feb 01, 2025 | — | 131K |
Text input
Text output
|
★★★★ | ★★★★ | $$$ |
Qwen: Qwen-Max | Feb 01, 2025 | — | 32K |
Text input
Text output
|
★★★★ | ★★★★ | $$$$ |
Qwen: QwQ 32B Preview Unavailable | Nov 27, 2024 | 32B | 32K |
Text input
Text output
|
— | ★ | $$ |
Qwen2.5 Coder 32B Instruct | Nov 11, 2024 | ~500B | 32K |
Text input
Text output
|
★★★★★ | ★★★★★ | $ |
Qwen: Qwen2.5 7B Instruct | Oct 15, 2024 | ~500B | 32K |
Text input
Text output
|
★ | ★★ | $ |
Qwen2.5 72B Instruct | Sep 18, 2024 | ~500B | 32K |
Text input
Text output
|
★★★ | ★★ | $$ |
Qwen 2 72B Instruct Unavailable | Jun 06, 2024 | ~500B | 32K |
Text input
Text output
|
★★★★ | ★★ | $$$$ |