Qwen: Qwen3 VL 235B A22B Instruct

Image input Text input Text output
Author's Description

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table extraction, multilingual OCR). The series emphasizes robust perception (recognition of diverse real-world and synthetic categories), spatial understanding (2D/3D grounding), and long-form visual comprehension, with competitive results on public multimodal benchmarks for both perception and reasoning. Beyond analysis, Qwen3-VL supports agentic interaction and tool use: it can follow complex instructions over multi-image, multi-turn dialogues; align text to video timelines for precise temporal queries; and operate GUI elements for automation tasks. The models also enable visual coding workflows—turning sketches or mockups into code and assisting with UI debugging—while maintaining strong text-only performance comparable to the flagship Qwen3 language models. This makes Qwen3-VL suitable for production scenarios spanning document AI, multilingual OCR, software/UI assistance, spatial/embodied tasks, and research on vision-language agents.

Key Specifications
Cost
$$$
Context
131K
Parameters
235B
Released
Sep 23, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Presence Penalty Top P Response Format Tool Choice Temperature Seed Structured Outputs Max Tokens
Features

This model supports the following features:

Response Format Structured Outputs Tools
Performance Summary

Qwen3-VL-235B-A22B Instruct demonstrates competitive response times, performing among the faster models with a 55th percentile speed ranking. It also offers competitive pricing, ranking in the 58th percentile. Notably, the model exhibits exceptional reliability, achieving a 100% success rate across all benchmarks, indicating minimal technical failures. The model excels in several critical areas, achieving perfect accuracy in Hallucinations (100%), General Knowledge (100%), Reasoning (100%), and Ethics (100%). Its performance in these categories is often among the most accurate at its price point and speed. It also shows strong capabilities in Mathematics (92.3% accuracy) and Email Classification (98.0% accuracy). While its Instruction Following (65.7% accuracy) and Coding (80.0% accuracy) scores are respectable, they represent areas with potential for further improvement compared to its top-tier performance in other domains. Overall, Qwen3-VL-235B-A22B Instruct is a robust multimodal model with a strong foundation in perception, reasoning, and ethical understanding, making it suitable for diverse applications.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.7
Completion $2.8

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Alibaba
Alibaba | qwen/qwen3-vl-235b-a22b-instruct 131K $0.7 / 1M tokens $2.8 / 1M tokens
Novita
Novita | qwen/qwen3-vl-235b-a22b-instruct 131K $0.22 / 1M tokens $0.88 / 1M tokens
Parasail
Parasail | qwen/qwen3-vl-235b-a22b-instruct 262K $0.22 / 1M tokens $0.88 / 1M tokens
Chutes
Chutes | qwen/qwen3-vl-235b-a22b-instruct 131K $0.22 / 1M tokens $0.88 / 1M tokens
Parasail
Parasail | qwen/qwen3-vl-235b-a22b-instruct 131K $0.5 / 1M tokens $2.75 / 1M tokens
SiliconFlow
SiliconFlow | qwen/qwen3-vl-235b-a22b-instruct 262K $0.3 / 1M tokens $1.5 / 1M tokens
DeepInfra
DeepInfra | qwen/qwen3-vl-235b-a22b-instruct 131K $0.22 / 1M tokens $0.88 / 1M tokens
DeepInfra
DeepInfra | qwen/qwen3-vl-235b-a22b-instruct 262K $0.3 / 1M tokens $1.49 / 1M tokens
Novita
Novita | qwen/qwen3-vl-235b-a22b-instruct 131K $0.3 / 1M tokens $1.5 / 1M tokens
Chutes
Chutes | qwen/qwen3-vl-235b-a22b-instruct 262K $0.3 / 1M tokens $1.2 / 1M tokens
Phala
Phala | qwen/qwen3-vl-235b-a22b-instruct 131K $0.22 / 1M tokens $0.88 / 1M tokens
Fireworks
Fireworks | qwen/qwen3-vl-235b-a22b-instruct 262K $0.22 / 1M tokens $0.88 / 1M tokens
AtlasCloud
AtlasCloud | qwen/qwen3-vl-235b-a22b-instruct 131K $0.3 / 1M tokens $1.5 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by qwen