Qwen: Qwen3.5-Flash

Image input Text input Video input Text output
Author's Description

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the 3 series, these models deliver a leap forward in performance for both pure text and multimodal tasks, offering fast response times while balancing inference speed and overall performance.

Key Specifications
Cost
$$$$
Context
1M
Released
Feb 25, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Presence Penalty Tools Response Format Structured Outputs Temperature Top P Tool Choice Max Tokens Seed Reasoning Include Reasoning
Features

This model supports the following features:

Reasoning Structured Outputs Tools Response Format
Performance Summary

Qwen: Qwen3.5-Flash, created on Feb 25, 2026, demonstrates exceptional speed, consistently ranking among the fastest models, and offers highly competitive pricing. Its reliability is strong, with an 88% success rate across benchmarks. In terms of performance, the model shows particular strength in acknowledging uncertainty, achieving 97.1% accuracy in Hallucinations (Baseline) tests, and in Email Classification, where it reached 97.4% accuracy. Instruction Following also yielded a respectable 63.6% accuracy. However, the model exhibits significant weaknesses in more complex cognitive domains, scoring 0.0% accuracy across Coding, General Knowledge, Reasoning, Ethics, and Mathematics benchmarks. This suggests a strong capability for specific, well-defined tasks and efficient processing, but a current limitation in handling open-ended, knowledge-intensive, or complex problem-solving scenarios. Its hybrid architecture and linear attention mechanism appear to contribute to its high inference efficiency.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.4

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Alibaba
Alibaba | qwen/qwen3.5-flash-20260224 1M $0.1 / 1M tokens $0.4 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by qwen