Qwen: Qwen3.5-Flash

Image input Video input Text input Text output
Author's Description

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

Key Specifications
Cost
$$$$
Context
1M
Released
Feb 25, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Response Format Tool Choice Structured Outputs Top P Seed Temperature Tools Reasoning Max Tokens Include Reasoning Presence Penalty
Features

This model supports the following features:

Reasoning Structured Outputs Tools Response Format
Performance Summary

Qwen3.5-Flash, released on February 25, 2026, is a Qwen model designed for high inference efficiency through its hybrid architecture integrating linear attention and a sparse mixture-of-experts model. This model consistently ranks among the fastest available, demonstrating exceptional speed across eight benchmarks. It also offers highly competitive pricing, placing it among the most cost-effective options across four benchmarks. With an 88% success rate across eight benchmarks, Qwen3.5-Flash exhibits strong reliability, consistently providing usable responses. In terms of performance across categories, the model shows notable strengths in hallucination mitigation, achieving 97.1% accuracy, and email classification with 97.4% accuracy. Its instruction following capabilities are moderate at 63.6% accuracy. However, a significant weakness is apparent in core academic and reasoning tasks, where it scored 0.0% accuracy in Coding, General Knowledge, Reasoning, Ethics, and Mathematics benchmarks. This suggests that while Qwen3.5-Flash excels in efficiency and specific text-based tasks, its capabilities in complex problem-solving and knowledge-intensive domains are currently limited.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.065
Completion $0.26

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Alibaba
Alibaba | qwen/qwen3.5-flash-20260224 1M $0.065 / 1M tokens $0.26 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by qwen