Qwen: Qwen3 235B A22B Thinking 2507

Text input → Text output
Author's Description

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144 tokens of context. This "thinking-only" variant strengthens structured logical reasoning, mathematics, science, and long-form generation, with strong benchmark results on AIME, SuperGPQA, LiveCodeBench, and MMLU-Redux. It operates exclusively in a reasoning mode: the chat template opens the thinking block automatically, so outputs carry only the closing </think> tag, and the model is built for long generations (up to 81,920 tokens) in challenging domains. The model is instruction-tuned and excels at step-by-step reasoning, tool use, agentic workflows, and multilingual tasks. This release is the most capable open-source variant in the Qwen3-235B series, surpassing many closed models on structured reasoning use cases.
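As a minimal sketch of how the thinking output can be consumed, the following assumes an OpenAI-compatible endpoint; the base URL, key handling, and the split-on-</think> convention are assumptions drawn from the description above, not a documented contract.

```python
# Minimal sketch: calling the model through an OpenAI-compatible API and
# separating the reasoning trace from the final answer. Base URL and the
# </think> parsing convention are assumptions, not a documented contract.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_API_KEY")

resp = client.chat.completions.create(
    model="qwen/qwen3-235b-a22b-thinking-2507",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=32768,  # the model is designed for long outputs (up to 81,920 tokens)
)

text = resp.choices[0].message.content
# The chat template opens the thinking block itself, so raw output often
# contains only the closing tag; split it off if present.
if "</think>" in text:
    reasoning, answer = text.split("</think>", 1)
else:
    reasoning, answer = "", text
print(answer.strip())
```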

Key Specifications
Cost: $$$$$ (premium tier)
Context: 131K (some endpoints serve the full 262K native window; see below)
Parameters: 235B (22B active per forward pass)
Released: Jul 25, 2025
Supported Parameters

This model supports the following parameters:

Tools, Tool Choice, Reasoning, Include Reasoning, Response Format, Seed, Top P, Max Tokens, Temperature, Presence Penalty
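As a hedged illustration, the request below exercises these parameters via the OpenAI-compatible Python SDK. The endpoint URL is an assumption, and include_reasoning is treated as a provider-specific extension passed through extra_body; its exact name and shape may differ by provider.

```python
# Illustrative request exercising the listed parameters. The endpoint URL
# is assumed; include_reasoning is an assumed provider extension.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_API_KEY")

resp = client.chat.completions.create(
    model="qwen/qwen3-235b-a22b-thinking-2507",
    messages=[{"role": "user",
               "content": "Return a JSON object summarizing the CAP theorem."}],
    temperature=0.6,            # sampling temperature
    top_p=0.95,                 # nucleus sampling cutoff
    max_tokens=4096,            # cap on generated tokens (including reasoning)
    seed=42,                    # best-effort reproducibility where supported
    presence_penalty=0.5,       # discourage repeated topics
    response_format={"type": "json_object"},   # structured output
    extra_body={"include_reasoning": True},    # assumed provider extension
)
print(resp.choices[0].message.content)
```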
Features

This model supports the following features:

Tools, Reasoning, Response Format
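A minimal tool-calling sketch follows, assuming the standard OpenAI-compatible tools schema; the get_weather function is hypothetical and exists only for illustration.

```python
# Minimal tool-calling sketch using the standard OpenAI-compatible tools
# schema; get_weather is a hypothetical example function.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="qwen/qwen3-235b-a22b-thinking-2507",
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```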
Performance Summary

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) model optimized for complex reasoning. While it tends toward longer response times (12th percentile for speed) and premium pricing (5th percentile), its reliability is exceptional, with a 100% success rate across all benchmarks. The model excels in specialized domains: it achieved perfect accuracy in General Knowledge and near-perfect scores in Coding (98.0%, top 3 in accuracy) and Reasoning (98.0%, 90th percentile), in line with its "thinking-only" design, and Mathematics is also strong at 92.9% accuracy (75th percentile). Complex instruction handling is a notable weakness, at only 26.3% accuracy in Instruction Following. The model hallucinates rarely, scoring 94.0% accuracy on the hallucination benchmark and showing a good grasp of uncertainty. Classification tasks such as Keyword Topic Relevance (90.0%) and Email Classification (99.0%) are handled competently. Its strengths in structured logical reasoning, mathematics, science, and long-form generation make it well suited to agentic workflows and tasks requiring high-token outputs.

Model Pricing

Current Pricing

Feature | Price (per 1M tokens)
Prompt | $0.70
Completion | $8.40
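As a worked example at these list prices, a request with 2,000 prompt tokens and 20,000 completion tokens (reasoning tokens typically bill as output) costs roughly $0.17; the token counts here are illustrative.

```python
# Worked example at the list prices above: $0.70 per 1M prompt tokens,
# $8.40 per 1M completion tokens. Token counts are illustrative, and
# reasoning tokens typically bill as completion (output) tokens.
prompt_tokens = 2_000
completion_tokens = 20_000  # long thinking traces dominate the bill

cost = prompt_tokens / 1e6 * 0.70 + completion_tokens / 1e6 * 8.40
print(f"${cost:.4f}")  # -> $0.1694
```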

Available Endpoints
All endpoints serve qwen/qwen3-235b-a22b-thinking-2507.

Provider | Context Length | Pricing (Input, per 1M tokens) | Pricing (Output, per 1M tokens)
Alibaba | 131K | $0.70 | $8.40
Novita | 131K | $0.11 | $0.60
Chutes | 262K | $0.11 | $0.60
Novita | 131K | $0.30 | $3.00
DeepInfra | 262K | $0.30 | $2.90
Parasail | 262K | $0.11 | $0.60
Together | 262K | $0.65 | $3.00
Crusoe | 262K | $0.11 | $0.60
Cerebras | 131K | $0.60 | $2.90
GMICloud | 131K | $0.60 | $3.00
SiliconFlow | 262K | $0.13 | $0.60
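For a quick comparison, the sketch below ranks these endpoints by estimated cost for one sample request (2,000 prompt / 20,000 completion tokens); prices are copied from the table above.

```python
# Ranking the endpoints above by estimated cost for one sample request.
# Prices are USD per 1M tokens, copied from the table; Novita appears
# twice because it lists two differently priced endpoints.
ENDPOINTS = [
    ("Alibaba", 0.70, 8.40), ("Novita", 0.11, 0.60), ("Chutes", 0.11, 0.60),
    ("Novita", 0.30, 3.00), ("DeepInfra", 0.30, 2.90), ("Parasail", 0.11, 0.60),
    ("Together", 0.65, 3.00), ("Crusoe", 0.11, 0.60), ("Cerebras", 0.60, 2.90),
    ("GMICloud", 0.60, 3.00), ("SiliconFlow", 0.13, 0.60),
]

PROMPT_TOK, COMPLETION_TOK = 2_000, 20_000

def request_cost(price_in: float, price_out: float) -> float:
    # Per-request cost: tokens scaled to millions, times the per-1M price.
    return PROMPT_TOK / 1e6 * price_in + COMPLETION_TOK / 1e6 * price_out

for name, p_in, p_out in sorted(ENDPOINTS, key=lambda e: request_cost(e[1], e[2])):
    print(f"{name:12s} ${request_cost(p_in, p_out):.4f}")
```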