Qwen: Qwen3 32B

Text input Text output
Author's Description

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

Key Specifications
Cost
$$$
Context
40K
Parameters
32B
Released
Apr 28, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Min P Response Format Reasoning Temperature Presence Penalty Include Reasoning Tools Frequency Penalty Top P Stop Tool Choice Max Tokens
Features

This model supports the following features:

Response Format Tools Reasoning
Performance Summary

Qwen3-32B, a 32.8B parameter causal language model, demonstrates strong performance across various benchmarks, particularly excelling in accuracy for complex tasks. While its speed ranking places it in the 14th percentile, indicating generally longer response times, it offers competitive pricing, ranking in the 56th percentile. A standout feature is its exceptional reliability, achieving a 100% success rate across all evaluated benchmarks, signifying consistent and dependable operation. The model exhibits perfect accuracy in General Knowledge, Reasoning, and Ethics, often being the most accurate model at its price point and speed. It also shows high proficiency in Coding (95.0% accuracy, 93rd percentile) and Mathematics (95.9% accuracy, 91st percentile), aligning with its optimization for complex reasoning. Instruction Following is solid at 55.7% accuracy, while its ability to acknowledge uncertainty in the Hallucinations benchmark is fair at 90.0%. Email Classification is strong at 99.0%. Overall, Qwen3-32B's key strengths lie in its high accuracy for knowledge-based, logical, and ethical reasoning tasks, coupled with robust reliability, making it a powerful tool despite its slower processing speed.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.08
Completion $0.28

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
DeepInfra
DeepInfra | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.28 / 1M tokens
Nebius
Nebius | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $0.3 / 1M tokens
Lambda
Lambda | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
Novita
Novita | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $0.45 / 1M tokens
Parasail
Parasail | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
GMICloud
GMICloud | qwen/qwen3-32b-04-28 32K $0.08 / 1M tokens $0.24 / 1M tokens
Nebius
Nebius | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
Cerebras
Cerebras | qwen/qwen3-32b-04-28 131K $0.08 / 1M tokens $0.24 / 1M tokens
SambaNova
SambaNova | qwen/qwen3-32b-04-28 32K $0.08 / 1M tokens $0.24 / 1M tokens
Groq
Groq | qwen/qwen3-32b-04-28 131K $0.29 / 1M tokens $0.59 / 1M tokens
Friendli
Friendli | qwen/qwen3-32b-04-28 131K $0.08 / 1M tokens $0.24 / 1M tokens
Chutes
Chutes | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
NCompass
NCompass | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
SiliconFlow
SiliconFlow | qwen/qwen3-32b-04-28 131K $0.14 / 1M tokens $0.57 / 1M tokens
Chutes
Chutes | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
AtlasCloud
AtlasCloud | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $1.2 / 1M tokens
Cerebras
Cerebras | qwen/qwen3-32b-04-28 131K $0.08 / 1M tokens $0.24 / 1M tokens
Alibaba
Alibaba | qwen/qwen3-32b-04-28 131K $0.104 / 1M tokens $0.416 / 1M tokens
Chutes
Chutes | qwen/qwen3-32b-04-28 40K $0.08 / 1M tokens $0.24 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by qwen