Qwen: Qwen3 32B

Text input Text output
Author's Description

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, coding, and logical inference, and a "non-thinking" mode for faster, general-purpose conversation. The model demonstrates strong performance in instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K token contexts and can extend to 131K tokens using YaRN-based scaling.

Key Specifications
Cost
$$$
Context
40K
Parameters
32B
Released
Apr 28, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tool Choice Reasoning Include Reasoning Response Format Seed Top P Temperature Tools Stop Min P Max Tokens Frequency Penalty Presence Penalty
Features

This model supports the following features:

Tools Reasoning Response Format
Performance Summary

Qwen3-32B demonstrates exceptional reliability, achieving a 100% success rate across all benchmarks, indicating a highly stable and dependable model. While its speed performance tends to be slower, ranking in the 11th percentile, it offers competitive pricing, placing in the 53rd percentile. The model exhibits strong performance in several key areas. It achieves perfect accuracy in General Knowledge, Reasoning, and Ethics, highlighting its robust understanding and logical inference capabilities. Its Coding and Mathematics scores are also very impressive, ranking in the 96th and 98th percentiles respectively, showcasing its proficiency in complex problem-solving. Instruction Following is solid at 55.7% accuracy, and Email Classification is strong at 99.0%. A notable area for improvement is its hallucination rate, with 90.0% accuracy, placing it in the 36th percentile, suggesting it occasionally struggles to appropriately acknowledge uncertainty. Overall, Qwen3-32B is a highly reliable and accurate model, particularly strong in complex reasoning, coding, and mathematical tasks, making it well-suited for applications requiring precision and deep understanding, despite its longer response times.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.28

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
DeepInfra
DeepInfra | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $0.28 / 1M tokens
Nebius
Nebius | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $0.3 / 1M tokens
Lambda
Lambda | qwen/qwen3-32b-04-28 40K $0.03 / 1M tokens $0.13 / 1M tokens
Novita
Novita | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $0.45 / 1M tokens
Parasail
Parasail | qwen/qwen3-32b-04-28 40K $0.03 / 1M tokens $0.13 / 1M tokens
GMICloud
GMICloud | qwen/qwen3-32b-04-28 32K $0.1 / 1M tokens $0.6 / 1M tokens
Nebius
Nebius | qwen/qwen3-32b-04-28 40K $0.2 / 1M tokens $0.6 / 1M tokens
Cerebras
Cerebras | qwen/qwen3-32b-04-28 131K $0.4 / 1M tokens $0.8 / 1M tokens
SambaNova
SambaNova | qwen/qwen3-32b-04-28 32K $0.4 / 1M tokens $0.8 / 1M tokens
Groq
Groq | qwen/qwen3-32b-04-28 131K $0.29 / 1M tokens $0.59 / 1M tokens
Friendli
Friendli | qwen/qwen3-32b-04-28 131K $0.15 / 1M tokens $0.5 / 1M tokens
Chutes
Chutes | qwen/qwen3-32b-04-28 40K $0.03 / 1M tokens $0.13 / 1M tokens
NCompass
NCompass | qwen/qwen3-32b-04-28 40K $0.1 / 1M tokens $0.28 / 1M tokens
SiliconFlow
SiliconFlow | qwen/qwen3-32b-04-28 131K $0.14 / 1M tokens $0.57 / 1M tokens
Chutes
Chutes | qwen/qwen3-32b-04-28 40K $0.03 / 1M tokens $0.13 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by qwen