Qwen: Qwen3 8B

Text input Text output
Author's Description

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...

Key Specifications
Cost
$$$
Context
128K
Parameters
8B
Released
Apr 28, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Frequency Penalty Top P Reasoning Temperature Stop Presence Penalty Include Reasoning Max Tokens
Features

This model supports the following features:

Reasoning
Performance Summary

Qwen3-8B, a dense 8.2B parameter causal language model, demonstrates strong reliability with an 89% success rate across benchmarks, consistently providing usable responses. While it tends to have longer response times, ranking in the 18th percentile for speed, it offers cost-effective solutions, placing in the 69th percentile for price. The model exhibits notable strengths in specific areas. It achieves high accuracy in Email Classification (98.0%) and Ethics (99.0%), indicating strong performance in understanding context and adhering to ethical principles. Its General Knowledge is also robust at 97.5% accuracy. However, Qwen3-8B shows weaknesses in more complex, generative tasks. Its Coding accuracy is low at 34.0%, and Reasoning tasks yield a 56.0% accuracy. Instruction Following is moderate at 60.0%, and Mathematics at 85.0%. The model's "thinking" mode for math, coding, and logical inference appears to require further optimization to match its strong performance in classification and knowledge-based tasks. Its ability to switch between "thinking" and "non-thinking" modes, coupled with multilingual support and a large context window, positions it as a versatile tool despite its current performance disparities.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.05
Completion $0.4
Input Cache Read $0.05

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Novita
Novita | qwen/qwen3-8b-04-28 128K $0.05 / 1M tokens $0.4 / 1M tokens
Chutes
Chutes | qwen/qwen3-8b-04-28 40K $0.05 / 1M tokens $0.4 / 1M tokens
Fireworks
Fireworks | qwen/qwen3-8b-04-28 40K $0.05 / 1M tokens $0.4 / 1M tokens
AtlasCloud
AtlasCloud | qwen/qwen3-8b-04-28 40K $0.05 / 1M tokens $0.4 / 1M tokens
Alibaba
Alibaba | qwen/qwen3-8b-04-28 131K $0.117 / 1M tokens $0.455 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by qwen