Qwen: Qwen3 8B

Input: text. Output: text.
Author's Description

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math, coding, and logical inference, and "non-thinking" mode for general conversation. The model is fine-tuned for instruction-following, agent integration, creative writing, and multilingual use across 100+ languages and dialects. It natively supports a 32K token context window and can extend to 131K tokens with YaRN scaling.
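For concreteness, the sketch below shows one way to toggle the thinking mode with Hugging Face Transformers, following the publicly documented Qwen3 chat-template flag; the model ID, prompt, and generation length are illustrative assumptions rather than tuned recommendations.

```python
# Minimal sketch: switching Qwen3-8B between thinking and non-thinking mode
# via the chat template's enable_thinking flag (per the public Qwen3 usage docs).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many prime numbers are there below 30?"}]

# enable_thinking=True asks the model to emit a <think>...</think> block before
# its answer; set it to False for plain conversational output.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

Extending the context beyond the native 32K window toward 131K tokens is a separate step, done through YaRN rope scaling in the model configuration, and is not shown here.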

Key Specifications

Cost: $$$
Context: 128K
Parameters: 8B
Released: Apr 28, 2025
Speed / Ability / Reliability: graphical ratings (detailed in the Performance Summary below)
Supported Parameters

This model supports the following parameters:

Include Reasoning, Stop, Presence Penalty, Logit Bias, Top P, Temperature, Seed, Min P, Reasoning, Frequency Penalty, Max Tokens
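As a rough illustration of how these map onto an OpenAI-compatible request (for example through an aggregator such as OpenRouter), here is a minimal sketch; the base URL, model slug, and especially the field names for the reasoning toggle and Min P are assumptions that can differ between providers.

```python
# Minimal sketch: passing the listed sampling parameters to an OpenAI-compatible
# chat-completions endpoint. Field names for min_p and the reasoning toggle are
# assumptions; check the provider's parameter documentation.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # assumed OpenAI-compatible gateway
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="qwen/qwen3-8b",  # assumed model slug
    messages=[{"role": "user", "content": "Classify this email as spam or not spam."}],
    temperature=0.7,         # Temperature
    top_p=0.8,               # Top P
    max_tokens=512,          # Max Tokens
    frequency_penalty=0.0,   # Frequency Penalty
    presence_penalty=0.0,    # Presence Penalty
    seed=42,                 # Seed
    stop=["\n\n"],           # Stop
    logit_bias={},           # Logit Bias
    extra_body={
        "min_p": 0.0,                     # Min P (non-standard, passed through)
        "reasoning": {"enabled": True},   # Reasoning / Include Reasoning (assumed field name)
    },
)
print(response.choices[0].message.content)
```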
Features

This model supports the following features:

Reasoning
Performance Summary

Qwen3-8B, a dense 8.2B parameter model, is highly reliable, consistently returning evaluable responses with few technical issues (87th percentile). It is also cost-competitive (68th percentile for pricing), but comparatively slow, ranking in the 14th percentile for speed across benchmarks. On individual benchmarks, Qwen3-8B excels at Ethics (99.0% accuracy) and Email Classification (98.0% accuracy), indicating strong ethical judgment and classification ability, and its General Knowledge is also robust at 97.5% accuracy. However, it shows a notable weakness in Coding, at only 34.0% accuracy (23rd percentile), while Instruction Following (60.0%) and Reasoning (50.0%) are moderate. Its longer response times appear across most benchmarks, particularly Instruction Following and Coding, where durations fall in the 4th percentile. Overall, Qwen3-8B is a reliable and cost-efficient option for tasks that demand strong ethical understanding and classification, but users should plan for its slower processing and its limitations on complex coding and reasoning tasks.

Model Pricing

Current Pricing

Prompt: $0.035 per 1M tokens
Completion: $0.138 per 1M tokens
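
To make the arithmetic concrete, here is a small sketch of per-request cost estimation at the listed rates; the token counts are made-up examples.

```python
# Minimal sketch: estimating per-request cost from the listed per-1M-token prices.
PROMPT_PRICE_PER_M = 0.035      # USD per 1M prompt tokens
COMPLETION_PRICE_PER_M = 0.138  # USD per 1M completion tokens

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of one request at the listed rates."""
    return (prompt_tokens * PROMPT_PRICE_PER_M
            + completion_tokens * COMPLETION_PRICE_PER_M) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token completion
# costs about $0.00035 + $0.000276 = $0.000626.
print(f"${estimate_cost(10_000, 2_000):.6f}")
```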


Available Endpoints
Provider: Novita
Endpoint Name: Novita | qwen/qwen3-8b-04-28
Context Length: 128K
Pricing (Input): $0.035 / 1M tokens
Pricing (Output): $0.138 / 1M tokens