## Author's Description
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the `enabled` boolean inside the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
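As a minimal sketch, toggling the thinking mode comes down to setting `reasoning.enabled` in the request body sent to OpenRouter's chat completions endpoint (the placeholder API key and the example prompt below are illustrative):

```python
import json
import urllib.request

# Request body for OpenRouter's chat completions API; the `reasoning`
# object's `enabled` boolean switches GLM-4.5-Air between thinking and
# non-thinking modes.
payload = {
    "model": "z-ai/glm-4.5-air",
    "messages": [{"role": "user", "content": "Plan a 3-step tool-use workflow."}],
    "reasoning": {"enabled": True},  # False -> non-thinking, real-time mode
}

req = urllib.request.Request(
    "https://openrouter.ai/api/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer <OPENROUTER_API_KEY>",  # placeholder key
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(req)  # uncomment with a real key
```

Setting `"enabled": False` instead requests the low-latency non-thinking mode described above.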
## Performance Summary
Z.AI's GLM-4.5-Air, a lightweight MoE model designed for agent-centric applications, demonstrates exceptional speed, consistently ranking among the fastest models across nine benchmarks. Its pricing is moderate, placing it in the 24th percentile. Reliability is a significant strength, with a 98% success rate indicating minimal technical failures. The model exhibits strong performance in several key areas. It achieves high accuracy in Reasoning (95.6%, 86th percentile), General Knowledge (99.5%, 75th percentile), and Mathematics (92.9%, 74th percentile), showcasing robust analytical and factual recall capabilities. Instruction Following also performs well at 68.7% accuracy (79th percentile). While its Hallucinations accuracy is 90.0%, placing it in the 35th percentile, it generally acknowledges uncertainty appropriately. A notable weakness is its Email Classification accuracy (80.0%, 9th percentile), suggesting room for improvement in nuanced categorization tasks. The model's "thinking mode" for advanced reasoning and tool use, alongside a "non-thinking mode" for real-time interaction, offers flexible deployment options.
## Model Pricing

### Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.20 |
| Completion | $1.10 |
| Input Cache Read | $0.03 |
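At these base rates, per-request cost is simple arithmetic; the sketch below (the `request_cost` helper is illustrative, not part of any API) bills cached prompt tokens at the cache-read rate and the rest at the standard prompt rate:

```python
# Base GLM-4.5-Air rates from the table above, in USD per 1M tokens.
PROMPT_PER_M = 0.20
COMPLETION_PER_M = 1.10
CACHE_READ_PER_M = 0.03

def request_cost(prompt_tokens: int, completion_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Return the USD cost of one request; cached prompt tokens are
    billed at the cheaper cache-read rate."""
    billable_prompt = prompt_tokens - cached_tokens
    return (billable_prompt * PROMPT_PER_M
            + completion_tokens * COMPLETION_PER_M
            + cached_tokens * CACHE_READ_PER_M) / 1_000_000

# e.g. 10K prompt tokens (8K served from cache) + 2K completion tokens
print(f"${request_cost(10_000, 2_000, cached_tokens=8_000):.6f}")
```

For the example shown, only 2K prompt tokens are billed at the full rate, which is where cache reads pay off on long, repeated prompts.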
## Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Z.AI | z-ai/glm-4.5-air | 131K | $0.20 / 1M tokens | $1.10 / 1M tokens |
| DeepInfra | z-ai/glm-4.5-air | 131K | $0.05 / 1M tokens | $0.22 / 1M tokens |
| GMICloud | z-ai/glm-4.5-air | 131K | $0.05 / 1M tokens | $0.22 / 1M tokens |
| SiliconFlow | z-ai/glm-4.5-air | 131K | $0.14 / 1M tokens | $0.86 / 1M tokens |
| AtlasCloud | z-ai/glm-4.5-air | 32K | $0.05 / 1M tokens | $0.22 / 1M tokens |
| Nebius | z-ai/glm-4.5-air | 131K | $0.20 / 1M tokens | $1.20 / 1M tokens |
| Novita | z-ai/glm-4.5-air | 131K | $0.13 / 1M tokens | $0.85 / 1M tokens |
| Chutes | z-ai/glm-4.5-air | 131K | $0.05 / 1M tokens | $0.22 / 1M tokens |
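Since the endpoints differ in both price and context length, picking a host can be sketched as a small filter-and-rank step. The `cheapest` helper and the 80/20 input/output token mix below are illustrative assumptions, not an OpenRouter API:

```python
# Endpoint data from the table above: (input $/M, output $/M, context tokens).
endpoints = {
    "Z.AI": (0.20, 1.10, 131_000),
    "DeepInfra": (0.05, 0.22, 131_000),
    "GMICloud": (0.05, 0.22, 131_000),
    "SiliconFlow": (0.14, 0.86, 131_000),
    "AtlasCloud": (0.05, 0.22, 32_000),
    "Nebius": (0.20, 1.20, 131_000),
    "Novita": (0.13, 0.85, 131_000),
    "Chutes": (0.05, 0.22, 131_000),
}

def cheapest(min_context: int = 131_000, input_share: float = 0.8) -> str:
    """Blend input/output $/M by expected token share, after dropping
    endpoints whose context window is too small for the workload."""
    viable = {
        name: inp * input_share + out * (1 - input_share)
        for name, (inp, out, ctx) in endpoints.items()
        if ctx >= min_context
    }
    return min(viable, key=viable.get)

print(cheapest())  # excludes AtlasCloud's 32K endpoint from consideration
```

With these numbers, several providers tie at the lowest blended rate, so a real router would break ties on latency or uptime rather than price alone.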
## Other Models by z-ai
| Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
|---|---|---|---|---|---|---|---|
| Z.AI: GLM 4.7 | Dec 21, 2025 | — | 200K | Text input, Text output | ★ | ★★★★★ | $$$$$ |
| Z.AI: GLM 4.6V | Dec 08, 2025 | — | 131K | Image input, Video input, Text input, Text output | ★★ | ★★★★★ | $$$$ |
| Z.AI: GLM 4.6 | Sep 30, 2025 | — | 200K | Text input, Text output | ★ | ★★★ | $$$$$ |
| Z.AI: GLM 4.6 (exacto) | Sep 30, 2025 | — | 202K | Text input, Text output | — | — | $$$$ |
| Z.AI: GLM 4.5V | Aug 11, 2025 | ~106B | 65K | Image input, Text input, Text output | ★★ | ★★★ | $$$$$ |
| Z.AI: GLM 4.5 | Jul 25, 2025 | — | 131K | Text input, Text output | ★ | ★★★★ | $$$$$ |
| Z.AI: GLM 4 32B | Jul 24, 2025 | 32B | 128K | Text input, Text output | ★★★ | ★ | $$ |