Author's Description
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
Z.AI: GLM 4.5 Air, a lightweight variant of the GLM-4.5 flagship model, demonstrates exceptional speed, consistently ranking among the fastest models with an Infinityth percentile across seven benchmarks. Its pricing is moderate, positioned at the 26th percentile across six benchmarks, offering competitive value. The model exhibits outstanding reliability, achieving a 97% success rate, indicating minimal technical failures and consistent evaluable responses. Performance across benchmarks reveals a mixed but generally strong profile. It excels in General Knowledge and Ethics, achieving 99.5% and 99.0% accuracy respectively, placing it in the 79th and 57th percentiles. Instruction Following shows a solid 68.7% accuracy (79th percentile), though an initial 0.0% accuracy result suggests potential for variability or specific test case issues. Reasoning and Coding benchmarks are respectable at 69.4% (64th percentile) and 84.0% (62nd percentile) accuracy. A notable weakness appears in Email Classification, where its 80.0% accuracy places it in the 9th percentile, indicating an area for improvement. Despite some longer durations on specific benchmarks, its overall speed ranking remains a significant advantage, particularly for agent-centric applications.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.2 |
Completion | $1.1 |
Input Cache Read | $0.03 |
Price History
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
Z.AI
|
Z.AI | z-ai/glm-4.5-air | 131K | $0.2 / 1M tokens | $1.1 / 1M tokens |
DeepInfra
|
DeepInfra | z-ai/glm-4.5-air | 131K | $0.2 / 1M tokens | $1.1 / 1M tokens |
GMICloud
|
GMICloud | z-ai/glm-4.5-air | 131K | $0.2 / 1M tokens | $1.1 / 1M tokens |
Benchmark Results
Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
---|
Other Models by z-ai
|
Released | Params | Context |
|
Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
Z.AI: GLM 4.5V | Aug 11, 2025 | ~106B | 65K |
Text input
Image input
Text output
|
★★ | ★★★ | $$$$$ |
Z.AI: GLM 4.5 | Jul 25, 2025 | — | 131K |
Text input
Text output
|
★ | ★★★★★ | $$$$$ |
Z.AI: GLM 4 32B | Jul 24, 2025 | 32B | 128K |
Text input
Text output
|
★★★ | ★★ | $$ |