Z.AI: GLM 4.5 Air

Text input · Text output · Free option available
Author's Description

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the `reasoning.enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
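
As a sketch of the toggle described above, a request body for OpenRouter's chat completions endpoint might set the `reasoning.enabled` boolean like this (field names follow the docs linked above; the prompt and helper function are illustrative assumptions):

```python
import json

def build_request(prompt: str, thinking: bool) -> str:
    """Build a JSON request body toggling GLM-4.5-Air's thinking mode.

    The ``reasoning`` object with an ``enabled`` boolean mirrors the
    control described in the model description; everything else here
    is an illustrative assumption.
    """
    payload = {
        "model": "z-ai/glm-4.5-air",
        "messages": [{"role": "user", "content": prompt}],
        # True -> "thinking mode" (advanced reasoning and tool use)
        # False -> "non-thinking mode" (real-time interaction)
        "reasoning": {"enabled": thinking},
    }
    return json.dumps(payload)

body = build_request("Summarize this ticket.", thinking=True)
```

The same body would then be POSTed to the chat completions endpoint with your API key; only the `reasoning` object differs between the two modes.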

Key Specifications

- Context: 131K
- Released: Jul 25, 2025
Supported Parameters

This model supports the following parameters:

Include Reasoning, Tool Choice, Top P, Temperature, Tools, Reasoning, Max Tokens
Features

This model supports the following features:

Tools, Reasoning
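
Since the model supports tools alongside the sampling parameters listed above, a request combining them can be sketched as follows. The tool schema below is the OpenAI-style function-calling format that OpenRouter accepts; the tool name, its parameters, and the sampling values are illustrative assumptions:

```python
import json

# Illustrative tool definition in the OpenAI-style function-calling
# schema; "get_weather" and its parameters are made-up examples.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Request payload exercising the supported parameters listed above.
payload = {
    "model": "z-ai/glm-4.5-air",
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 512,
}
body = json.dumps(payload)
```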
Performance Summary

Z.AI: GLM 4.5 Air, a lightweight variant of the GLM-4.5 flagship model, demonstrates exceptional speed, consistently ranking at the top of the speed rankings across seven benchmarks. Its pricing is moderate, positioned at the 26th percentile across six benchmarks, offering competitive value. The model exhibits outstanding reliability, achieving a 97% success rate, indicating minimal technical failures and consistent evaluable responses. Performance across benchmarks reveals a mixed but generally strong profile. It excels in General Knowledge and Ethics, achieving 99.5% and 99.0% accuracy respectively, placing it in the 79th and 57th percentiles. Instruction Following shows a solid 68.7% accuracy (79th percentile), though an initial 0.0% accuracy result suggests potential variability or a test-case issue. Reasoning and Coding benchmarks are respectable at 69.4% (64th percentile) and 84.0% (62nd percentile) accuracy. A notable weakness appears in Email Classification, where its 80.0% accuracy places it in the 9th percentile, indicating an area for improvement. Despite some longer durations on specific benchmarks, its overall speed ranking remains a significant advantage, particularly for agent-centric applications.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
| --- | --- |
| Prompt | $0.2 |
| Completion | $1.1 |
| Input Cache Read | $0.03 |
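
At these rates, the cost of a single request is straightforward to estimate. A minimal sketch, assuming cached input tokens are billed at the cache-read rate instead of the full prompt rate (the token counts in the example are made up):

```python
# Listed per-million-token rates for GLM 4.5 Air.
PROMPT_RATE = 0.20      # $ per 1M prompt tokens
COMPLETION_RATE = 1.10  # $ per 1M completion tokens
CACHE_READ_RATE = 0.03  # $ per 1M cached input tokens read

def request_cost(prompt_tokens: int, completion_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the dollar cost of one request at the listed rates."""
    fresh = prompt_tokens - cached_tokens
    return (fresh * PROMPT_RATE
            + cached_tokens * CACHE_READ_RATE
            + completion_tokens * COMPLETION_RATE) / 1_000_000

# 10,000 prompt tokens + 2,000 completion tokens:
# 0.002 + 0.0022 = $0.0042
cost = request_cost(10_000, 2_000)
```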

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
| --- | --- | --- | --- | --- |
| Z.AI | z-ai/glm-4.5-air | 131K | $0.2 / 1M tokens | $1.1 / 1M tokens |
| DeepInfra | z-ai/glm-4.5-air | 131K | $0.2 / 1M tokens | $1.1 / 1M tokens |
| GMICloud | z-ai/glm-4.5-air | 131K | $0.2 / 1M tokens | $1.1 / 1M tokens |