Z.AI: GLM 4.5

Text input → Text output
Author's Description

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It uses a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly enhanced capabilities in reasoning, code generation, and agent alignment. It supports a hybrid inference mode with two options: a "thinking mode" designed for complex reasoning and tool use, and a "non-thinking mode" optimized for instant responses. Users can control the reasoning behaviour with the `enabled` boolean of the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
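
The hybrid mode is exposed through the request payload rather than through separate model ids. The sketch below shows one way to flip between the two modes from Python; it assumes the standard OpenRouter chat-completions endpoint and an `OPENROUTER_API_KEY` environment variable, so treat the exact payload shape as an assumption and check the docs linked above.

```python
# Minimal sketch: toggling GLM-4.5's thinking mode via the OpenRouter API.
# Assumes the /chat/completions endpoint and an OPENROUTER_API_KEY env var.
import os
import requests

def ask_glm(prompt: str, thinking: bool = True) -> dict:
    """Send one prompt to z-ai/glm-4.5, enabling or disabling reasoning."""
    response = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "z-ai/glm-4.5",
            "messages": [{"role": "user", "content": prompt}],
            # "thinking mode" on/off via the reasoning.enabled boolean
            "reasoning": {"enabled": thinking},
        },
        timeout=120,
    )
    response.raise_for_status()
    return response.json()

# Complex task: leave thinking mode on.
deep = ask_glm("Plan a multi-step refactor of a legacy payments module.")
# Latency-sensitive task: switch to the non-thinking mode for instant responses.
quick = ask_glm("Reply with a one-line greeting.", thinking=False)
```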

Key Specifications
Cost: $$$$$
Context: 131K
Released: Jul 25, 2025
Supported Parameters

This model supports the following parameters:

Include Reasoning, Tool Choice, Top P, Temperature, Tools, Response Format, Reasoning, Max Tokens
Features

This model supports the following features:

Tools, Reasoning, Response Format
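
To make the parameter and feature lists above concrete, here is an illustrative request body using the OpenAI-compatible field names that OpenRouter accepts; the `get_weather` tool is hypothetical, and the exact field shapes should be checked against the provider documentation rather than taken from this sketch.

```python
# Illustrative request body only: the listed parameters and features expressed
# in OpenAI-compatible field names. The get_weather tool is hypothetical.
payload = {
    "model": "z-ai/glm-4.5",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "temperature": 0.6,              # Temperature
    "top_p": 0.95,                   # Top P
    "max_tokens": 1024,              # Max Tokens
    "reasoning": {"enabled": True},  # Reasoning ("thinking mode")
    "tools": [                       # Tools
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",           # Tool Choice
    # For structured output instead of tool calling, set Response Format:
    # "response_format": {"type": "json_object"},
}
```
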
Performance Summary

Z.AI's GLM 4.5, a flagship foundation model designed for agent-based applications, demonstrates exceptional reliability, ranking in the 100th percentile for delivering usable responses with minimal technical failures. That reliability comes with trade-offs: response times are long, placing the model in the 8th percentile for speed across benchmarks, and pricing is premium, in the 20th percentile for cost competitiveness.

On benchmark accuracy, GLM 4.5 excels in General Knowledge with a perfect 100%, making it the most accurate model at its price point and among models of comparable speed. It also performs strongly in Instruction Following (72.7%, 89th percentile) and Reasoning (82.0%, 80th percentile), consistent with its design focus on complex reasoning and tool use. Its Ethics accuracy of 99.0% is only mid-pack relative to peers (50th percentile), and its Email Classification accuracy is lower at 93.0% (25th percentile).

The "thinking mode" used for complex reasoning likely contributes both to the high accuracy in these areas and to the long durations observed across all tests, most notably in Reasoning (3,708,952 ms). In short, the model's strengths are robust reasoning and knowledge capabilities paired with outstanding reliability, achieved at the expense of speed and cost.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.60 |
| Completion | $2.20 |
| Input Cache Read | $0.11 |
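
For a rough sense of what these rates mean per request, the sketch below estimates the cost of a single call. It assumes cached prompt tokens simply replace the same number of full-price prompt tokens, which is a simplification of how cache reads may actually be billed.

```python
# Back-of-the-envelope cost estimate for one GLM-4.5 request at the listed rates.
PROMPT_PER_M = 0.60      # $ per 1M prompt tokens
COMPLETION_PER_M = 2.20  # $ per 1M completion tokens
CACHE_READ_PER_M = 0.11  # $ per 1M cached prompt tokens read

def request_cost(prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate USD cost of one request; cached tokens are assumed to replace
    the same number of full-price prompt tokens (a simplification)."""
    fresh_prompt = max(prompt_tokens - cached_tokens, 0)
    return (
        fresh_prompt * PROMPT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + completion_tokens * COMPLETION_PER_M
    ) / 1_000_000

# e.g. a 20k-token prompt (half of it cached) with a 2k-token answer:
print(f"${request_cost(20_000, 2_000, cached_tokens=10_000):.4f}")  # ≈ $0.0115
```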

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Z.AI | Z.AI \| z-ai/glm-4.5 | 131K | $0.60 / 1M tokens | $2.20 / 1M tokens |
| Chutes | Chutes \| z-ai/glm-4.5 | 131K | $0.20 / 1M tokens | $0.80 / 1M tokens |
| DeepInfra | DeepInfra \| z-ai/glm-4.5 | 131K | $0.55 / 1M tokens | $2.00 / 1M tokens |
| Novita | Novita \| z-ai/glm-4.5 | 131K | $0.60 / 1M tokens | $2.20 / 1M tokens |
| Parasail | Parasail \| z-ai/glm-4.5 | 131K | $0.59 / 1M tokens | $2.10 / 1M tokens |
| GMICloud | GMICloud \| z-ai/glm-4.5 | 131K | $0.60 / 1M tokens | $2.20 / 1M tokens |
| AtlasCloud | AtlasCloud \| z-ai/glm-4.5 | 131K | $0.20 / 1M tokens | $0.80 / 1M tokens |
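
Because every endpoint serves the same model id at different rates, a quick comparison over an expected traffic mix shows how much the choice of provider matters. The rates below are copied from the table above; the workload figures are invented purely for illustration.

```python
# Compare listed endpoint rates for a fixed workload (prices in $ per 1M tokens).
ENDPOINTS = {
    "Z.AI":       (0.60, 2.20),
    "Chutes":     (0.20, 0.80),
    "DeepInfra":  (0.55, 2.00),
    "Novita":     (0.60, 2.20),
    "Parasail":   (0.59, 2.10),
    "GMICloud":   (0.60, 2.20),
    "AtlasCloud": (0.20, 0.80),
}

def workload_cost(input_mtok: float, output_mtok: float) -> list[tuple[str, float]]:
    """Rank providers by cost for a workload measured in millions of tokens."""
    costs = [
        (name, in_rate * input_mtok + out_rate * output_mtok)
        for name, (in_rate, out_rate) in ENDPOINTS.items()
    ]
    return sorted(costs, key=lambda item: item[1])

# e.g. 50M input tokens and 10M output tokens per month:
for provider, cost in workload_cost(50, 10):
    print(f"{provider:<10} ${cost:,.2f}")
# Chutes/AtlasCloud come out cheapest at $18.00; Z.AI/Novita/GMICloud at $52.00.
```
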
Benchmark Results
| Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|
Other Models by z-ai