Z.ai: GLM 4.5

Text input Text output
Author's Description

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly...

Key Specifications
Cost
$$$$$
Context
131K
Released
Jul 25, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Top P Reasoning Include Reasoning Response Format Temperature Tools Tool Choice Max Tokens
Features

This model supports the following features:

Tools Reasoning Response Format
Performance Summary

Z.ai's GLM-4.5 model, released on July 25, 2025, demonstrates a strong focus on agent-based applications with its MoE architecture and substantial 131072 context length. While its speed ranking indicates longer response times, placing it in the 13th percentile, its pricing is moderate, falling within the 20th percentile. A standout feature is its exceptional reliability, boasting a 99% success rate across all benchmarks, signifying minimal technical failures. In terms of performance across categories, GLM-4.5 exhibits impressive capabilities in General Knowledge, achieving perfect 100% accuracy and being noted as the most accurate model at its price point and among models of comparable speed. It also performs strongly in Instruction Following (72.7% accuracy, 79th percentile), Coding (92.9% accuracy, 78th percentile), and Mathematics (94.0% accuracy, 80th percentile). Its Reasoning capabilities are also robust at 90.0% accuracy (73rd percentile). A slight weakness appears in Email Classification (93.0% accuracy, 20th percentile) and Hallucinations (94.0% accuracy, 48th percentile), where there's room for improvement compared to its other high-performing areas. The model's "thinking mode" for complex reasoning and tool use, alongside its "non-thinking mode" for instant responses, offers flexible inference options.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.6
Completion $2.2
Input Cache Read $0.11

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Z.AI
Z.AI | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Chutes
Chutes | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
DeepInfra
DeepInfra | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Novita
Novita | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Parasail
Parasail | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
GMICloud
GMICloud | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
AtlasCloud
AtlasCloud | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Mancer 2
Mancer 2 | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
SiliconFlow
SiliconFlow | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
WandB
WandB | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Mancer 2
Mancer 2 | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Nebius
Nebius | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Novita
Novita | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Chutes
Chutes | z-ai/glm-4.5 131K $0.6 / 1M tokens $2.2 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by z-ai