Z.ai: GLM 4.6

Text input Text output
Author's Description

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...

Key Specifications
Cost
$$$$$
Context
202K
Released
Sep 30, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Top P Reasoning Include Reasoning Temperature Tools Tool Choice Max Tokens
Features

This model supports the following features:

Tools Reasoning
Performance Summary

Z.ai: GLM 4.6, released on September 30, 2025, demonstrates a significant expansion in context length to 200K tokens, enabling more complex agentic tasks. While the model tends to have longer response times, ranking in the 12th percentile for speed, and is positioned at premium pricing levels (14th percentile), it exhibits exceptional reliability with a 97% success rate across benchmarks. GLM 4.6 showcases strong performance in several key areas. It achieves perfect accuracy in both General Knowledge and Email Classification, with the latter also being the most accurate and fastest among models at its price point. Its coding performance is a notable strength, scoring 95.0% accuracy (92nd percentile) and demonstrating real-world improvements in various coding applications. Mathematics also stands out with 96.0% accuracy (93rd percentile). Reasoning performance is robust at 93.9% accuracy (78th percentile), supporting its claim of advanced reasoning and tool use. However, the model shows weaknesses in Instruction Following, with a low 3.1% accuracy (18th percentile), indicating challenges with complex multi-step directives. Its Ethics performance is also relatively low at 82.0% accuracy (18th percentile). Hallucinations are well-managed at 96.0% accuracy, suggesting a good ability to acknowledge uncertainty. Overall, GLM 4.6 excels in knowledge-based and coding tasks, but users should be mindful of its instruction following and ethical reasoning capabilities, as well as its slower response times and premium cost.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.39
Completion $1.9

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Z.AI
Z.AI | z-ai/glm-4.6 200K $0.6 / 1M tokens $2.2 / 1M tokens
Parasail
Parasail | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
DeepInfra
DeepInfra | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
Chutes
Chutes | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
GMICloud
GMICloud | z-ai/glm-4.6 204K $0.39 / 1M tokens $1.9 / 1M tokens
Novita
Novita | z-ai/glm-4.6 204K $0.39 / 1M tokens $1.9 / 1M tokens
SiliconFlow
SiliconFlow | z-ai/glm-4.6 204K $0.39 / 1M tokens $1.9 / 1M tokens
AtlasCloud
AtlasCloud | z-ai/glm-4.6 202K $0.6 / 1M tokens $2.2 / 1M tokens
Mancer 2
Mancer 2 | z-ai/glm-4.6 131K $0.39 / 1M tokens $1.9 / 1M tokens
Novita
Novita | z-ai/glm-4.6 204K $0.55 / 1M tokens $2.2 / 1M tokens
BaseTen
BaseTen | z-ai/glm-4.6 200K $0.39 / 1M tokens $1.9 / 1M tokens
Fireworks
Fireworks | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
Chutes
Chutes | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
DeepInfra
DeepInfra | z-ai/glm-4.6 202K $0.43 / 1M tokens $1.74 / 1M tokens
Friendli
Friendli | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
Cerebras
Cerebras | z-ai/glm-4.6 131K $0.39 / 1M tokens $1.9 / 1M tokens
Together
Together | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
Avian
Avian | z-ai/glm-4.6 204K $0.39 / 1M tokens $1.9 / 1M tokens
Avian
Avian | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
Ambient
Ambient | z-ai/glm-4.6 202K $0.39 / 1M tokens $1.9 / 1M tokens
Venice
Venice | z-ai/glm-4.6 198K $0.85 / 1M tokens $2.75 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by z-ai