Z.ai: GLM 4.7 Flash

Text input Text output
Author's Description

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.

Key Specifications
Cost
$$$$
Context
202K
Parameters
30B (Rumoured)
Released
Jan 19, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Temperature Max Tokens Top P Tool Choice Response Format Reasoning Include Reasoning
Features

This model supports the following features:

Tools Response Format Reasoning
Performance Summary

Z.AI's GLM-4.7-Flash, a 30B-class model released on January 19, 2026, is positioned as a balanced option for performance and efficiency, particularly optimized for agentic coding. While it demonstrates exceptional reliability with a 97% success rate, indicating consistent operational stability, its speed performance tends to be slower, ranking in the 2nd percentile across benchmarks. Pricing is moderate, falling within the 39th percentile. In terms of specific benchmark results, GLM-4.7-Flash achieved perfect 100.0% accuracy in Email Classification, notably being the most accurate model at its price point and among models of comparable speed. This highlights a significant strength in classification tasks. Its performance on the Ethics benchmark was solid at 98.0% accuracy. However, a notable weakness is its 67.3% accuracy on the Hallucinations (Baseline) test, placing it in the 17th percentile, suggesting room for improvement in acknowledging uncertainty for fictional concepts. The model's duration metrics across all benchmarks indicate longer response times, reinforcing its lower speed ranking.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.06
Completion $0.4
Input Cache Read $0.01

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Z.AI
Z.AI | z-ai/glm-4.7-flash-20260119 200K $0.07 / 1M tokens $0.4 / 1M tokens
Novita
Novita | z-ai/glm-4.7-flash-20260119 200K $0.07 / 1M tokens $0.4 / 1M tokens
Phala
Phala | z-ai/glm-4.7-flash-20260119 202K $0.1 / 1M tokens $0.43 / 1M tokens
DeepInfra
DeepInfra | z-ai/glm-4.7-flash-20260119 202K $0.06 / 1M tokens $0.4 / 1M tokens
Venice
Venice | z-ai/glm-4.7-flash-20260119 128K $0.125 / 1M tokens $0.5 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by z-ai