Z.ai: GLM 4.7 Flash

Text input Text output
Author's Description

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

Key Specifications
Cost
$$$$
Context
202K
Parameters
30B (Rumoured)
Released
Jan 19, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Tool Choice Temperature Include Reasoning Reasoning Max Tokens Response Format Top P
Features

This model supports the following features:

Reasoning Response Format Tools
Performance Summary

Z.ai's GLM-4.7-Flash, a 30B-class model created on January 19, 2026, is designed to balance performance and efficiency, particularly for agentic coding use cases. With a substantial context length of 200,000, it demonstrates strong reliability, achieving a 93% success rate across benchmarks. However, the model exhibits a significant weakness in speed, consistently showing longer response times, ranking in the 1st percentile. Its pricing is moderate, falling within the 38th percentile. In terms of performance across categories, GLM-4.7-Flash excels in Email Classification, achieving perfect 100% accuracy, making it the most accurate model at its price point and speed. It also shows strong capabilities in Coding (90.7% accuracy) and Ethics (98.0% accuracy). While its General Knowledge (96.5%) and Reasoning (78.0%) are respectable, its performance in Hallucinations (67.3%) and Mathematics (74.7%) is less competitive, ranking in the lower percentiles. The model's instruction following is average at 61.5%. Overall, GLM-4.7-Flash is a reliable model with notable strengths in classification and coding, but its slow processing speed and moderate hallucination and math scores are areas for potential improvement.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.43

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Z.AI
Z.AI | z-ai/glm-4.7-flash-20260119 200K $0.07 / 1M tokens $0.4 / 1M tokens
Novita
Novita | z-ai/glm-4.7-flash-20260119 200K $0.07 / 1M tokens $0.4 / 1M tokens
Phala
Phala | z-ai/glm-4.7-flash-20260119 202K $0.1 / 1M tokens $0.43 / 1M tokens
DeepInfra
DeepInfra | z-ai/glm-4.7-flash-20260119 202K $0.06 / 1M tokens $0.4 / 1M tokens
Venice
Venice | z-ai/glm-4.7-flash-20260119 128K $0.125 / 1M tokens $0.5 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by z-ai