THUDM: GLM Z1 32B

Text input Text output
Author's Description

GLM-Z1-32B-0414 is an enhanced reasoning variant of GLM-4-32B, built for deep mathematical, logical, and code-oriented problem solving. It applies extended reinforcement learning—both task-specific and general pairwise preference-based—to improve performance on complex multi-step tasks. Compared to the base GLM-4-32B model, Z1 significantly boosts capabilities in structured reasoning and formal domains. The model supports enforced “thinking” steps via prompt engineering and offers improved coherence for long-form outputs. It’s optimized for use in agentic workflows, and includes support for long context (via YaRN), JSON tool calling, and fine-grained sampling configuration for stable inference. Ideal for use cases requiring deliberate, multi-step reasoning or formal derivations.

Key Specifications
Cost
$$$
Context
32K
Parameters
32B
Released
Apr 17, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Include Reasoning Stop Presence Penalty Logit Bias Top P Temperature Seed Min P Reasoning Frequency Penalty Max Tokens
Features

This model supports the following features:

Reasoning
Performance Summary

THUDM: GLM Z1 32B, an enhanced reasoning variant of GLM-4-32B, demonstrates exceptional reliability, consistently providing evaluable responses with a 96th percentile ranking. While its speed tends to be slower, ranking in the 4th percentile across benchmarks, it offers competitive pricing, positioned at the 52nd percentile. The model excels in complex reasoning tasks, achieving 98.0% accuracy in the Reasoning benchmark, placing it in the 96th percentile and making it the most accurate model at its price point. It also achieved perfect 100.0% accuracy in Email Classification, proving to be the most accurate and fastest among models at its price point for this task. General Knowledge performance is strong at 98.0% accuracy. However, its performance in Ethics (93.0% accuracy, 27th percentile) and Instruction Following (51.0% accuracy, 54th percentile) is more moderate, and Coding accuracy is 76.0%. Its core strength lies in its ability to handle multi-step, formal derivations, making it ideal for agentic workflows requiring deliberate thought processes, despite its longer response times.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.02
Completion $0.08

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Novita
Novita | thudm/glm-z1-32b-0414 32K $0.02 / 1M tokens $0.08 / 1M tokens
Chutes
Chutes | thudm/glm-z1-32b-0414 32K $0.02 / 1M tokens $0.08 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by thudm