THUDM: GLM Z1 32B

Text input Text output
Author's Description

GLM-Z1-32B-0414 is an enhanced reasoning variant of GLM-4-32B, built for deep mathematical, logical, and code-oriented problem solving. It applies extended reinforcement learning—both task-specific and general pairwise preference-based—to improve performance on complex multi-step tasks. Compared to the base GLM-4-32B model, Z1 significantly boosts capabilities in structured reasoning and formal domains. The model supports enforced “thinking” steps via prompt engineering and offers improved coherence for long-form outputs. It’s optimized for use in agentic workflows, and includes support for long context (via YaRN), JSON tool calling, and fine-grained sampling configuration for stable inference. Ideal for use cases requiring deliberate, multi-step reasoning or formal derivations.

Key Specifications
Cost
$$$
Context
32K
Parameters
32B
Released
Apr 17, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Logit Bias Reasoning Include Reasoning Stop Seed Min P Top P Max Tokens Frequency Penalty Temperature Presence Penalty
Features

This model supports the following features:

Reasoning
Performance Summary

The THUDM: GLM Z1 32B model, an enhanced reasoning variant of GLM-4-32B, demonstrates exceptional reliability with a 96% success rate, consistently providing usable responses. While offering competitive pricing (55th percentile), its speed performance tends to be slower, ranking in the 4th percentile across benchmarks. The model excels in structured reasoning and formal domains, achieving perfect accuracy in both Email Classification and Reasoning benchmarks. It also shows strong performance in General Knowledge (98% accuracy). However, its Instruction Following (51% accuracy) and Ethics (93% accuracy, 26th percentile) capabilities are more moderate, and its Coding performance (76% accuracy, 37th percentile) is a relative weakness. The model's strength lies in its ability to handle complex, multi-step tasks, supported by its extended reinforcement learning and long context capabilities. Its slower response times are a trade-off for its enhanced reasoning and reliability, making it ideal for agentic workflows requiring deliberate, multi-step analysis rather than rapid-fire responses.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.04
Completion $0.14

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Novita
Novita | thudm/glm-z1-32b-0414 32K $0.04 / 1M tokens $0.14 / 1M tokens
Chutes
Chutes | thudm/glm-z1-32b-0414 32K $0.04 / 1M tokens $0.14 / 1M tokens
Chutes
Chutes | thudm/glm-z1-32b-0414 32K $0.04 / 1M tokens $0.14 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by thudm