THUDM: GLM 4.1V 9B Thinking

Text input Image input Text output
Author's Description

GLM-4.1V-9B-Thinking is a 9B parameter vision-language model developed by THUDM, based on the GLM-4-9B foundation. It introduces a reasoning-centric "thinking paradigm" enhanced with reinforcement learning to improve multimodal reasoning, long-context understanding (up to 64K tokens), and complex problem solving. It achieves state-of-the-art performance among models in its class, outperforming even larger models like Qwen-2.5-VL-72B on a majority of benchmark tasks.

Key Specifications
Cost
$$$
Context
65K
Parameters
9B
Released
Jul 11, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Logit Bias Reasoning Include Reasoning Stop Seed Min P Top P Max Tokens Frequency Penalty Temperature Presence Penalty
Features

This model supports the following features:

Reasoning
Performance Summary

The THUDM: GLM 4.1V 9B Thinking model, created on July 11, 2025, demonstrates exceptional speed and cost-efficiency, consistently ranking among the fastest and most competitively priced models across seven benchmarks. Its reliability is strong, achieving an 89% success rate, indicating consistent operational stability. While the model's "thinking paradigm" aims for advanced reasoning, its performance across specific benchmarks presents a mixed picture. It exhibits a critical weakness in Instruction Following, scoring 0.0% accuracy, which is a significant concern for tasks requiring precise directive adherence. However, it shows strong capabilities in Email Classification and Ethics, both achieving 94.0% accuracy, suggesting proficiency in nuanced categorization and moral reasoning. Its Reasoning benchmark score of 72.0% is respectable, placing it in the 60th percentile. Performance in Coding, General Knowledge, and Mathematics is moderate, with accuracies ranging from 70.0% to 74.0%, but these scores often fall below the 35th percentile, indicating room for improvement compared to peers. The model's strength lies in its efficiency and its ability to handle specific classification and ethical tasks effectively, while its primary weakness is its inability to follow instructions.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.035
Completion $0.138

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Novita
Novita | thudm/glm-4.1v-9b-thinking 65K $0.035 / 1M tokens $0.138 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by thudm