THUDM: GLM 4.1V 9B Thinking

Name: THUDM: GLM 4.1V 9B Thinking
Brand: thudm
Availability: OutOfStock
Rating: 2.1 (7 reviews)

Back

Text input Image input Text output Unavailable

Author's Description

GLM-4.1V-9B-Thinking is a 9B parameter vision-language model developed by THUDM, based on the GLM-4-9B foundation. It introduces a reasoning-centric "thinking paradigm" enhanced with reinforcement learning to improve multimodal reasoning, long-context understanding (up to 64K tokens), and complex problem solving. It achieves state-of-the-art performance among models in its class, outperforming even larger models like Qwen-2.5-VL-72B on a majority of benchmark tasks.

Key Specifications

Cost

$$$

Context

65K

Parameters

Released

Jul 11, 2025

Speed

★

Ability

★★

Reliability

★★

Hugging Face

Supported Parameters

This model supports the following parameters:

Seed Temperature Max Tokens Top P Presence Penalty Frequency Penalty Reasoning Include Reasoning Stop

Features

This model supports the following features:

Reasoning

Performance Summary

The THUDM: GLM 4.1V 9B Thinking model, created on July 11, 2025, demonstrates exceptional speed and cost-efficiency, consistently ranking among the fastest and most competitively priced models across seven benchmarks. Its reliability is strong, achieving an 89% success rate, indicating consistent operational stability. While the model's "thinking paradigm" aims for advanced reasoning, its performance across specific benchmarks presents a mixed picture. It exhibits a critical weakness in Instruction Following, scoring 0.0% accuracy, which is a significant concern for tasks requiring precise directive adherence. However, it shows strong capabilities in Email Classification and Ethics, both achieving 94.0% accuracy, suggesting proficiency in nuanced categorization and moral reasoning. Its Reasoning benchmark score of 72.0% is respectable, placing it in the 60th percentile. Performance in Coding, General Knowledge, and Mathematics is moderate, with accuracies ranging from 70.0% to 74.0%, but these scores often fall below the 35th percentile, indicating room for improvement compared to peers. The model's strength lies in its efficiency and its ability to handle specific classification and ethical tasks effectively, while its primary weakness is its inability to follow instructions.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.035
Completion	$0.138

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
Novita	Novita \| thudm/glm-4.1v-9b-thinking	65K	$0.035 / 1M tokens	$0.138 / 1M tokens
Novita	Novita \| thudm/glm-4.1v-9b-thinking	65K	$0.035 / 1M tokens	$0.138 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by thudm

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
THUDM: GLM Z1 Rumination 32B Unavailable	Apr 25, 2025	32B	32K	Text input Text output	★	★★★★	$$$$
THUDM: GLM Z1 32B Unavailable	Apr 17, 2025	32B	32K	Text input Text output	★	★★★★	$$$
THUDM: GLM 4 32B Unavailable	Apr 17, 2025	32B	32K	Text input Text output	★★	★	$$$