Z.ai: GLM 5V Turbo

Text input Image input Video input Text output
Author's Description

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding, and task execution, and works seamlessly with agents to complete the full loop of “perceive → plan → execute“.

Key Specifications
Cost
$$$$$
Context
202K
Released
Apr 01, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Top P Response Format Reasoning Temperature Include Reasoning Tool Choice Max Tokens
Features

This model supports the following features:

Response Format Tools Reasoning
Performance Summary

Z.ai's GLM-5V-Turbo, a native multimodal agent foundation model designed for vision-based coding and agent-driven tasks, demonstrates moderate speed performance, ranking in the 34th percentile across benchmarks. Its pricing tends to be premium, positioned in the 10th percentile. However, the model exhibits exceptional reliability with a 98% success rate, indicating minimal technical failures. GLM-5V-Turbo excels in critical areas, achieving perfect accuracy in Hallucinations, General Knowledge, and Ethics benchmarks, often being the most accurate model at its price point and speed. It also shows strong performance in Mathematics (85th percentile accuracy) and Reasoning (75th percentile accuracy), highlighting its robust analytical capabilities. Instruction Following and Email Classification also show solid results, with 79th and 73rd percentile accuracy respectively. A notable area for improvement is Coding, where it achieves 45th percentile accuracy, suggesting it performs adequately but not exceptionally compared to peers. Its multimodal capabilities, handling image, video, and text inputs, combined with its long-horizon planning and task execution, position it as a powerful tool for complex agentic workflows.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $1.2
Completion $4
Input Cache Read $0.24

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Z.AI
Z.AI | z-ai/glm-5v-turbo-20260401 202K $1.2 / 1M tokens $4 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by z-ai