GLM 5V Turbo

Multimodal · by Z.ai · Model page

Z.ai's multimodal model accepting image, text, and video inputs with a 202K-token context window.

Max output

Most tokens the model can return in a single response.

131K tokens

Design Arena

Design Arena ranks models on real-world front-end and design tasks (websites, UI components, data viz, SVG, and more) through head-to-head human votes, scored as an Elo rating.
Category        Elo    Win rate  Rank
3D              1291   55.3%     #19
Game dev        1289   55.0%     #20
Code            1279   52.5%     #23
Website         1270   50.9%     #23
Android native  1267   54.8%     #3
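
As a rough illustration of how an Elo rating maps to head-to-head outcomes, the sketch below uses the standard Elo expected-score formula with a 400-point scale. This is an assumption: the page does not say which Elo variant or scale Design Arena uses, and the function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo model.

    Assumes the conventional 400-point logistic scale; Design Arena's
    actual scoring method is not stated on this page.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Ratings taken from the table above: 3D (1291) vs Android native (1267).
    # These are per-category ratings, so the comparison is only illustrative.
    print(round(elo_expected_score(1291, 1267), 3))
```

Under this model, equal ratings give an expected score of 0.5, and a small rating gap shifts the expectation only slightly above or below that.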

Pricing

per 1M tokens
Input: $1.2 / 1M
Output: $4 / 1M
Cache read: $0.24 / 1M
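
To make the per-token arithmetic concrete, here is a minimal sketch that estimates the cost of a single request from the listed prices. The function name `estimate_cost` and the assumption that cached tokens are a subset of input tokens billed at the cache-read rate are illustrative; the page does not describe the billing rules in detail.

```python
# USD per 1M tokens, copied from the pricing list above.
PRICE_PER_MILLION = {
    "input": 1.20,
    "output": 4.00,
    "cache_read": 0.24,
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache and billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input tokens and 2K output tokens, nothing cached.
    print(f"${estimate_cost(100_000, 2_000):.4f}")
    # Same request with half of the input read from cache.
    print(f"${estimate_cost(100_000, 2_000, cached_tokens=50_000):.4f}")
```

With these prices, output tokens cost more than three times as much as input tokens, so long responses dominate the bill faster than long prompts do.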

Model Description

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding,...

Author: Z.ai
Organization: z-ai

Details
Access: Closed source
Context: 203K tokens
Input price: $1.2 / 1M tokens
Output price: $4 / 1M tokens
Created: Apr 1, 2026