UI-TARS 7B

Multimodal · by ByteDance

ByteDance's 7B multimodal model for GUI understanding and automated UI interaction.

Model Card

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement...

Author: ByteDance
Organization: bytedance
Details
Access: Open Source
Context: 128K tokens
Input price: $0.10 / 1M tokens
Output price: $0.20 / 1M tokens
Knowledge cutoff: Jan 31, 2025
Created: Jul 22, 2025
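
The listed per-token prices can be turned into a rough per-request cost estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any ByteDance or provider API. It assumes that "128K" means 128,000 tokens and that input and output tokens share the same context window. Neither assumption is confirmed by the listing.

```python
# Prices copied from the Details listing (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.1
OUTPUT_PRICE_PER_M = 0.2

# Assumption: "128K tokens" is read as 128,000 tokens.
CONTEXT_LIMIT = 128_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: input and output tokens count against one shared window.
    if input_tokens + output_tokens > CONTEXT_LIMIT:
        raise ValueError("request exceeds the listed context window")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A long screenshot-heavy prompt with a short action response.
    print(f"${estimate_cost(100_000, 2_000):.4f}")
```

At these prices, a request with 100,000 input tokens and 2,000 output tokens costs about $0.0104, so input volume dominates the bill for typical GUI-agent workloads where prompts are long and responses are short.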
