
Qwen2.5 VL 72B Instruct

Multimodal · by Qwen

Qwen's 72B-parameter vision-language model with a 128K-token context for multimodal understanding and reasoning.


Model Card

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

Author: Qwen
Organization: Qwen
Details
Access: Open Source
Context: 131K tokens
Input price: $0.8 / 1M tokens
Output price: $1 / 1M tokens
Knowledge cutoff: Jun 30, 2024
Created: Feb 1, 2025
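
As a rough illustration of how the listed prices translate into a per-request cost, the following is a minimal sketch. It assumes the prices above are in USD per one million tokens and that input and output tokens are billed separately; the helper name estimate_cost and the constant names are hypothetical and not part of any Qwen or provider API.

```python
from decimal import Decimal

# Listed prices from the Details section, assumed to be USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.8")
OUTPUT_PRICE_PER_M = Decimal("1")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Bill input and output tokens at their separate per-million rates.
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(estimate_cost(100_000, 2_000))
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding error. Actual billing may differ, for example in how image inputs are counted as tokens.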
