Qwen3 VL 30B A3B Instruct

Multimodal · by Qwen · Model page

Qwen's 30B MoE vision-language model for image and text understanding with a 262k-token context.

Model Card

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception...

Author: Qwen
Organization: Qwen

Details
Downloads
Likes
Access: Open Source
Context: 262K tokens
Input price: $0.13 / 1M tokens
Output price: $0.52 / 1M tokens
Knowledge cutoff: Mar 31, 2025
Created: Oct 6, 2025
Updated
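
As a rough illustration of what the listed prices mean per request, the sketch below converts token counts into an estimated cost in USD. It assumes the prices are charged per 1M tokens exactly as listed and that nothing else is billed. The function name `estimate_cost` is hypothetical, and how images or video frames are counted as input tokens is not stated on this page, so the caller has to supply token counts directly.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.13")
OUTPUT_PRICE_PER_M = Decimal("0.52")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear per-token pricing with no minimums or discounts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(200_000, 2_000)}")
```

Decimal arithmetic is used here so that small per-request amounts are not distorted by binary floating-point rounding when many requests are summed.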