MiMo-V2.5

Multimodal · by Xiaomi · Model page

Xiaomi's MiMo V2.5 multimodal model with a 1M-token context for text, audio, image, and video inputs.

Model Card

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

Author
Xiaomi
Organization
xiaomi
Details
Access: Open Source
Context: 1M tokens
Input price: $0.14 / 1M tokens
Output price: $0.28 / 1M tokens
Created: Apr 22, 2026
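
As a rough illustration of what the listed prices mean per request, the arithmetic can be sketched as below. This is a minimal sketch, assuming the listed figures are USD per 1M tokens and are billed linearly, with no caching discounts or tiered rates; the function name `estimate_cost` and the token counts in the usage note are hypothetical and do not come from the listing.

```python
# Listed prices, taken from the Details block above.
# Assumption: these are USD per 1M tokens, billed linearly.
INPUT_PRICE_PER_M = 0.14   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.28  # USD per 1M output tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated request cost in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example request: 200,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost(200_000, 4_000):.5f}")
```

Under these assumptions, a request with 200,000 input tokens and 4,000 output tokens costs about $0.029, and a request that fills the full 1M-token context costs $0.14 in input tokens alone.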
