Xiaomi: MiMo-V2.5

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

Quality Score: 100/100 (composite of price, context, capability)
Input Price: $0.40 per 1M tokens
Output Price: $2.00 per 1M tokens
Context Window: 1,048,576 tokens
Model ID: xiaomi/mimo-v2.5
Vendor: xiaomi
Tokenizer: Other
Input Modalities: text, audio, image, video
Output Modalities: text
Max Output: 131,072 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: ✓ accepts audio
Moderated: no
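
The per-token prices and limits in the listing can be turned into a quick budget check. The following is a minimal sketch, not an official calculator: the function names `estimate_cost` and `fits_limits` are hypothetical, and it assumes that output tokens count against the context window and that all tokens are billed at the flat listed rates (the listing does not say how reasoning or multimodal tokens are metered).

```python
from decimal import Decimal

# Figures copied from the listing.
INPUT_PRICE_PER_M = Decimal("0.40")   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = Decimal("2.00")  # USD per 1M output tokens
CONTEXT_WINDOW = 1_048_576            # tokens
MAX_OUTPUT = 131_072                  # tokens
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request at the flat listed rates (hypothetical helper)."""
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_M


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed limits (hypothetical helper).

    Assumption: input and output tokens share the context window.
    """
    return (output_tokens <= MAX_OUTPUT
            and input_tokens + output_tokens <= CONTEXT_WINDOW)
```

For example, a request with 200,000 input tokens and 4,000 output tokens would cost 200,000 × $0.40 / 1M + 4,000 × $2.00 / 1M = $0.08 + $0.008 = $0.088 under these assumptions, and it fits comfortably within both limits.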
