Xiaomi · Paid
MiMo-V2.5
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...
xiaomi/mimo-v2.5
Capabilities
🧠 Reasoning · 👁️ Vision · 🎵 Audio · 🎬 Video · 🔧 Tools · 🧩 Structured · 📜 Long context
Specifications
Context window: 1.0M tokens
Input price: $0.10 per 1M tokens
Output price: $0.28 per 1M tokens
Provider: Xiaomi
Input modalities: text, audio, image, video
Output modalities: text
Pricing: Pay-per-token
Model ID: xiaomi/mimo-v2.5
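As a rough illustration of what pay-per-token pricing means in practice, the sketch below estimates the cost of a single request from the listed input and output prices. It assumes flat per-token rates with no tiering, caching discounts, or per-modality surcharges, none of which this listing specifies; the function name `estimate_cost_usd` is hypothetical and not part of any Xiaomi or provider API.

```python
from decimal import Decimal

# Prices copied from the Specifications above, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.10")
OUTPUT_PRICE_PER_M = Decimal("0.28")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: flat pricing, i.e. every input token is billed at the
    input rate and every output token at the output rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example: a 200,000-token prompt that produces a 5,000-token answer.
    print(estimate_cost_usd(200_000, 5_000))
```

Under these assumptions, a 200,000-token prompt with a 5,000-token reply costs about $0.0214: $0.02 for the input and $0.0014 for the output. `Decimal` is used rather than floats so that fractions of a cent are not distorted by binary rounding.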
Strengths
- Strong step-by-step reasoning
- Understands images (vision input)
- Handles audio input
- Supports tool / function calling
- Large 1.0M-token context window
- Low cost per token
Considerations
- No notable limitations for general use