Xiaomi · Paid

MiMo-V2.5

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...

xiaomi/mimo-v2.5

Capabilities

🧠 Reasoning · 👁️ Vision · 🎵 Audio · 🎬 Video · 🔧 Tools · 🧩 Structured · 📜 Long context

Specifications

Context window: 1.0M tokens
Input price: $0.10 per 1M tokens
Output price: $0.28 per 1M tokens
Provider: Xiaomi
Input modalities: text, audio, image, video
Output modalities: text
Pricing: Pay-per-token
Model ID: xiaomi/mimo-v2.5
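
As a rough illustration of the pay-per-token pricing listed above, the sketch below estimates the cost of a single request. The prices are copied from the specifications. The function names, the reading of "1.0M tokens" as exactly 1,000,000, and the assumption that input and output tokens share one context window are illustrative and are not confirmed by the listing.

```python
# Prices copied from the specifications (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.28

# Assumption: "1.0M tokens" means exactly 1,000,000 tokens.
CONTEXT_WINDOW_TOKENS = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under pay-per-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: input and output tokens count against the same window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical request: a 200,000-token prompt with a 4,000-token reply.
    cost = estimate_cost_usd(200_000, 4_000)
    print(f"estimated cost: ${cost:.4f}")
    print(f"fits in context: {fits_context(200_000, 4_000)}")
```

Under these assumptions, a 200,000-token prompt with a 4,000-token reply costs about $0.021, with the prompt accounting for most of it.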

Strengths

  • Strong step-by-step reasoning
  • Understands images (vision input)
  • Handles audio input
  • Supports tool / function calling
  • Large 1.0M-token context window
  • Low cost per token

Considerations

  • Text-only output; no other notable limitations for general use
