
Xiaomi has quietly stepped into the large language model space with MiMo-7B, its first publicly available open-source AI model. Built by the company's newly assembled Big Model Core Team, MiMo-7B focuses specifically on reasoning-heavy tasks and, according to Xiaomi's benchmarks, outperforms models from OpenAI and Alibaba in mathematical reasoning and code generation.

As the name suggests, MiMo-7B is a 7-billion-parameter model. Although it is significantly smaller than most top-tier LLMs, Xiaomi claims it performs on par with far larger systems, including OpenAI's o1-mini and Alibaba's Qwen-32B-Preview. All three are reasoning-focused models.

Xiaomi MiMo-7B outperforms OpenAI's and Alibaba's models in mathematical reasoning (AIME 24-25) and code competition (LiveCodeBench v5)

Xiaomi’s MiMo-7B backbone

The backbone of MiMo-7B is a tight pre-training regimen. Xiaomi says it compiled a dense dataset of 200 billion reasoning tokens and fed the model 25 trillion tokens in total over three training phases. 

The company also used a multi-token prediction (MTP) objective instead of the standard next-token prediction, claiming it shortens inference time without sacrificing output quality.
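Xiaomi has not published the exact objective, but multi-token prediction is commonly implemented by adding extra prediction heads that each guess one additional future token, with the loss averaged across heads. A minimal sketch in plain Python, using dummy probabilities and a hypothetical two-head setup (none of the names or numbers below are from Xiaomi's implementation):

```python
import math

def cross_entropy(probs, target):
    """Negative log-likelihood of the target token."""
    return -math.log(probs[target])

def mtp_loss(head_probs, future_tokens):
    """Average cross-entropy over k prediction heads.

    head_probs:    list of k probability distributions, where head h
                   predicts the token at offset h+1 from the current position
    future_tokens: the next k ground-truth token ids
    """
    losses = [cross_entropy(p, t) for p, t in zip(head_probs, future_tokens)]
    return sum(losses) / len(losses)

# Toy vocabulary of 4 tokens; two heads predict the next two tokens at once.
head_probs = [
    [0.7, 0.1, 0.1, 0.1],  # head 1: distribution over token t+1
    [0.2, 0.6, 0.1, 0.1],  # head 2: distribution over token t+2
]
print(round(mtp_loss(head_probs, [0, 1]), 4))  # → 0.4338
```

At inference time, the extra heads can draft several tokens per forward pass, which is where the claimed latency savings would come from.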

The post-training process involves a mix of reinforcement learning techniques and infrastructure improvements. Xiaomi used a custom algorithm dubbed Test Difficulty Driven Reward to address the sparse reward signals that often plague RL on complex algorithm problems. Additionally, Xiaomi implemented an Easy Data Re-Sampling method to stabilize training.
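Xiaomi hasn't detailed the algorithm, but the general idea behind a difficulty-driven reward is to replace the all-or-nothing "passes every test" signal with partial credit weighted by how hard each test is. A rough sketch under that assumption, with illustrative weights and function names:

```python
def difficulty_weighted_reward(passed, difficulty):
    """Partial-credit reward for a generated code solution.

    passed:     list of bools, one per test case
    difficulty: per-test weights (harder tests earn more credit),
                e.g. estimated from historical pass rates
    """
    earned = sum(d for ok, d in zip(passed, difficulty) if ok)
    return earned / sum(difficulty)

# An all-or-nothing reward would give 0 here, since one test fails;
# the weighted version still provides a learning signal for
# partially correct code.
r = difficulty_weighted_reward([True, True, False], [1.0, 2.0, 4.0])
print(r)  # → 3/7 of the available credit
```

Denser rewards like this give the policy a gradient even on problems it can only partially solve, which is exactly where sparse pass/fail signals stall RL training.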

Infrastructure-wise, the company built a Seamless Rollout system to reduce GPU downtime during training and validation. The result, at least according to its internal numbers, is a 2.29× speed-up in training and a nearly 2× speed-up in validation.

The rollout engine is also designed to support inference strategies such as multi-token prediction in vLLM environments.

MiMo-7B is now open source

There are four public versions of MiMo-7B:

  • Base: the raw, pre-trained model
  • SFT: a version fine-tuned with supervised data
  • RL-Zero: a reinforcement-learned variant starting from the base
  • RL: a more polished model built on the SFT version, said to deliver the highest accuracy

And Xiaomi does have benchmarks to back up the pitch, at least on paper. In math, the MiMo-7B-RL version reportedly scores 95.8% on MATH-500 and over 68% on the 2024 AIME dataset. For code, it lands 57.8% on LiveCodeBench v5 and just under 50% on version 6. Broader general knowledge tasks like DROP, MMLU-Pro, and GPQA are also represented, though the scores sit in the mid-to-high 50s—respectable for a 7B model, but nothing revolutionary.

MiMo-7B is now available on Hugging Face under an open-source license, and the supporting documentation and model checkpoints can be found on GitHub.
