LLM Archives - Gizmochina

Xiaomi MiMo-V2-Flash LLM Just Dropped: These Are the Most Interesting Things About It

Soumyakanti — Thu, 18 Dec 2025 06:22:10 +0000

Xiaomi has unveiled its most advanced open-source large language model to date, called MiMo-V2-Flash, as part of its expanding push into foundation AI. The new model focuses on high-speed performance and an efficient architecture, with strong capabilities in reasoning and code generation.

Xiaomi positions MiMo-V2-Flash as a direct competitor to leading models such as DeepSeek V3.2 and Claude Sonnet 4.5. Let’s take a closer look at how the model works, its key features, and how to access it.

Purpose-Built for Speed and Agents

MiMo-V2-Flash is a Mixture-of-Experts (MoE) model with 309 billion total parameters and 15 billion active parameters. The model is purpose-built for AI agent scenarios and multi-turn interactions that require fast inference.
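To put the two parameter counts in perspective, here is a minimal arithmetic sketch. It uses only the figures quoted above; the function name is made up for illustration.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a Mixture-of-Experts model's parameters used for each token."""
    return active_params_b / total_params_b


# Figures quoted in the article, in billions of parameters.
mimo_fraction = active_fraction(total_params_b=309, active_params_b=15)
print(f"MiMo-V2-Flash activates about {mimo_fraction:.1%} of its weights per token")
```

In other words, each token touches roughly one twentieth of the full model, which is where much of the speed and cost advantage of an MoE design comes from.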

Xiaomi uses a 1:5 hybrid attention architecture, which combines Global Attention and Sliding Window Attention (SWA) with a 128-token window. The native context length is 32,000 tokens, and the model is trained with support for up to 256,000 tokens.
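The sketch below illustrates the general idea of mixing the two attention types. It is not Xiaomi's code: the layer ordering (one global layer followed by five sliding-window layers) is an assumed reading of the 1:5 ratio, and a 3-token window is used instead of 128 so the mask is easy to print.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal mask: position i may attend to itself and the window - 1 tokens before it."""
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


def layer_pattern(num_layers: int, swa_per_global: int = 5) -> list[str]:
    """Assumed interleaving: one global layer, then `swa_per_global` sliding-window layers."""
    return ["global" if i % (swa_per_global + 1) == 0 else "swa" for i in range(num_layers)]


mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

print(layer_pattern(12))
```

A sliding-window layer only needs to keep the most recent `window` keys and values, while a global layer keeps all of them, which is why a model with mostly sliding-window layers stays cheap as the context grows.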

This design helps MiMo-V2-Flash maintain high efficiency while scaling across long-context tasks. Xiaomi claims it delivers output faster than several leading models, including DeepSeek and Claude, while maintaining lower operational costs.

Benchmark Performance and Pricing

Benchmark results show MiMo-V2-Flash performing at the top tier across various domains. The model ranks in the top two among open-source models in reasoning tasks such as AIME 2025 and GPQA-Diamond.

In software engineering benchmarks like SWE-Bench Verified and SWE-Bench Multilingual, it outperforms other open-source models and reaches levels comparable to GPT-5 and Claude Sonnet 4.5.

Xiaomi has priced the API at $0.1 per million input tokens and $0.3 per million output tokens. The API is currently available for free for a limited time. According to the company, MiMo-V2-Flash generates responses at 150 tokens per second, while maintaining only 2.5% of Claude’s inference cost.
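As a rough worked example of what these numbers mean for a single request, the sketch below applies the quoted list prices and the claimed generation speed. The request size (20,000 input tokens, 1,500 output tokens) is a made-up example, and the function names are illustrative.

```python
INPUT_USD_PER_MILLION = 0.1   # list price quoted in the article
OUTPUT_USD_PER_MILLION = 0.3  # list price quoted in the article
TOKENS_PER_SECOND = 150       # generation speed claimed by Xiaomi


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call at the quoted per-million-token prices."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


def generation_seconds(output_tokens: int) -> float:
    """Time to stream the output at the claimed speed, ignoring prompt processing."""
    return output_tokens / TOKENS_PER_SECOND


cost = request_cost_usd(input_tokens=20_000, output_tokens=1_500)
seconds = generation_seconds(1_500)
print(f"cost: ${cost:.5f}, generation time: {seconds:.0f} s")
```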

Technical Innovations Inside

The architecture includes Multi-Token Prediction (MTP), which allows the model to generate multiple tokens in parallel and verify them before output. This method increases decoding throughput without increasing attention or memory overhead. Xiaomi reports that with a three-layer MTP, the model reaches 2.0 to 2.6 times speed improvement compared to standard decoding.
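A simple way to see where a speedup of this size could come from is to estimate how many tokens are emitted per verification step. The sketch below assumes three draft tokens per step (one reading of "three-layer MTP") and a fixed, independent chance that each draft token is accepted. Both are simplifying assumptions, not figures from Xiaomi.

```python
def expected_tokens_per_step(num_draft: int, accept_prob: float) -> float:
    """Expected tokens emitted per verification step.

    One token always comes from the main model. Draft token i only counts if it
    and every draft token before it were accepted (probability accept_prob ** i).
    """
    return sum(accept_prob ** i for i in range(num_draft + 1))


for p in (0.5, 0.6, 0.7, 0.8):
    tokens = expected_tokens_per_step(num_draft=3, accept_prob=p)
    print(f"accept_prob={p:.1f} -> {tokens:.2f} tokens per step")
```

Tokens per step is an upper bound on the real speedup, since drafting and verification add some overhead of their own.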

Xiaomi also introduced a new post-training method called Multi-Teacher Online Policy Distillation (MOPD). The technique uses multiple teacher models to guide the student through token-level rewards in an on-policy learning process. It allows the model to achieve high performance with less than 1/50th of the training resources needed in traditional RL pipelines. MOPD also supports plug-and-play teachers, enabling continuous self-improvement cycles.
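The article does not say how the token-level rewards are computed. One common formulation in on-policy distillation scores each token the student sampled by how much more likely the teacher finds it than the student does. The sketch below shows that generic idea only; the numbers, the teacher names, and the function are hypothetical and should not be read as Xiaomi's implementation.

```python
def token_level_rewards(student_logprobs: list[float], teacher_logprobs: list[float]) -> list[float]:
    """Reward per sampled token: teacher log-probability minus student log-probability."""
    if len(student_logprobs) != len(teacher_logprobs):
        raise ValueError("both models must score the same sampled tokens")
    return [t - s for s, t in zip(student_logprobs, teacher_logprobs)]


# Hypothetical log-probabilities for four tokens sampled from the student.
student = [-0.2, -1.5, -0.7, -2.0]

# Hypothetical domain teachers; picking one per task stands in for "plug-and-play" teachers.
teachers = {
    "math": [-0.1, -0.9, -0.8, -0.5],
    "code": [-0.3, -2.5, -0.6, -3.0],
}

rewards = token_level_rewards(student, teachers["math"])
print([round(r, 2) for r in rewards])
```

A positive reward means the teacher would have been more confident in that token than the student was, and a negative reward means the opposite.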

How to Access It?

Xiaomi has launched a web AI chat interface called MiMo Studio at aistudio.xiaomimimo.com, allowing users to interact directly with the model. The service supports web search, agent workflows, and code generation. It also features a toggle for switching between instant replies and slower “thinking” responses for deeper reasoning.

The model can generate functional HTML web pages and integrates well with development tools like Claude Code and Cursor. Xiaomi has also showcased creative and functional web demos.

Fully Open-Source

MiMo-V2-Flash is fully open-source under the MIT license. Model weights are available on Hugging Face, and all inference code is published on GitHub.

The company contributed inference code to SGLang on launch day and aims to grow developer adoption by offering transparent, low-cost access to high-performance AI tools.

MiMo-V2-Flash reflects Xiaomi’s shift toward becoming a serious player in the AI space. It brings competitive reasoning, fast code generation, and efficient agent deployment to the open-source ecosystem.

In related AI news, China has equipped traffic police with AI-powered smart glasses for real-time vehicle inspections, while a separate report highlights how even so-called “all-AI companies” still require human oversight due to limits in autonomous decision-making.

For more daily updates, please visit our News Section.

Stay ahead in tech! Join our Telegram community and sign up for our daily newsletter of top stories!

The post Xiaomi MiMo-V2-Flash LLM Just Dropped: These Are the Most Interesting Things About It appeared first on Gizmochina.

India working on affordable AI models to rival ChatGPT & DeepSeek

Sean — Sun, 02 Feb 2025 10:45:33 +0000

Artificial intelligence has taken the world by storm, with large language models (LLMs) proving immensely popular for their diverse functionality. ChatGPT is a great example of an AI model, along with new disruptive models like DeepSeek from China. But now, it appears that India seeks to rival these with its own AI model, which could arrive as early as this year.


Indian Govt to soon launch an affordable AI model

During a recent AI event, Ashwini Vaishnaw, the Union Minister of Electronics and Information Technology, stated that India is working on its own foundational AI model. The minister added that it will function similarly to DeepSeek and ChatGPT but at an affordable development cost. He said the new AI model could be ready in just 8 to 10 months.

At the event, held by the IndiaAI Mission, Ashwini Vaishnaw revealed that researchers in India have been developing an AI ecosystem framework to support the country’s own foundational AI model. It is being developed to offer an experience tailored to Indian users. It will also understand the linguistic and contextual needs of Indian users, bringing inclusivity while eliminating biases.

The Union Minister of Electronics and Information Technology also talked about India’s computational prowess: the domestic AI model is being developed at a computing facility that employs 18,693 GPUs. By comparison, ChatGPT was reportedly trained using around 25,000 GPUs, while DeepSeek was reportedly trained with 2,000 GPUs.


While a popular AI model like ChatGPT costs about $3 to use for an hour, India’s AI model could cost just Rs 100 (roughly $1.15) thanks to a government subsidy. This news also arrives after UC Berkeley researchers managed to replicate DeepSeek AI for only $30.



GPT-4 Outperformed Junior & Trainee Eye Doctors on a Mock Exam

Anubhav — Thu, 18 Apr 2024 14:48:30 +0000

A new study suggests large language models (LLMs) like GPT-4 may have a future in ophthalmology, but limitations and risks remain. Researchers from Cambridge University tested GPT-4, along with other LLMs, against human ophthalmologists on a mock exam.

GPT-4 answered 60 out of 87 questions correctly in the exam

The results were intriguing. GPT-4 answered 60 out of 87 questions correctly, exceeding the performance of trainee doctors (average: 59.7) and junior doctors (average: 37). However, it fell short of the average score achieved by expert ophthalmologists (66.4). Other LLMs, like PaLM 2 and GPT-3.5, performed less impressively.
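Converting the raw scores above into percentages makes the gaps easier to compare. This is a small arithmetic sketch using only the figures quoted in the article; the variable and function names are illustrative.

```python
TOTAL_QUESTIONS = 87

# Scores quoted in the article (human groups are averages).
exam_scores = {
    "Expert ophthalmologists": 66.4,
    "GPT-4": 60,
    "Trainee doctors": 59.7,
    "Junior doctors": 37,
}


def percent_correct(correct: float) -> float:
    """Share of the 87 mock-exam questions answered correctly, in percent."""
    return 100 * correct / TOTAL_QUESTIONS


for name, correct in sorted(exam_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {percent_correct(correct):.1f}%")
```

On these figures, GPT-4 answered about 69% of the questions correctly, only a fraction of a question ahead of the trainee average.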

While these findings hint at potential benefits, researchers highlight significant risks. The study’s limited question pool raises concerns about generalizability. More importantly, LLMs are prone to “hallucinating,” fabricating information that could lead to misdiagnosis of serious conditions like cataracts or cancer. Additionally, the lack of nuance inherent in LLMs could exacerbate inaccuracies.

The study clearly emphasizes the need for further research and development before LLMs can be considered reliable tools for medical diagnosis. Since there is a lot of risk involved in anything concerning medical diagnoses, we might have to wait for a long time before LLMs are incorporated in mainstream medical situations.


A New Open Source LLM, DBRX Claims to be the Most Powerful – Here are the Scores

Anubhav — Thu, 28 Mar 2024 01:50:47 +0000

A whole new contender has entered the ring of large language models (LLMs). Databricks, a company specializing in data processing, has unveiled DBRX, claiming it to be the most powerful open-source LLM yet. But is it backing those claims up? Let’s find out.

132 billion parameters is a big number – GPT-3.5 has 175 billion parameters

DBRX uses a transformer architecture with 132 billion total parameters. It is a Mixture-of-Experts (MoE) model made up of 16 expert networks. For any given input, only 4 of these experts are active, so about 36 billion parameters are in use at a time, which keeps inference efficient. GPT-4 is also widely reported to use an MoE design.
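The "4 of 16 experts" selection is typically done by a small router that scores every expert and keeps the top few. The sketch below shows generic top-k routing with made-up router scores. It is an illustration of the technique, not DBRX's actual router.

```python
import math


def top_k_experts(router_scores: list[float], k: int) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_scores)), key=lambda i: router_scores[i], reverse=True)[:k]
    peak = max(router_scores[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(router_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Made-up router scores for 16 experts.
scores = [0.1 * ((7 * i) % 16) for i in range(16)]
chosen = top_k_experts(scores, k=4)

for expert, weight in chosen:
    print(f"expert {expert}: weight {weight:.3f}")
print("possible 4-expert combinations:", math.comb(16, 4))
```

The expert outputs are then combined using these weights, so only the selected experts need to run for that input.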

Databricks compares DBRX to other prominent open-source LLMs like Meta’s Llama 2-70B, Mixtral (from France’s Mistral AI), and Grok-1 (developed by Elon Musk’s xAI). DBRX reportedly outperforms its rivals in several key areas:

Language Understanding: DBRX achieves a score of 73.7%, surpassing GPT-3.5 (70.0%), Llama 2-70B (69.8%), Mixtral (71.4%), and Grok-1 (73.0%).
Programming Ability: Here, DBRX demonstrates a significant lead with a score of 70.1%, compared to GPT-3.5’s 48.1%, Llama 2-70B’s 32.3%, Mixtral’s 54.8%, and Grok-1’s 63.2%.
Mathematics: DBRX takes another win with a score of 66.9%, edging out GPT-3.5 (57.1%), Llama 2-70B (54.1%), Mixtral (61.1%), and Grok-1 (62.9%).

Databricks attributes DBRX’s speed to its MoE architecture, built upon their MegaBlocks research and open-source projects. This allows the model to output tokens at a very high rate. Additionally, Databricks positions DBRX as the most advanced open-source MoE model currently available, potentially paving the way for future advancements in the field.

The open-source nature of DBRX allows for wider adoption and contribution from the developer community. This could accelerate further development and potentially solidify DBRX’s position as a leading LLM.


Google has introduced VideoPoet, breaking new ground in coherent video generation

Debasish — Thu, 21 Dec 2023 07:17:37 +0000

After Microsoft’s Copilot AI gained the ability to generate audio clips from text prompts, Google has introduced VideoPoet, a large language model (LLM) that pushes the boundaries of video generation with 10-second clips that show fewer artifacts. The model supports an array of video generation tasks, including text-to-video conversion, image-to-video transformation, video stylization, inpainting, and video-to-audio functionalities.

It generates 10-sec video clips from text prompts and is also able to animate still images

Unlike its predecessors, VideoPoet excels at generating coherent large-motion videos. It can produce ten-second-long videos, leaving competitors such as Gen-2 behind. Notably, VideoPoet doesn’t rely on specific data for video generation, distinguishing it from other models that require detailed input for optimal results.

This multifaceted capability is made possible by leveraging a multi-modal large model, setting it on a trajectory to potentially become the mainstream in video generation.

Google’s VideoPoet departs from the prevailing trend in video generation models, which predominantly rely on diffusion-based approaches. Instead, VideoPoet is built on a large language model (LLM). The model integrates various video generation tasks within a single LLM, eliminating the need for separately trained components for each function.

The resulting videos exhibit variable length and diverse actions and styles based on the input text content. Additionally, VideoPoet can perform the conversion of input images into animations based on provided prompts, showcasing its adaptability across different inputs.

The release of VideoPoet adds a new dimension to AI-driven video generation, hinting at the possibilities that lie ahead in 2024.


Alibaba Launches SeaLLM, an AI for Southeast Asian Languages

Anubhav — Mon, 11 Dec 2023 15:01:51 +0000

Alibaba’s Damo Academy, in a calculated, precise move to strengthen its footprint in Southeast Asia, has unveiled a new AI-driven language model specifically designed for this diverse region. This innovative tool, called SeaLLM, is a testament to Alibaba’s recognition of Southeast Asia’s potential as a key market. It’s tailored to understand and interact in languages like Vietnamese, Indonesian, Thai, Malay, and several others, demonstrating a significant leap in bridging linguistic and cultural gaps in AI technology.

Southeast Asia’s linguistic diversity contributes to multiple AI applications

The development of SeaLLM is particularly noteworthy given the linguistic diversity of Southeast Asia. This region, with its myriad of languages, presents unique challenges and opportunities for AI applications. By focusing on languages that are often underrepresented in global technology advancements, Alibaba is not just expanding its market reach but also contributing to the inclusiveness and accessibility of AI technology.

Furthermore, SeaLLM’s enhanced handling of non-Latin scripts and its superior performance in understanding and translating low-resource languages are a game-changer. Businesses and communities in Southeast Asia can leverage AI more effectively, fostering better communication and understanding across different cultures.

This move by Alibaba also signifies a broader trend in the AI landscape, where regional customization is becoming increasingly important. As AI technology becomes more pervasive, its ability to cater to specific regional needs and languages will be crucial in determining its success and impact.

However, despite these advancements, the AI industry, particularly in China, faces ongoing challenges. Issues like US chip restrictions and the search for more universally appealing applications are hurdles that need to be addressed. Nonetheless, innovations like SeaLLM are steps in the right direction, showcasing how AI can be more inclusive and beneficial to a wider range of communities.


China’s Baidu launched an industry-grade medical AI model

Anurag — Tue, 19 Sep 2023 15:47:53 +0000

Hot on the heels of launching its AI chatbot Ernie, China’s Baidu has now unveiled another large language model (LLM) with the aim of improving the digitization and intelligence of the healthcare industry. The new industry-grade AI model is called Lingyi (machine translation gives “Spiritual Doctor”) and it is currently available for trial use in both upstream and downstream healthcare sectors.

The LLM can generate structured medical records from free-text input and accurately analyze and generate patient complaints, medical histories, and more based on doctor-patient conversations. It can do simultaneous parsing of multiple Chinese and English medical literature articles, enabling intelligent question-answering based on the content of the literature.

When it comes to assisting in diagnosis and treatment, the Lingyi Large Model offers real-time understanding of a patient’s condition through multi-turn dialogues. It assists doctors in diagnosing diseases and recommending treatment plans. It also serves as a 24-hour “healthcare manager” for patients and provides various capabilities to pharmaceutical companies, including professional training and medical information support, among others.

Currently, the Lingyi Large Model is offered in Lite, flagship, and custom versions, each tailored to different needs and application scenarios. Baidu will let partners access the large model through API integration or embed it as plugins into existing product systems.

ITHome reports Baidu has already partnered with companies like Gushengtang and Ling Jiashe and is selectively open to over 200 medical institutions, including public hospitals, pharmaceutical companies, internet hospital platforms, and chain pharmacies.
