Chinese researchers have uncovered a significant security weakness in widely used commercial multimodal large language models (MLLMs) such as ChatGPT, Bard, Bing Chat, and Ernie Bot. These models, deployed by companies including OpenAI, Google, Microsoft, and Baidu, are fundamental components of applications ranging from virtual assistants to content moderation systems.

The researchers found that these MLLMs can be exploited with adversarially manipulated images that closely resemble the originals. By adding perturbations almost invisible to the human eye, they bypassed the models’ built-in filters designed to block toxic or inappropriate content.
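The article does not reproduce the team’s attack pipeline, but the general recipe behind such imperceptible perturbations is well established: run projected gradient descent (PGD) against a surrogate vision model you can inspect, then hand the perturbed image to the black-box system and rely on the attack transferring. The sketch below illustrates that idea only; the ResNet-50 surrogate, the targeted loss, and the perturbation budget are illustrative assumptions, not the researchers’ actual setup.

```python
# Minimal PGD sketch: nudge an image within an invisible epsilon budget so a
# surrogate classifier predicts an attacker-chosen label, then pass the image
# to the black-box MLLM in the hope that the perturbation transfers.
# Surrogate model, loss, and hyperparameters are assumptions for illustration.
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms.functional as TF
from PIL import Image

surrogate = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
MEAN, STD = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]

def pgd_attack(image_path, target_class, eps=8 / 255, alpha=2 / 255, steps=40):
    x = TF.to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = surrogate(TF.normalize(x_adv, MEAN, STD))
        # Targeted attack: minimize the loss toward the attacker-chosen class.
        loss = F.cross_entropy(logits, torch.tensor([target_class]))
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - alpha * x_adv.grad.sign()
            # Project back into the epsilon-ball around the original image so
            # the change stays nearly invisible, and keep valid pixel values.
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0, 1)
    return x_adv.detach()
```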

In one demonstration, the Beijing-based team showed that models under attack could mistake images of giant pandas for humans or fail to detect harmful content, underscoring a critical security flaw in commercial AI systems.

Among the affected models, Bard, which is equipped with face-detection and toxicity-detection mechanisms, could be induced to produce inappropriate descriptions of harmful content once those filters were bypassed. The research team also released code demonstrating how the adversarial examples mislead the models. Their experiments achieved attack success rates of 22% against Bard, 26% against Bing Chat, and a staggering 86% against Ernie Bot.

Wu Zhaohui, China’s vice minister of science and technology, addressed these concerns at the global AI Safety Summit in the UK. He emphasized the urgent need for stronger technical risk controls in AI governance and urged the international community to tackle the vulnerabilities discovered in these widely used models.

One of the key challenges highlighted by the research is the imbalance between work on attacking AI models and work on defending them. While adversarial attacks have received significant attention, robust defense strategies remain scarce, and traditional defenses often sacrifice accuracy or demand substantial computational resources, making innovative solutions imperative.

To address these vulnerabilities, the researchers pointed to preprocessing-based defenses as a potential solution, especially for large-scale foundation models: rather than retraining the model, the input is transformed before it reaches the model, with the aim of stripping out adversarial perturbations. Such defenses are intended to improve the robustness of MLLMs against adversarial attacks, paving the way for further research and development in AI security.
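The article does not specify which preprocessing defenses the researchers favor, but a common instance of the idea is to re-encode and randomly resize the image before the model sees it, so that pixel-level perturbations tuned to one exact input are disrupted. The sketch below shows that flavor of defense; the JPEG quality, canvas size, and transform choices are illustrative assumptions, not the paper’s recommendations.

```python
# Sketch of a preprocessing-based defense: transform the uploaded image before
# the MLLM's vision encoder sees it. JPEG re-compression discards the
# high-frequency detail where imperceptible perturbations tend to live, and
# random resize-and-pad misaligns perturbations tuned to one exact geometry.
import io
import random
from PIL import Image

def purify(image: Image.Image, jpeg_quality: int = 75, canvas_size: int = 256) -> Image.Image:
    # 1) Re-compress as JPEG to smooth away high-frequency noise.
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=jpeg_quality)
    buf.seek(0)
    img = Image.open(buf).convert("RGB")

    # 2) Randomly resize, then paste at a random offset on a fixed canvas.
    new_size = random.randint(int(0.8 * canvas_size), canvas_size)
    img = img.resize((new_size, new_size))
    canvas = Image.new("RGB", (canvas_size, canvas_size))
    offset = random.randint(0, canvas_size - new_size)
    canvas.paste(img, (offset, offset))
    return canvas

# The purified image, rather than the raw upload, is what gets forwarded to
# the model, leaving the model weights untouched.
```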

This discovery underscores the critical importance of strengthening the security infrastructure around AI technologies. As these models become increasingly integrated into everyday applications, it is essential to fortify their defenses against malicious exploitation, ensuring a safer digital landscape for users worldwide.
