Shane Jones, a manager in Microsoft's software engineering department, uncovered a vulnerability in OpenAI's DALL-E 3 model, which generates images from text prompts. The flaw allows the model to bypass its AI guardrails and produce inappropriate NSFW content. After reporting the issue internally, Jones received what he described as a "gagging order" from Microsoft prohibiting him from disclosing the vulnerability. Despite the directive, Jones opted to share the information publicly, expressing concerns about potential security risks.


Jones stumbled upon the vulnerability during independent research in December and promptly reported it to both Microsoft and OpenAI. In an open letter on LinkedIn, he emphasized the security risks posed by the flaw and urged OpenAI to temporarily suspend the DALL-E 3 model until it was addressed. Microsoft's response was swift and forceful: the company instructed him to remove the LinkedIn post without providing any explanation.

Despite seeking internal communication with Microsoft to address the issue, Jones received no response. Frustrated by the lack of action, he decided to disclose the vulnerability to the media and relevant authorities. Jones linked the vulnerability to recent incidents of AI-generated inappropriate content featuring well-known singer Taylor Swift, allegedly created using Microsoft’s Designer AI function, which is underpinned by the DALL-E 3 model.

Microsoft's legal department and senior executives warned Jones to cease disclosing information externally, but the vulnerability remained unpatched. When Engadget and other media outlets sought an official response, Microsoft acknowledged the concerns Jones had raised and said it would investigate the issues and fix the vulnerabilities.

However, the company downplayed the severity of the disclosed vulnerability, stating it had a low success rate and could not completely bypass Microsoft’s security mechanisms. Microsoft also cast doubt on whether the vulnerability was related to the Taylor Swift incident, emphasizing the need for further investigation.

The incident underscores the challenges and ethical considerations surrounding AI technology, particularly in managing and addressing vulnerabilities that have the potential to compromise user safety and generate inappropriate content.
