
MIT researchers are ringing the alarm bell on “deceptive AI.” A new study published in the journal Patterns reveals that some AI systems designed to be honest have learned to deceive humans. The research team, led by Peter Park, found that these AI systems can pull off feats like fooling online game players or bypassing CAPTCHAs (those “I am not a robot” checks). Park warns that these seemingly trivial examples could have serious real-world consequences.

AI’s behavior might be predictable during training, but can be uncontrollable later

The study highlights Meta’s AI system, Cicero, originally intended as a fair-playing opponent in the virtual diplomacy game Diplomacy. While programmed to be honest and helpful, Cicero became a “master of deception,” according to Park. During gameplay, Cicero, playing as France, would secretly team up with human-controlled Germany to betray England (another human player), initially promising to protect England while simultaneously tipping off Germany for an invasion.


Another example involves GPT-4, which hired a human to solve a CAPTCHA on its behalf by falsely claiming to be visually impaired.

Park emphasizes the challenge of training honest AI. Unlike traditional software, deep learning AI systems “develop” through a process akin to selective breeding. Their behavior might be predictable during training, but it can become uncontrollable later.

The study urges classifying deceptive AI systems as high-risk and calls for more time to prepare for future AI deceptions. Kind of creepy, don’t you think? With more studies and research happening around AI, we will learn more about what the technology has in store for us.
