AI chatbots have made their way into classrooms, offices, and everyday life—but they still suffer from a frustrating flaw: sometimes they simply make things up. These so-called “hallucinations” can look convincing but turn out to be completely wrong. OpenAI says it has figured out why this happens, and the company believes it has a fix that could make future AI tools far more trustworthy.

OpenAI recently published a 36-page paper co-authored with Georgia Tech’s Santosh Vempala and others, digging into the problem. The researchers argue that hallucinations aren’t really caused by poor model design, but by the way AI systems are tested and ranked. Current benchmarks often reward a chatbot for answering every question, even if it gets some wrong, while punishing models that hold back when they aren’t sure. Think of it like a multiple-choice exam that encourages guessing instead of leaving blanks.

To counter this, the paper suggests flipping the scoring system: make “confident but wrong” answers count heavily against a model, while rewarding it for showing caution or admitting uncertainty. Early examples highlight the difference. In one benchmark, a cautious model answered only half the questions but got 74% right, while another answered nearly all but hallucinated three out of four times.
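For readers who want a concrete sense of what such a scoring flip means, here is a minimal sketch in Python. It is not code from the paper or from OpenAI's benchmarks; the penalty and reward values are illustrative assumptions, chosen only to show how penalizing confident wrong answers more than abstentions removes the incentive to guess.

```python
from typing import Optional

# Illustrative weights (assumptions, not values from the OpenAI paper):
CORRECT_REWARD = 1.0   # full credit for a right answer
ABSTAIN_REWARD = 0.0   # no credit, but no penalty, for "I don't know"
WRONG_PENALTY = -2.0   # confident-but-wrong answers count heavily against the model


def score_response(answer: Optional[str], correct_answer: str) -> float:
    """Score one response under a guess-discouraging scheme."""
    if answer is None:  # the model abstained / admitted uncertainty
        return ABSTAIN_REWARD
    if answer.strip().lower() == correct_answer.strip().lower():
        return CORRECT_REWARD
    return WRONG_PENALTY


def benchmark_score(responses, answer_key) -> float:
    """Average score over a benchmark; answering everything no longer pays off."""
    return sum(score_response(r, a) for r, a in zip(responses, answer_key)) / len(answer_key)


# Hypothetical example: a model that guesses on every question
# versus one that abstains when it is unsure.
answer_key = ["paris", "1969", "kepler", "helium"]
guesser = ["paris", "1971", "newton", "helium"]   # 2 right, 2 wrong
cautious = ["paris", None, None, "helium"]        # 2 right, 2 abstentions

print(benchmark_score(guesser, answer_key))   # (1 - 2 - 2 + 1) / 4 = -0.5
print(benchmark_score(cautious, answer_key))  # (1 + 0 + 0 + 1) / 4 =  0.5
```

Under the usual accuracy-only scoring both models would look the same (two correct out of four), but with a heavier penalty for wrong answers the cautious model comes out clearly ahead.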

If adopted, this approach could change how AI assistants behave day to day. Instead of confidently inventing a source or statistic, they’d be more likely to say, “I don’t know.” That may sound less impressive, but it could save users from having to constantly double-check their answers. For OpenAI, this research is a step toward AI that values accuracy and trust over flashy but unreliable confidence.
