A new study suggests large language models (LLMs) like GPT-4 may have a future in ophthalmology, though limitations and risks remain. Researchers at the University of Cambridge tested GPT-4, along with other LLMs, against human ophthalmologists on a mock exam.
The results were intriguing. GPT-4 answered 60 out of 87 questions correctly, narrowly exceeding the average score of trainee doctors (59.7) and comfortably beating junior doctors (37). However, it fell short of the average achieved by expert ophthalmologists (66.4). Other LLMs, such as PaLM 2 and GPT-3.5, performed less impressively.
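For context, the raw scores above translate into accuracy rates on the 87-question exam as follows (a quick sketch; the human figures are the averages reported in the study, not individual scores):

```python
# Scores reported in the article, all on the same 87-question mock exam.
# Human figures are group averages, so fractional values are expected.
scores = {
    "GPT-4": 60,
    "Trainee doctors (avg)": 59.7,
    "Junior doctors (avg)": 37,
    "Expert ophthalmologists (avg)": 66.4,
}
TOTAL_QUESTIONS = 87

# Print each group's score and accuracy, best first.
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score}/{TOTAL_QUESTIONS} ({score / TOTAL_QUESTIONS:.1%})")
```

This puts GPT-4 at roughly 69% accuracy, between the trainee average (about 69%) and the expert average (about 76%).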
While these findings hint at potential benefits, the researchers highlight significant risks. The study's limited question pool raises concerns about how well the results generalize. More importantly, LLMs are prone to "hallucinating": fabricating plausible-sounding information that could lead to misdiagnosis of serious conditions such as cataracts or cancer. LLMs also struggle with the clinical nuance that real cases demand, which could compound such inaccuracies.
The study emphasizes the need for further research and development before LLMs can be considered reliable tools for medical diagnosis. Given the stakes involved in anything touching medical diagnoses, it may be a long time before LLMs are incorporated into mainstream clinical practice.