OpenAI's o3 and o4-mini hallucinate far more than previous models

MashableAsia·2025-04-20 12:01

By OpenAI's own testing, its newest reasoning models, o3 and o4-mini, hallucinate significantly more often than o1.

As first reported by TechCrunch, OpenAI's system card details results from PersonQA, an evaluation designed to test for hallucinations. On this evaluation, o3's hallucination rate is 33 percent and o4-mini's is 48 percent, or almost half of the time. By comparison, o1's hallucination rate is 16 percent, meaning o3 hallucinated about twice as often.

……