OpenAI's o3 and o4-mini hallucinate far more than previous models
MashableAsia·2025-04-20 12:01
By OpenAI's own testing, its newest reasoning models, o3 and o4-mini, hallucinate significantly more often than o1.
As first reported by TechCrunch, OpenAI's system card details results from PersonQA, an evaluation designed to test for hallucinations. On this evaluation, o3's hallucination rate is 33 percent and o4-mini's is 48 percent, almost half of the time. By comparison, o1's hallucination rate is 16 percent, meaning o3 hallucinated about twice as often.