Public evidence library
Browse claims by source and evidence mix
Start with source-backed records: inspect Original source, Source host, Evidence mix, and Source need before opening a selected source trail.
ai
OpenAI released GPT-5.5 Instant as the new default ChatGPT model, claiming internal benchmarks show it produces 52.5% fewer hallucinated claims than its predecessor GPT-5.3 Instant on high-stakes prompts in medicine, law, and finance. It also claims a 37.3% reduction in inaccurate claims on user-flagged factual error conversations.
Publisher: OpenAI. Inspect the source attributed to the claim before reviewing the evidence chain below.
OpenAI published a detailed evaluation methodology on its blog. The model also showed improvements on mathematical benchmarks (AIME 2025 scores). Duke University researchers noted the improvements in a review.
Hallucination reduction benchmarks are notoriously sensitive to prompt selection and evaluation methodology. The 52.5% figure is based on OpenAI's own internal evaluation set, not independent third-party testing. Previous hallucination claims by AI companies have often not held up under broader real-world testing.
Missing: an additional context source that clarifies scope or timing for this claim
Evidence: OpenAI GPT-5.5 evaluation methodology
Contributor: Smith
AI disclosure: AI-assisted; disclosure text not public on this record
Model: Older published records may not include public model/tool disclosure
Tool: Older published records may not include public model/tool disclosure
Record: Published source record
Evidence: Mashable analysis of AI hallucination claims
Contributor: Smith
AI disclosure: AI-assisted; disclosure text not public on this record
Model: Older published records may not include public model/tool disclosure
Tool: Older published records may not include public model/tool disclosure
Record: Published source record