
Latest Generation A.I. Systems Show Rising Hallucination Rates, Raising Concerns for Reliability


A new wave of powerful artificial intelligence systems from leading global tech companies, including OpenAI, Google, and DeepSeek, is increasingly generating factual errors despite their advanced capabilities, sparking growing concern among users, researchers, and businesses worldwide. As these A.I. bots become more capable at tasks such as complex reasoning and mathematics, their tendency to produce incorrect or entirely fabricated information, known as "hallucinations," is not only persisting but worsening, as revealed in a recent investigative report by The New York Times (nytimes.com).

For Thai readers, whose lives, work, and even education are being rapidly transformed by automated chatbots and digital assistants, this trend highlights critical risks. When A.I. systems are relied upon for important decisions or information—like medical advice, legal guidance, or sensitive business operations—these hallucinations can lead to costly errors and loss of trust.

Recent incidents shed light on the severity of the issue. A case involving Cursor, a programming tool, saw an A.I.-powered tech support bot falsely notify customers of a policy change, prompting confusion, backlash, and cancelled subscriptions. The company’s leadership clarified later that the supposed policy never existed—a textbook A.I. hallucination. Similar stories are becoming commonplace on Thai online forums, especially as young professionals and students rely on chatbots for research, translation, or exam preparation.

The fundamental problem, experts agree, is rooted in how these A.I. systems are trained and how they "decide" what to say. Modern chatbots, including those tied to internet search engines such as Google and Bing, are trained on vast datasets and use statistical models to predict the most likely next words, effectively guessing at the "right" answer rather than applying a strict rule set. As a result, mistakes are inevitable. Amr Awadallah, chief executive of Vectara and a former Google executive, put it simply: "Despite our best efforts, they will always hallucinate. That will never go away."
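To make the "statistical guessing" concrete, here is a deliberately simplified sketch in Python. The candidate words and their probabilities are invented for illustration; real systems choose among tens of thousands of tokens at every step, but the basic mechanism of sampling from a probability distribution rather than consulting a store of verified facts is the same.

```python
import random

# Toy illustration (not any vendor's actual model): a language model assigns
# probabilities to candidate next words and samples from them, rather than
# looking the answer up in a database of verified facts.
next_word_probs = {
    "1997": 0.46,   # plausible but factually wrong continuation
    "1999": 0.41,   # the factually correct continuation
    "2001": 0.13,
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Pick one candidate word in proportion to its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

prompt = "The policy was introduced in"
print(prompt, sample_next_word(next_word_probs))
# Roughly half the time this toy model asserts the wrong year, with the
# same confidence as when it happens to be right.
```

The point of the sketch is that nothing in the sampling step distinguishes a true continuation from a merely plausible one, which is why confident-sounding errors are a built-in risk rather than an occasional glitch.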

Recent research is painting an even starker picture. OpenAI's most advanced models, referred to as o3 and o4-mini, have shown dramatically increased hallucination rates. On one benchmark test (PersonQA), the latest OpenAI system hallucinated 33% of the time, more than twice the rate of its predecessor. On a more general question-answering benchmark (SimpleQA), its error rate soared as high as 79%. Tests on Google's reasoning models and DeepSeek's R1 have shown similar or greater error frequencies.
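For readers unfamiliar with how such percentages are produced, the sketch below shows the basic idea: a hallucination rate is the share of model answers that fail to match a reference answer. The questions, answers, and grading here are invented; real evaluations such as PersonQA and SimpleQA use far larger question sets and more careful answer matching.

```python
# Conceptual sketch of a question-answering hallucination rate:
# the fraction of model answers that do not match the reference answer.
benchmark = [
    {"question": "Which university did the subject attend?", "reference": "Chulalongkorn University"},
    {"question": "In which year was the subject born?",      "reference": "1968"},
    {"question": "What was the subject's first profession?", "reference": "journalist"},
]

model_answers = ["Chulalongkorn University", "1972", "lawyer"]  # hypothetical outputs

def hallucination_rate(items, answers):
    wrong = sum(
        1 for item, ans in zip(items, answers)
        if ans.strip().lower() != item["reference"].strip().lower()
    )
    return wrong / len(items)

print(f"Hallucination rate: {hallucination_rate(benchmark, model_answers):.0%}")
# 67% in this toy example: two of the three answers are fabricated.
```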

For Thai educators and students, especially those embracing digital learning platforms or smart classroom assistants, these findings are particularly troubling. The widespread roll-out of A.I. in Thailand’s education sector—from language tutoring apps to grading software—has been touted as a leap forward, but if these tools cannot reliably distinguish fact from fiction, there is a risk that misinformation could quietly become embedded in the learning process.

The underlying challenge is compounded by the industry's growing reliance on a training approach called "reinforcement learning." In this setup, A.I. learns primarily through trial and error, adjusting its actions to maximize rewards, which produces surges in skills like math but instability in factual accuracy. As highlighted by Laura Perez-Beltrachini, a University of Edinburgh researcher focusing on the hallucination problem, "The way these systems are trained, they will start focusing on one task—and start forgetting about others."
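The trial-and-error dynamic can be illustrated with a minimal sketch, assuming a toy agent that repeatedly chooses among three fixed behaviours and shifts toward whichever one the reward signal favours. The actions and reward values are invented for illustration; real chatbot training operates over generated text, not three fixed choices, but the sketch shows how behaviour that the reward does not measure (such as careful sourcing) can quietly be de-emphasised.

```python
import random

# Minimal trial-and-error sketch of reinforcement learning: try actions,
# observe a numeric reward, and drift toward whatever scored highest.
actions = ["show_worked_solution", "give_short_answer", "hedge_and_cite_source"]
value_estimates = {a: 0.0 for a in actions}
counts = {a: 0 for a in actions}

def reward(action: str) -> float:
    # Hypothetical reward: only correctness of the final answer is scored,
    # so hedging and citing sources earns nothing extra.
    base = {"show_worked_solution": 1.0,
            "give_short_answer": 0.7,
            "hedge_and_cite_source": 0.4}
    return base[action] + random.uniform(-0.1, 0.1)

for step in range(1000):
    # Epsilon-greedy: mostly exploit the best-looking action, sometimes explore.
    if random.random() < 0.1:
        a = random.choice(actions)
    else:
        a = max(value_estimates, key=value_estimates.get)
    r = reward(a)
    counts[a] += 1
    value_estimates[a] += (r - value_estimates[a]) / counts[a]  # running average

print(value_estimates)  # the agent concentrates on whatever the reward favours
```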

The complexity of these deep neural networks is such that even leading scientists don't fully understand how or why hallucinations happen. Professor Hannaneh Hajishirzi of the University of Washington, part of a team attempting to trace A.I. behavior back to its training data, admitted that the scale and mystery remain daunting. "We still don't know how these models work exactly," she said, underlining the urgent need for additional research and increased transparency from tech companies.

As global companies near the limits of available English-language internet data for feeding their A.I. systems, the push for improvement now increasingly relies on these less-predictable training tactics. The result has been a notable decline in factual reliability at the very moment when people are starting to trust and use A.I. assistants for more consequential work.

According to internal data released by leading technology companies and independently verified by firms like Vectara, newer reasoning models fabricate information in as little as 1-2% of complex summarization tasks, but this figure can spike to 27% or more depending on the system and task. For business leaders, medical professionals, policy-makers, and educators in Thailand, this puts a spotlight on the need for rigorous verification and oversight.

The issue has implications in legal and ethical spheres, too. The New York Times is currently pursuing legal action against OpenAI and partner Microsoft, alleging copyright infringement relating to news content used in A.I. training—raising broader questions about intellectual property, data privacy, and the responsibilities of A.I. developers (see: nytimes.com). These debates are particularly relevant in Thailand, where digital literacy and legal frameworks around A.I. remain works-in-progress and where new A.I.-powered services are proliferating faster than effective regulation.

Looking ahead, the proliferation of flawed yet powerful A.I. systems will require a more cautious and pragmatic approach. Industry leaders are promising improvements. OpenAI, for example, says it is actively researching ways to reduce hallucinations, with spokesperson Gaby Raila stating, “We’ll continue our research on hallucinations across all models to improve accuracy and reliability.” However, as history shows, technical fixes may not be a panacea.

Practical recommendations for Thai readers, whether individuals, educators, or business stakeholders, include: never relying solely on A.I. for critical decisions; double-checking all facts, figures, and source citations produced by chatbots; and staying updated on the latest guidance from reputable institutions. Institutions like the Ministry of Digital Economy and Society, as well as leading Thai universities, are well positioned to issue clear usage guidelines, promote digital literacy in schools, and encourage responsible deployment of A.I. in both public and private sectors. Media literacy campaigns, similar to those used to combat online scams and fake news, should now include reference to A.I.-generated errors.

In summary, while the promise of A.I. in revolutionizing work, study, and daily life in Thailand is considerable, the risk of hallucination is here to stay—at least for now. As a community, Thais must adapt by combining the power of advanced digital tools with a healthy dose of traditional scepticism and fact-checking habits, ensuring that human judgment and cultural context guide our embrace of new technology.


