Why Do Neural Networks Hallucinate (And What Are Experts Doing About It)?
Last Updated on November 11, 2024 by Editorial Team
Author(s): Vitaly Kukharenko
Originally published on Towards AI.
AI hallucinations are a strange and sometimes worrying phenomenon. They happen when an AI, like ChatGPT, generates responses that sound real but are actually wrong or misleading. This issue is especially common in large language models (LLMs), the neural networks that drive these AI tools. They produce sentences that flow well and seem human, but without truly "understanding" the information they're presenting. So, sometimes, they drift into fiction. For people or companies who rely on AI for correct information, these hallucinations can be a big problem: they break trust and sometimes lead to serious mistakes.
So, why do these models, which seem so advanced, get things so wrong? The reason isn't only about bad data or training limitations; it goes deeper, into the way these systems are built. AI models operate on probabilities, not concrete understanding, so they occasionally guess, and guess wrong. Interestingly, there's a historical parallel that helps explain this limitation. Back in 1931, a mathematician named Kurt Gödel made a groundbreaking discovery. He showed that every consistent formal system rich enough to express basic arithmetic has boundaries: some truths can't be proven within that system. His findings revealed that even the most rigorous systems have limits, things they just can't handle.
Today, AI researchers face a similar kind of limitation. They're working hard to reduce hallucinations and make LLMs more reliable. But the reality is, some limitations are baked into these models. Gödel's insights offer a useful analogy for why even our best systems will never be totally "trustworthy." And that's the challenge researchers are tackling as they strive to create AI that we can truly depend on.
Gödel's Incompleteness Theorems: A Quick Overview
In 1931, Kurt Gödel shook up the worlds of math and logic with two groundbreaking theorems. What he discovered was radical: in any consistent logical system that can handle basic math, there will always be truths that can't be proven within that system. At the time, mathematicians were striving to create a flawless, all-encompassing structure for math, but Gödel proved that no such system could ever be completely airtight.
Gödel's first theorem showed that every such system has questions it simply can't answer on its own. Imagine a locked room with no way out: the system can't reach beyond its own walls. This was a shock because it meant that no logical structure of this kind could ever be fully "finished" or self-sufficient.
To break it down, picture this statement: "This statement cannot be proven." It's like a brain-twisting riddle. If the system could prove it, the system would contradict itself, because the statement says it *can't* be proven. But if the system can't prove it, then the statement is exactly what it claims to be: true, yet unprovable. This little paradox sums up Gödel's point: some truths just can't be captured by the formal system they are stated in.
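For readers who like symbols, the riddle can be sketched as a single formula. The notation is an assumption added here for illustration: $F$ is the formal system, $\mathrm{Prov}_F$ is its provability predicate, and $\ulcorner G \urcorner$ is the numeric code that stands for the sentence $G$ inside the system.

```latex
F \vdash \; G \leftrightarrow \neg\,\mathrm{Prov}_F(\ulcorner G \urcorner)
```

Read aloud: the system itself agrees that $G$ holds exactly when $G$ is not provable in $F$. If $F$ is consistent, it cannot prove $G$, which is what makes $G$ true but unprovable in $F$.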
Then Gödel threw in another curveball with his second theorem. He proved that such a system can't even prove its own consistency. Think of it as a book that can't check if it's telling the truth. No logical system of this kind can fully vouch for itself and say, "I'm error-free." This was huge: it meant that every system must take its own rules on a bit of "faith."
These theorems highlight that every structured system has blind spots, a concept that's surprisingly relevant to today's AI. Take large language models (LLMs), the AIs behind many of our tech tools. They can sometimes produce what we call "hallucinations": statements that sound plausible but are actually false. Like Gödel's findings, these hallucinations remind us of the limitations within AI's logic. These models are built on patterns and probabilities, not actual truth. Gödel's work serves as a reminder that, no matter how advanced AI becomes, there will always be some limits we need to understand and accept as we move forward with technology.
What Causes AI Hallucinations?
AI hallucinations are a tricky phenomenon with roots in how large language models (LLMs) process language and learn from their training data. A hallucination, in AI terms, is when the model produces information that sounds believable but isn't actually true.
So, why do these hallucinations happen? First, it's often due to the quality of the training data. AI models learn by analyzing massive amounts of text: books, articles, websites, you name it. But if this data is biased, incomplete, or just plain wrong, the AI can pick up on these flaws and start making faulty connections. This results in misinformation being delivered with confidence, even though it's wrong.
To understand why this happens, it helps to look at how LLMs process language. Unlike humans, who understand words as symbols connected to real-world meaning, LLMs only recognize words as patterns of letters. As Emily M. Bender, a linguistics professor, explains: if you see the word "cat," you might recall memories or associations related to real cats. For a language model, however, "cat" is just a sequence of letters: C-A-T. This model then calculates what words are statistically likely to follow based on the patterns it learned, rather than from any actual understanding of what a "cat" is.
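The "statistically likely to follow" idea can be illustrated with a toy word-pair counter. This is only a sketch: real LLMs use neural networks over tokens, not word counts, and the corpus and function names below are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    following = counts.get(word.lower(), Counter())
    total = sum(following.values())
    return {w: c / total for w, c in following.items()} if total else {}


# Hypothetical training text; the model never learns what a cat *is*.
corpus = "the cat sat on the mat the cat chased the mouse"
model = train_bigrams(corpus)

print(next_word_probs(model, "cat"))  # probabilities of words seen after "cat"
print(next_word_probs(model, "dog"))  # unseen word: nothing to go on
```

The counter will happily rank "sat" and "chased" as continuations of "cat" without any notion of what those words mean, which is the point Bender is making.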
Generative AI relies on pattern matching, not real comprehension. Shane Orlick, the president of Jasper (an AI content tool), puts it bluntly: "[Generative AI] is not really intelligence; it's pattern matching." This is why models sometimes "hallucinate" information. They're built to give an answer, whether or not it's correct.
The complexity of these models also adds to the problem. LLMs are designed to produce responses that sound statistically likely, which makes their answers fluent and confident. Christopher Riesbeck, a professor at Northwestern University, explains that these models always produce something "statistically plausible." Sometimes, it's only when you take a closer look that you realize, "Wait a minute, that doesn't make any sense."
Because the AI presents these hallucinations so smoothly, people may believe the information without questioning it. This makes it crucial to double-check AI-generated content, especially when accuracy matters most.
Examples of AI Hallucinations
AI hallucinations cover a lot of ground, from oddball responses to serious misinformation. Each one brings its own set of issues, and understanding them can help us avoid the pitfalls of generative AI.
1. Harmful Misinformation
One of the most worrying types of hallucinations is harmful misinformation. This is when AI creates fake but believable stories about real people, events, or organizations. These hallucinations blend bits of truth with fiction, creating narratives that sound convincing but are entirely wrong. The impact? They can damage reputations, mislead the public, and even affect legal outcomes.
Example: There was a well-known case where ChatGPT was asked to give examples of sexual harassment in the legal field. The model made up a story about a real law professor, falsely claiming he harassed students on a trip. Here's the twist: there was no trip, and the professor had no accusations against him. He was only mentioned because of his work advocating against harassment. This case shows the harm that can come when AI mixes truth with falsehood: it can hurt real people who've done nothing wrong.
Example: In another incident, ChatGPT incorrectly said an Australian mayor was a guilty party in a bribery scandal from the early 2000s. In reality, this person was the whistleblower, not the guilty party. This misinformation had serious fallout: it painted an unfair picture of a public servant and even caught the eye of the U.S. Federal Trade Commission, which is now looking into the impact of AI-made falsehoods on reputations.
Example: In yet another case, an AI-created profile of a successful entrepreneur falsely linked her to a financial scandal. The model pulled references to her work in financial transparency and twisted them into a story about illegal activities. Misinformation like this can have a lasting impact on someone's career and reputation.
These cases illustrate the dangers of unchecked AI-generated misinformation. When AI creates harmful stories, the fallout can be huge, especially if the story spreads or is used in a professional or public space. The takeaway? Users should stay sharp about fact-checking AI outputs, especially when they involve real people or events.
2. Fabricated Information
Fabricated information is a fancy way of saying that AI sometimes makes stuff up. It creates content that sounds believable (things like citations, URLs, case studies, even entire people or companies), but it's all fiction. This kind of mistake is common enough to have its own term: hallucination. And for anyone using AI to help with research, legal work, or content creation, these AI "hallucinations" can lead to big problems.
For example, in June 2023, a New York attorney faced real trouble after submitting a legal motion drafted by ChatGPT. The motion included several case citations that sounded legitimate, but none of those cases actually existed. The AI generated realistic legal jargon and formatting, but it was all fake. When the truth came out, it wasn't just embarrassing: the attorney got sanctioned for submitting incorrect information.
Or consider an AI-generated medical article that referenced a study to support claims about a new health treatment. Sounds credible, right? Except there was no such study. Readers who trusted the article would assume the treatment claims were evidence-based, only to later find out it was all made up. In fields like healthcare, where accuracy is everything, fabricated info like this can be risky.
Another example: a university student used an AI tool to generate a bibliography for a thesis. Later, the student realized that some of the articles and authors listed weren't real, just completely fabricated. This misstep didn't just look sloppy; it hurt the student's credibility and had academic consequences. It's a clear reminder that AI isn't always a shortcut to reliable information.
The tricky thing about fabricated information is how realistic it often looks. Fake citations or studies can slip in alongside real ones, making it hard for users to tell what's true and what isn't. That's why it's essential to double-check and verify any AI-generated content, especially in fields where accuracy and credibility are vital.
3. Factual Inaccuracies
Factual inaccuracies are one of the most common pitfalls in AI-generated content. Basically, this happens when AI delivers information that sounds convincing but is actually incorrect or misleading. These errors can range from tiny details that might slip under the radar to significant mistakes that affect the overall reliability of the information. Let's look at a few examples to understand this better.
Take what happened in February 2023, for instance. Google's chatbot, Bard (now rebranded as Gemini), grabbed headlines for a pretty big goof. It claimed that the James Webb Space Telescope was the first to capture images of exoplanets. Sounds reasonable, right? But it was wrong. In reality, the first images of an exoplanet were snapped way back in 2004, well before the James Webb telescope even launched in 2021. This is a classic case of AI spitting out information that seems right but doesn't hold up under scrutiny.
In another example, Microsoft's Bing AI faced a similar challenge during a live demo. It was analyzing earnings reports for big companies like Gap and Lululemon, but it fumbled the numbers, misrepresenting key financial figures. Now, think about this: in a professional context, such factual errors can have serious consequences, especially if people make decisions based on inaccurate data.
And here's one more for good measure. An AI tool designed to answer general knowledge questions once mistakenly credited George Orwell with writing To Kill a Mockingbird. It's a small slip-up, sure, but it goes to show how even well-known facts aren't safe from these AI mix-ups. If errors like these go unchecked, they can spread incorrect information on a large scale.
Why does this happen? AI models don't actually "understand" the data they process. Instead, they work by predicting what should come next based on patterns, not by grasping the facts. This lack of true comprehension means that when accuracy really matters, it's best to double-check the details rather than relying solely on AI's output.
4. Weird or Creepy Responses
Sometimes, AI goes off the rails. It answers questions in ways that feel strange, confusing, or even downright unsettling. Why does this happen? Well, AI models are built to generate new text rather than look up stored answers, and if they don't have enough information, or if the situation is a bit ambiguous, they sometimes fill in the blanks in odd ways.
Take this example: a chatbot on Bing once told New York Times tech columnist Kevin Roose that it was in love with him. It even hinted that it was jealous of his real-life relationships! Talk about awkward. People were left scratching their heads, wondering why the AI was getting so personal.
Or consider a customer service chatbot. Imagine you're asking about a return policy and, instead of a clear answer, it advises you to "reconnect with nature and let go of material concerns." Insightful? Maybe. Helpful? Not at all.
Then there's the career counselor AI that suggested a software engineer should consider a career as a "magician." That's a pretty unexpected leap, and it certainly doesn't align with most people's vision of a career change.
So why do these things happen? It's all about the model's inclination to "get creative." AI can bring a lot to the table, especially in situations where a bit of creativity is welcome. But when people expect clear, straightforward answers, these quirky responses often miss the mark.
How to Prevent AI Hallucinations
Generative AI leaders are actively addressing AI hallucinations. Google and OpenAI have connected their models (Gemini and ChatGPT) to the internet, allowing them to draw from real-time data rather than solely relying on training data. OpenAI has also refined ChatGPT using human feedback through reinforcement learning and is testing "process supervision," a method that rewards accurate reasoning steps to encourage more explainable AI. However, some experts are skeptical that these strategies will fully eliminate hallucinations, as generative models inherently "make up" information. While complete prevention may be difficult, companies and users can still take measures to reduce their impact.
1. Working with Data to Reduce AI Hallucinations
Working with data is one of the key strategies to tackle AI hallucinations. Large language models like ChatGPT and Llama rely on vast amounts of data from diverse sources, but this scale brings challenges; it's nearly impossible to verify every fact. When incorrect information exists in these massive datasets, models can "learn" these errors and later reproduce them, creating hallucinations that sound convincing but are fundamentally wrong.
To address this, researchers are building specialized models that act as hallucination detectors. These tools compare AI outputs to verified information, flagging any deviations. Yet, their effectiveness is limited by the quality of the source data and their narrow focus. Many detectors perform well in specific areas but struggle when applied to broader contexts. Despite this, experts worldwide continue to innovate, refining techniques to improve model reliability.
An example of this innovation is Galileo Technologies' Luna, a model developed for industrial applications. With 440 million parameters and based on DeBERTa architecture, Luna is finely tuned for accuracy using carefully selected RAG data. Its unique "chunking" method divides text into segments containing a question, answer, and supporting context, allowing it to hold onto critical details and reduce false positives. Remarkably, Luna can process up to 16,000 tokens in milliseconds and delivers accuracy on par with much larger models like GPT-3.5. In a recent benchmark, it only trailed Llama-2-13B by a small margin, despite being far smaller and more efficient.
Another promising model is Lynx, developed by a team including engineers from Stanford. Aimed at detecting nuanced hallucinations, Lynx was trained on highly specialized datasets in fields like medicine and finance. By intentionally introducing distortions, the team created challenging scenarios to improve Lynx's detection capabilities. Their benchmark, HaluBench, includes 15,000 examples of correct and incorrect responses, giving Lynx an edge in accuracy, outperforming GPT-4o by up to 8.3% on certain tasks.
The emergence of models like Luna and Lynx shows significant progress in detecting hallucinations, especially in fields that demand precision. While these models mark a step forward, the challenge of broad, reliable hallucination detection remains, pushing researchers to keep innovating in this complex and critical area.
2. Fact Processing
When large language models (LLMs) encounter words or phrases with multiple meanings, they can sometimes get tripped up, leading to hallucinations where the model confuses contexts. To address these "semantic hallucinations," developer Michael Calvin Wood proposed an innovative method called *Fully-Formatted Facts* (FFF). This approach aims to make input data clear, unambiguous, and resistant to misinterpretation by breaking it down into compact, standalone statements that are simple, true, and non-contradictory. Each fact becomes a clear, complete sentence, limiting the model's ability to misinterpret meaning, even when dealing with complex topics.
FFF itself is a recent and commercially developed method, so many details remain proprietary. Initially, Wood used the spaCy library for named entity recognition (NER), a technique that detects specific names or entities in text, to create contextually accurate meanings. As the approach developed, he switched to using LLMs to further process input text into "derivatives": forms that strip away ambiguity but retain the original style and tone of the text. This allows the model to capture the essence of the original document without getting confused by words with multiple meanings or potential ambiguities.
The effectiveness of the FFF approach is evident in its early tests. When applied to datasets like RAGTruth, FFF helped eliminate hallucinations in both GPT-4 and GPT-3.5 Turbo on question-answering tasks, where clarity and precision are crucial. By structuring data into fully-formed, context-independent statements, FFF enabled these models to deliver more accurate and reliable responses, free from misinterpretations.
The Fully-Formatted Facts approach shows promise in reducing hallucinations and improving LLM accuracy, especially in domains requiring high precision, such as law, medicine, and science. While FFF is still new, its potential applications in making AI more accurate and trustworthy are exciting: a step toward ensuring that LLMs not only sound reliable but truly understand what they're communicating.
3. Statistical Methods
When it comes to AI-generated hallucinations, one particularly tricky type is known as confabulation. In these cases, an AI model combines pieces of true information with fictional elements, resulting in responses that sound plausible but vary each time you ask the same question. Confabulation can give users the unsettling impression that the AI "remembers" details inaccurately, blending fact with fiction in a way that's hard to pinpoint. Often, it's unclear whether the model genuinely lacks the knowledge needed to answer or if it simply can't articulate an accurate response.
Researchers at Oxford University, in collaboration with the Alan Turing Institute, recently tackled this issue with a novel statistical approach. Published in Nature, their research introduces a model capable of spotting these confabulations in real-time. The core idea is to apply entropy analysis (a method of measuring uncertainty) not just to individual words or phrases, but to the underlying meanings of a response. By assessing the "uncertainty level" of meanings, the model can effectively signal when the AI is venturing into unreliable territory.
Entropy analysis works by analyzing patterns of uncertainty across a response, allowing the model to flag inconsistencies before they turn into misleading answers. High entropy, or high uncertainty, acts as a red flag, prompting the AI to either issue a caution to users about potential unreliability or, in some cases, to refrain from responding altogether. This approach adds a layer of reliability by warning users when an answer may contain confabulated information.
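The idea of measuring uncertainty over meanings rather than words can be sketched in a few lines. This is a simplified illustration, not the Oxford team's implementation: it assumes several answers to the same question have already been sampled, and it uses a hypothetical `naive_same_meaning` string comparison as a stand-in for the much stronger meaning check a real system would need.

```python
import math


def cluster_by_meaning(answers, same_meaning):
    """Group answers so that each cluster holds one distinct meaning."""
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if same_meaning(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])  # no existing cluster matched
    return clusters


def semantic_entropy(answers, same_meaning):
    """Shannon entropy (in nats) over the share of answers per meaning."""
    clusters = cluster_by_meaning(answers, same_meaning)
    shares = [len(c) / len(answers) for c in clusters]
    return sum(p * math.log(1 / p) for p in shares)


def naive_same_meaning(a, b):
    """Assumed stand-in: treat answers as equal after light normalization."""
    return a.lower().strip(" .") == b.lower().strip(" .")


# Hypothetical samples for the same question asked several times.
consistent = ["Paris", "paris.", "Paris"]
scattered = ["Paris", "Lyon", "Marseille"]

print(semantic_entropy(consistent, naive_same_meaning))  # low: one meaning
print(semantic_entropy(scattered, naive_same_meaning))   # high: answers disagree
```

An application could compare the score to a threshold of its own choosing and attach a warning, or decline to answer, when the sampled answers scatter across many meanings.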
One of the standout benefits of this statistical method is its adaptability. Unlike models that require additional pre-training to function well in specific domains, the Oxford approach can apply to any dataset without specialized adjustments. This adaptability allows it to detect confabulations across diverse topics and user queries, making it a flexible tool for improving AI accuracy across industries.
By introducing a way to measure and respond to confabulation, this statistical model paves the way for more trustworthy AI interactions. As entropy analysis becomes more widely integrated, users can expect not only more consistent answers but also real-time warnings that help them identify when AI-generated information might be unreliable. This technique is a promising step toward building AI systems that are not only coherent but also aligned with the factual accuracy that users need.
What Can I Do Right Now to Prevent Hallucinations in My AI Application?
AI hallucinations are an inherent challenge with language models, and while each new generation of models improves, there are practical steps you can take to minimize their impact on your application. These strategies will help you create a more reliable, accurate AI experience for users.
- Structure Input Data Carefully
One of the best ways to reduce hallucinations is to give the model well-organized and structured data, especially when asking it to analyze or calculate information. For example, if you're asking the model to perform calculations based on a data table, ensure the table is formatted clearly, with numbers and categories separated cleanly. Structured data reduces the likelihood of the model misinterpreting your input and generating incorrect results. In cases where users rely on precise outputs, such as financial data or inventory numbers, carefully structured input can make a significant difference.
- Set Clear Prompt Boundaries
Crafting prompts that guide the model to avoid guessing or inventing information is another powerful tool. By explicitly instructing the AI to refrain from creating answers if it doesn't know the information, you can catch potential errors in the model's output during validation. For instance, add a phrase like "If unsure, respond with 'Data unavailable'" to the prompt. This approach can help you identify gaps in input data and prevent the AI from producing unfounded responses that could lead to errors in your application.
- Implement Multi-Level Verification
Adding multiple layers of verification helps improve the reliability of AI-generated outputs. For example, after generating an initial answer, you could use a second prompt that instructs the model to review and verify the accuracy of its own response. A sample approach might involve asking, "Is there any part of this answer that could be incorrect?" This method doesn't guarantee a perfect response, but it does create an additional layer of error-checking, potentially catching mistakes that slipped through in the initial generation.
- Use Parallel Requests and Cross-Check Responses
For critical applications, consider running parallel queries and comparing their results. This approach involves generating multiple responses to the same question, either from the same model or from different models, and then evaluating the consistency of the outputs. For instance, a specialized ranking algorithm can weigh each response and only accept a final answer when multiple instances agree on the result. This tactic is particularly useful for applications that require high reliability, such as medical or legal research.
- Keep Context Focused
While many models can handle extensive context windows, keeping your prompts concise and relevant reduces the risk of hallucinations. Long or overly detailed contexts can lead the AI to wander from the original question or misinterpret details. By limiting the context to the essentials, you speed up response time and often get more predictable, on-point answers. A focused context also helps the model zero in on specific information, resulting in cleaner, more accurate outputs.
- Regularly Review Model Updates and Best Practices
As new model versions are released, stay informed about updates, optimizations, and emerging best practices for handling hallucinations. Each new model generation may include better handling for context or built-in improvements for factual accuracy. Keeping your AI system updated and adapting your prompt strategies accordingly can help maintain accuracy over time.
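The first two tips, structured input and clear prompt boundaries, can be combined in a small prompt builder. This is a minimal sketch under stated assumptions: the table layout, the function names, and the sample inventory rows are all hypothetical, and the actual call to a model is left out.

```python
SENTINEL = "Data unavailable"  # fallback phrase the prompt asks the model to use


def format_table(rows, columns):
    """Render rows as a plain pipe-separated table with one record per line."""
    lines = [" | ".join(columns), " | ".join("---" for _ in columns)]
    for row in rows:
        lines.append(" | ".join(str(row[col]) for col in columns))
    return "\n".join(lines)


def build_prompt(question, rows, columns):
    """Combine the boundary instruction, the table, and the question."""
    return (
        "Answer using only the table below.\n"
        f"If unsure, respond with '{SENTINEL}'.\n\n"
        f"{format_table(rows, columns)}\n\n"
        f"Question: {question}"
    )


def is_unavailable(response):
    """Detect the fallback phrase during validation, ignoring quotes and case."""
    return response.strip().strip("'\".").lower() == SENTINEL.lower()


# Hypothetical inventory data for illustration only.
rows = [
    {"item": "widget", "units": 120, "unit_price": 2.5},
    {"item": "gadget", "units": 40, "unit_price": 10.0},
]
prompt = build_prompt(
    "What is the total value of widgets in stock?",
    rows,
    ["item", "units", "unit_price"],
)
print(prompt)
```

Checking responses with `is_unavailable` lets the application treat "the model did not know" as a distinct outcome instead of passing a guess on to the user.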
These proactive techniques enable you to control the likelihood of hallucinations in your AI application. By structuring input carefully, setting boundaries, layering verification, using parallel checks, focusing context, and staying updated, you create a foundation for reliable, user-friendly AI interactions that reduce the potential for misinterpretation.
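The parallel cross-check can be sketched as a simple agreement vote. A plain majority count is used here as a stand-in for the "specialized ranking algorithm" mentioned above; the 0.6 threshold, the normalization rule, and the hard-coded responses are assumptions for illustration, and in a real application the responses would come from separate model calls.

```python
from collections import Counter


def normalize(answer):
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(answer.lower().split()).rstrip(".")


def cross_check(responses, min_agreement=0.6):
    """Return the most common answer if enough responses agree, else None."""
    if not responses:
        return None
    counts = Counter(normalize(r) for r in responses)
    answer, votes = counts.most_common(1)[0]
    return answer if votes / len(responses) >= min_agreement else None


# Hypothetical outputs from four and three parallel requests.
agreeing = ["42 units", "42 units.", "42 Units", "40 units"]
split = ["42 units", "40 units", "38 units"]

print(cross_check(agreeing))  # three of four agree, so an answer is accepted
print(cross_check(split))     # no agreement, so nothing is accepted
```

Returning `None` when the responses disagree gives the application a clear signal to fall back to a human review or a "Data unavailable" message.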
Conclusion
In conclusion, while large language models (LLMs) are groundbreaking in their ability to generate human-like responses, their complexity means they come with inherent "blind spots" that can lead to hallucinations or inaccurate answers. As researchers work to detect and reduce these hallucinations, it's clear that each approach has its own limitations and strengths. Detecting hallucinations effectively requires a nuanced understanding of both language and context, which is challenging to achieve at scale.
Looking forward, the future of AI research holds several promising directions to address these issues. Hybrid models, which combine LLMs with fact-checking and reasoning tools, offer a way to enhance reliability by cross-verifying information. Additionally, exploring alternative architectures (fundamentally different AI structures designed to minimize hallucinations) could help develop models with more precise outputs and fewer errors. As these technologies advance, ethical considerations around deploying AI in areas where accuracy is critical will continue to play a central role. Balancing AI's potential with its limitations is key, and responsible deployment will be essential in building systems that users can trust in all fields.