Lesson 3 of 6
Hallucinations — and how to catch them
What you'll learn
- Explain why a next-token predictor invents
- Use three reliable detection moves on the spot
- Decide when verification is mandatory vs optional
Hallucinations are not a bug to be fixed. They are the natural output of a system optimized to produce plausible next tokens. This lesson teaches you why they happen, how to spot them, and — just as important — when you can relax about them.
Why models hallucinate
Remember Lesson 1: the model predicts the next token based on patterns in its training data. Here is the key insight — the training data contains truths and untruths mixed together. Wikipedia articles sit alongside forum posts with bad advice. Accurate medical studies share the dataset with blog posts that misinterpret them. The model learned all of it, and it has no internal mechanism to label which patterns correspond to facts and which correspond to plausible-sounding fiction.
When you ask a model a question, it does not retrieve a fact from a database. It completes a pattern. If the most likely completion is factually correct, great. But if the most likely completion is something that sounds authoritative and plausible while being wrong, the model will produce it with the same confidence. It has no concept of "I am not sure about this."
This is why hallucinations tend to be convincing. The model is not generating random nonsense. It is generating text that fits the pattern of what a correct, authoritative answer looks like. Fake academic papers get real-sounding author names, plausible journal titles, and reasonable-sounding conclusions. Fake API documentation includes correctly formatted code samples that call methods that do not exist.
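To make that concrete, here is a small hypothetical sketch of the pattern. The `pandas` library is real, but the method the imaginary model suggests is invented for this example, which is exactly how hallucinated code tends to fail:

```python
# Hypothetical illustration: hallucinated code is well formatted and plausibly
# named, but calls things that do not exist. `pandas` is real; the suggested
# method below is invented for this example.
import pandas as pd

df = pd.DataFrame({"month": ["Jan", "Feb"], "sales": [120, 95]})

# A model might confidently suggest a line like:
#     summary = df.auto_summarize(granularity="monthly")
# It reads like idiomatic pandas, but no such method exists on a DataFrame:
print(hasattr(df, "auto_summarize"))  # prints: False
```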
The confidence problem
One of the most dangerous aspects of hallucinations is that the model's tone does not change. A model states a fabricated statistic with the same confident, helpful tone it uses for a well-established fact. There is no stutter, no hedge, no "I think" that reliably signals uncertainty.
Some models have gotten better at expressing uncertainty — saying "I'm not entirely sure" or "you may want to verify this." But this is trained behavior, not true calibration. The model learned that sometimes it should add uncertainty language, but it cannot reliably determine when. It may hedge on something it actually knows well, and speak confidently about something it is inventing.
The practical takeaway: you cannot use the model's confidence level as a proxy for accuracy. You need a different strategy.
Three detection moves
Here are three reliable moves you can use immediately to catch hallucinations:
Move 1: Ask for sources, then verify them. If the model cites a paper, a URL, a law, or a specification, check it independently. Do not ask the model "is this source real?" — it will often confirm its own hallucination. Open a browser and verify. This single habit catches the majority of dangerous hallucinations.
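Move 1 is easy to partially automate. Here is a minimal sketch, assuming the third-party `requests` library is installed and using placeholder URLs, that checks whether cited links even resolve. A successful response only proves the page exists; you still have to read it to confirm it says what the model claims.

```python
# Minimal sketch: check whether model-cited URLs resolve at all.
# Assumes the third-party `requests` library; the URLs are placeholders
# standing in for whatever links the model actually cited.
import requests

cited_urls = [
    "https://example.com/cited-paper",  # placeholder for a cited link
    "https://example.com/cited-docs",   # placeholder for another cited link
]

for url in cited_urls:
    try:
        # A HEAD request is enough to see whether the page exists at all.
        response = requests.head(url, allow_redirects=True, timeout=10)
        print(f"{url} -> HTTP {response.status_code}")
    except requests.RequestException as exc:
        print(f"{url} -> unreachable ({exc})")
```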
Move 2: Cross-check key claims. If the model states a specific fact — a number, a date, a technical specification — verify it against a second source. You do not need to verify every sentence. Focus on the claims that matter: the ones you will base decisions on.
Move 3: Notice "confident specificity" you did not earn. This is the subtlest detection skill. When a model gives you very specific details that you did not provide and did not ask about — exact percentages, precise dates, named individuals — treat that as a signal to verify. Real knowledge tends to come with context and qualifications. Hallucinated knowledge tends to be surprisingly specific and clean.
When verification is mandatory
Not all outputs carry the same hallucination risk. Here is a practical framework:
Always verify:
- Facts, statistics, and numbers
- URLs and links
- Academic citations and references
- Legal or medical claims
- Dates of specific events
- Names of real people paired with specific claims
- API documentation and technical specifications
Verification optional (lower risk):
- Creative writing and brainstorming
- Code structure and architecture suggestions (you will test them anyway)
- Explanations of well-known concepts
- Formatting and restructuring of content you provided
- Summaries of text you supplied in the prompt
The distinction is simple: if the output depends on the model "knowing" an external fact, verify. If the output depends on the model's ability to reason about or transform information you provided, the hallucination risk is much lower.
Reducing hallucination risk
You can also reduce the likelihood of hallucinations through how you prompt:
- Provide the source material. Instead of asking "What does the React documentation say about hooks?", paste the relevant documentation and ask the model to work from it. The model hallucinates less when it can pattern-match against provided text rather than rely on training memory (see the sketch after this list).
- Ask the model to say when it does not know. Adding "If you are not sure, say so rather than guessing" to your system prompt does help, though it is not foolproof.
- Break complex questions into verifiable steps. Instead of "Analyze this company's financials and give me a recommendation," ask for specific, checkable outputs: "List the revenue numbers from this report" (verifiable), then "What trends do you see?" (interpretive, lower risk).
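Here is a minimal sketch of those three habits in code. The `call_model` function is a hypothetical stand-in for whatever chat client or API you actually use, and the document text is a placeholder.

```python
# Minimal sketch of the three prompting habits above. `call_model` is a
# hypothetical placeholder, not a real API; wire it to your own model client.
def call_model(system_prompt: str, user_prompt: str) -> str:
    # Placeholder: replace with a call to your actual model or SDK.
    return "(model reply goes here)"

# Habit: ask the model to admit uncertainty (helps, but is not foolproof).
SYSTEM = "If you are not sure, say so rather than guessing."

# Habit: provide the source material so the model works from supplied text
# instead of training memory.
report_text = "...paste the relevant report or documentation here..."
grounded = f"Work only from the document below.\n\n--- DOCUMENT ---\n{report_text}"

# Habit: split the request into a checkable step and an interpretive step.
facts = call_model(SYSTEM, grounded + "\n\nList the revenue numbers stated in this document.")
trends = call_model(SYSTEM, f"Given these revenue numbers:\n{facts}\nWhat trends do you see?")
```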
The right mental model
Do not think of AI hallucination as a flaw that will eventually be fixed. Think of it as an inherent property of how these systems work. They will get better — they already have — but a system that predicts plausible text will always sometimes produce plausible-but-wrong text. The question is not "does this model hallucinate?" The question is "what is my verification strategy for this use case?"
The best AI practitioners are not the ones who never encounter hallucinations. They are the ones who have built habits and systems that catch them before they cause damage.
What is next
Now that you understand the trust landscape — what models get right and where they invent — we need a different kind of control. The next lesson covers temperature and sampling: the knobs that let you dial the model's creativity up or down, and why the right setting depends entirely on what you are trying to do.
Why the model hallucinates and how to catch it
A language model does not retrieve facts from a database. It completes patterns learned from training data that mixes truth and falsehood without distinction. When the most likely completion is factually wrong but sounds convincing, the model generates it with the same confidence it gives a well-established fact. That is why hallucinations are convincing: academic references with realistic author names, statistics with precise percentages, links in the correct format, all of them fabricated.
There are three reliable detection moves. First, ask for sources and then verify them yourself (do not ask the model whether its source is real). Second, cross-check key claims against a second source. Third, notice when the model offers extremely precise details you did not ask for; that is a warning sign. Verification is mandatory for facts, numbers, links, dates, and references. In creative writing and brainstorming, the risk is far lower.
Hallucination is not a flaw that will one day be fully fixed. It is a natural property of a system that predicts plausible text. The question is not "does this model hallucinate?" but "what is my verification strategy for this use case?"
Try it yourself
Ask a model for 10 academic references on a topic you know well. Manually check each one. How many are real? How many of the fakes look convincing? That gap is what you need to train your eye to catch.
Reflect
Think about the last time you used an AI response without verifying it. What category did it fall into — something you should have checked, or something where hallucination risk was genuinely low?