Hallucination detection

Hallucination detection is the process of identifying when an AI language model generates output that is factually incorrect, unsupported by its source material, or internally contradictory — commonly called an AI hallucination. A hallucination detector is a system, model, or pipeline component that evaluates AI-generated text and flags or filters responses that contain claims the underlying evidence does not support. In production AI deployments — particularly customer-facing ones — hallucination detection is a critical reliability layer that sits between the language model and the end user.

Reported benchmark figures give a rough baseline: large language models without grounding or detection layers hallucinate on approximately 3–8% of responses in general-domain tasks, rising to 15–25% on domain-specific queries where the model’s training data is sparse. In a customer support context handling 10,000 tickets per day, even a 3% hallucination rate means 300 incorrect responses daily, a material risk to customer trust and regulatory compliance.
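The arithmetic behind that figure can be reproduced in a few lines. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the rates are the ones quoted above, applied to the same 10,000 tickets per day.

```python
def expected_incorrect_per_day(tickets_per_day: int, hallucination_rate: float) -> float:
    """Expected number of hallucinated responses per day at a given rate."""
    return tickets_per_day * hallucination_rate


# Rates quoted in the text: 3-8% general-domain, 15-25% domain-specific.
for rate in (0.03, 0.08, 0.15, 0.25):
    per_day = expected_incorrect_per_day(10_000, rate)
    print(f"{rate:.0%} hallucination rate: about {per_day:.0f} incorrect responses per day")
```

At the low end this gives the 300 responses per day mentioned above; at 25% the same volume would produce about 2,500.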

How hallucination detection works

Hallucination detectors operate using several distinct techniques, often combined in a layered pipeline. The most common approach is entailment checking: a secondary model (or the same model re-prompted) evaluates whether the generated response is logically entailed by the retrieved source documents. If the response makes a claim that the sources do not support, it is flagged as a hallucination. This is closely related to AI grounding, the practice of anchoring responses to verified facts, because grounded systems have a reference corpus against which entailment can be checked.
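A minimal sketch of the control flow of an entailment check follows. All names here are hypothetical. The scorer is passed in as a function so that a real natural language inference (NLI) model could be plugged in; the `toy_overlap_scorer` below is only a token-overlap stand-in so the example runs without a model, and it is not real entailment.

```python
import re
from typing import Callable, List, Sequence

# A scorer maps (source_text, claim) to a score in [0, 1]; higher means better supported.
EntailmentScorer = Callable[[str, str], float]


def split_claims(response: str) -> List[str]:
    """Naive splitter: treat each sentence of the response as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def unsupported_claims(
    response: str,
    sources: Sequence[str],
    scorer: EntailmentScorer,
    threshold: float = 0.5,  # illustrative default, not a recommended value
) -> List[str]:
    """Return the claims whose best score across all sources is below the threshold."""
    flagged = []
    for claim in split_claims(response):
        best = max((scorer(src, claim) for src in sources), default=0.0)
        if best < threshold:
            flagged.append(claim)
    return flagged


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_overlap_scorer(source: str, claim: str) -> float:
    """STAND-IN ONLY: fraction of claim tokens found in the source.

    Token overlap is not entailment (it can miss a changed number or a
    negation); replace this with an NLI model in any real system.
    """
    claim_tokens = _tokens(claim)
    if not claim_tokens:
        return 1.0
    return len(claim_tokens & _tokens(source)) / len(claim_tokens)


sources = ["Items can be returned within 30 days of delivery."]
response = "Items can be returned within 30 days. Refunds are issued in 24 hours."
print(unsupported_claims(response, sources, toy_overlap_scorer, threshold=0.8))
# prints ['Refunds are issued in 24 hours.']
```

Scoring each sentence against every source and keeping the best score is one simple design choice; it assumes a claim only needs support from a single document, which would not hold for claims that combine facts across documents.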

A second technique is self-consistency sampling: the model answers the same question several times, with sampling randomness varied between runs (for example, different random seeds or a nonzero temperature). If the answers diverge significantly, the response is likely to be hallucinated or at least uncertain. A third approach is confidence scoring: some models expose log-probabilities for each generated token, and low-confidence tokens in factual spans (proper nouns, numbers, dates) can be flagged for review. Finally, rule-based post-processing checks can catch specific hallucination patterns, for example a generated phone number that does not match any record in the knowledge base.
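The three techniques in this paragraph can each be sketched in a few lines. Everything below is an illustrative assumption rather than a reference implementation: the function names, the exact-match comparison of answers, the -2.0 log-probability cutoff, the "contains a digit or starts with a capital letter" heuristic for factual tokens, and the phone-number pattern are all invented for the example. The sampled answers and token log-probabilities are assumed to come from whatever model API is in use.

```python
import re
from collections import Counter
from typing import Iterable, List, Sequence, Set, Tuple


def agreement_rate(answers: Sequence[str]) -> float:
    """Self-consistency: share of sampled answers equal to the most common one."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    normalized = [" ".join(a.lower().split()) for a in answers]
    _, count = Counter(normalized).most_common(1)[0]
    return count / len(normalized)


def low_confidence_tokens(
    token_logprobs: Iterable[Tuple[str, float]],
    min_logprob: float = -2.0,  # illustrative cutoff, not a calibrated value
) -> List[str]:
    """Confidence scoring: flag factual-looking tokens with a low log-probability."""
    flagged = []
    for token, logprob in token_logprobs:
        text = token.strip()
        # Crude heuristic for "factual span": has a digit or starts uppercase.
        looks_factual = any(ch.isdigit() for ch in text) or text[:1].isupper()
        if looks_factual and logprob < min_logprob:
            flagged.append(text)
    return flagged


# Toy pattern for phone-like strings; it will also match other long digit runs.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def unknown_phone_numbers(response: str, known_numbers: Set[str]) -> List[str]:
    """Rule-based check: phone-like strings whose digits match no known record."""
    known_digits = {re.sub(r"\D", "", n) for n in known_numbers}
    found = PHONE_PATTERN.findall(response)
    return [n for n in found if re.sub(r"\D", "", n) not in known_digits]


print(agreement_rate(["30 days", "30 days", "60 days"]))
print(low_confidence_tokens([("The", -0.1), (" deadline", -0.3), (" is", -0.05),
                             (" 45", -3.2), (" days", -0.2)]))  # prints ['45']
print(unknown_phone_numbers("Call 555-010-9999 for help.", {"(555) 010-1234"}))
# prints ['555-010-9999']
```

Exact-match agreement only works for short answers; for free-form text, answers would need to be compared by meaning (for instance with an entailment or similarity model), which this sketch does not attempt.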

Why hallucination detection matters

  • Customer trust: A single confident but incorrect response — wrong return deadline, wrong refund amount, wrong policy — can destroy a customer relationship. Detection layers intercept these before they reach the customer.
  • Regulatory compliance: In regulated industries (financial services, healthcare, insurance), a hallucinated policy statement is not just a customer experience failure — it can constitute a compliance violation. Detection and human review pipelines are a prerequisite for deployment in these verticals.
  • Model improvement: Logged hallucination events are among the highest-signal training data for fine-tuning. A detection system that captures every flagged response creates a continuously growing dataset for model correction.

Hallucination detection vs. AI grounding

Hallucination detection and AI grounding (including RAG) are complementary but distinct. Grounding is a preventive measure: it supplies the model with authoritative source documents at inference time, reducing the likelihood that the model fabricates information. Hallucination detection is a corrective measure: it evaluates the output after generation and catches fabrications that slipped through. Best-practice production systems deploy both — grounding to minimize hallucination frequency, detection to catch residual failures.

A common misconception is that retrieval-augmented generation (RAG) eliminates hallucinations entirely. In practice, RAG reduces them significantly but does not eliminate them. A model can still misread a retrieved document, synthesize across documents incorrectly, or over-extrapolate beyond what the source says. This is why hallucination detection remains necessary even in fully grounded systems.

Hallucination detection in AI customer support

In AI customer support platforms, hallucination detection is typically implemented as a verification step in the response pipeline. Before a generated response is delivered to the customer, it is checked against the retrieved knowledge base articles or policy documents used to generate it. Responses that fail the entailment check are either suppressed and replaced with a safe fallback, routed to a human-in-the-loop review queue, or flagged for async quality review. The threshold for flagging is a key tuning parameter: too sensitive and the system over-routes to humans, raising cost; too lenient and hallucinations reach customers.
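The verification step described above might be sketched as follows. The names (`FailurePolicy`, `Decision`, `route_response`), the fallback wording, and the reading of "flagged for async quality review" as "deliver now, review later" are assumptions made for illustration; the entailment score is assumed to come from a separate detector.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailurePolicy(Enum):
    """What to do with a response that fails the entailment check."""
    SAFE_FALLBACK = "safe_fallback"  # suppress and send a safe canned reply
    HUMAN_REVIEW = "human_review"    # hold the response for a human agent
    ASYNC_REVIEW = "async_review"    # assumed: deliver, but log for later QA


@dataclass
class Decision:
    route: str
    deliver_text: Optional[str]  # text sent to the customer now; None if held
    flagged: bool


DEFAULT_FALLBACK = "I want to make sure you get accurate information. Let me connect you with a teammate."


def route_response(
    response: str,
    entailment_score: float,
    threshold: float,
    policy: FailurePolicy,
    fallback_text: str = DEFAULT_FALLBACK,
) -> Decision:
    """Deliver responses at or above the threshold; otherwise apply the failure policy."""
    if entailment_score >= threshold:
        return Decision(route="deliver", deliver_text=response, flagged=False)
    if policy is FailurePolicy.SAFE_FALLBACK:
        return Decision(route=policy.value, deliver_text=fallback_text, flagged=True)
    if policy is FailurePolicy.HUMAN_REVIEW:
        return Decision(route=policy.value, deliver_text=None, flagged=True)
    return Decision(route=policy.value, deliver_text=response, flagged=True)


print(route_response("You have 30 days to return it.", 0.62, 0.8, FailurePolicy.HUMAN_REVIEW))
```

Raising `threshold` sends more responses down the failure path (more human work, fewer escaped hallucinations); lowering it does the opposite, which is the trade-off the paragraph describes.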

Production teams typically start with conservative thresholds, for example flagging anything below 0.85 entailment confidence, and then loosen them (lower the cutoff) as they build confidence in the detection model’s calibration. Tracking hallucination rate as a first-class metric alongside resolution rate and CSAT is a sign of a mature AI operations practice. Knowledge base quality directly affects detection accuracy: well-structured, comprehensive source documents make entailment checks more reliable and reduce false-positive flags on correct but loosely phrased responses.
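One way to reason about loosening the threshold is to replay a labeled sample of past responses at different cutoffs. The sketch below assumes each sample is a pair of (entailment score, whether a human judged it a hallucination); the function name, the metric names, and the ten data points are invented for illustration and are not real measurements.

```python
from typing import Dict, List, Tuple


def threshold_report(samples: List[Tuple[float, bool]], threshold: float) -> Dict[str, float]:
    """Summarize what a given flagging threshold would have done on labeled samples.

    Each sample is (entailment_score, is_hallucination). A response is flagged
    when its score is below the threshold. All rates are shares of all responses.
    """
    if not samples:
        raise ValueError("need at least one labeled sample")
    total = len(samples)
    flagged = [(s, h) for s, h in samples if s < threshold]
    delivered = [(s, h) for s, h in samples if s >= threshold]
    return {
        "flag_rate": len(flagged) / total,
        "false_flag_rate": sum(1 for _, h in flagged if not h) / total,
        "escaped_hallucination_rate": sum(1 for _, h in delivered if h) / total,
    }


# Made-up labeled sample, for illustration only.
samples = [
    (0.95, False), (0.90, False), (0.88, False), (0.80, False), (0.70, True),
    (0.92, True), (0.60, True), (0.97, False), (0.86, False), (0.40, True),
]

for cutoff in (0.85, 0.75):
    print(cutoff, threshold_report(samples, cutoff))
```

On this toy data, moving the cutoff from 0.85 to 0.75 drops the flag rate from 0.4 to 0.3 without letting any additional hallucination through. Real data will not be this clean, which is why the escaped hallucination rate is worth tracking alongside resolution rate and CSAT.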
