Hallucination
Definition
The generation by an AI model of statements that are fluent and confident but factually incorrect, unsupported by the provided context, or entirely fabricated. In agent systems, hallucinations are especially dangerous because agents act on their outputs: a hallucinated API endpoint gets called, a hallucinated data point gets written to a database, a hallucinated recommendation gets sent to a user. Common hallucination types are factual (wrong facts), citation (fake or misattributed sources), logical (invalid reasoning), and confabulation (filling knowledge gaps with plausible fiction).
Builder Context
Hallucination mitigation in agents requires multiple layers. (1) Grounding: always provide relevant context via RAG before asking the agent to generate factual claims. (2) Structured output: constrain the agent to select from known options (enum values, database records) rather than generating freeform names. (3) Verification: for high-stakes claims, have the agent check its statements against a second source. (4) Confidence calibration: teach the agent to say 'I don't know' when it lacks sufficient evidence. The single most effective intervention is to include the instruction 'only make claims supported by the provided context' in the system prompt. A sketch of layers (1), (2), and (4) follows below.
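The following is a minimal sketch, not a definitive implementation: the llm callable stands in for whatever model client the agent actually uses, and SupportTier, grounded_answer, and parse_tier are hypothetical names introduced only for illustration. It shows grounding the prompt in retrieved context, constraining a classification to known enum values, and instructing the model to refuse when the context is insufficient.

```python
from enum import Enum
from typing import Callable

# Layer 2: constrain the agent to known options instead of freeform names.
# (SupportTier is a hypothetical enum for illustration.)
class SupportTier(str, Enum):
    FREE = "free"
    PRO = "pro"
    ENTERPRISE = "enterprise"

# Layers 1 and 4: restrict claims to the provided context and give the model
# an explicit way to say it does not know.
SYSTEM_PROMPT = (
    "Only make claims supported by the provided context. "
    "If the context does not contain the answer, reply exactly: I don't know."
)

def grounded_answer(
    llm: Callable[[str, str], str],  # (system_prompt, user_prompt) -> completion text
    question: str,
    retrieved_chunks: list[str],
) -> str:
    # Layer 1: ground the model in retrieved context before asking for factual claims.
    context = "\n\n".join(retrieved_chunks)
    user_prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return llm(SYSTEM_PROMPT, user_prompt).strip()

def parse_tier(raw: str) -> SupportTier | None:
    # Layer 2 enforcement: anything outside the known enum is treated as a
    # possible hallucination and rejected rather than acted upon.
    try:
        return SupportTier(raw.strip().lower())
    except ValueError:
        return None
```

Layer (3), verification, is omitted here; in practice it usually means a second model call or a lookup against an authoritative source whose result is compared with the first answer before the agent acts on it.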