Perceptual Hallucination Overview
- Perceptual hallucination is the emergence of spurious yet plausible sensory content when systems interpolate from incomplete or ambiguous input.
- It spans biological and artificial domains, manifesting in vision-language models, generative frameworks, speech enhancement, and robotics with domain-specific evaluation metrics.
- Mitigation strategies involve grounding outputs with robust perceptual evidence, contrastive decoding, and data-centric training to reduce ungrounded predictions.
Perceptual hallucination denotes the emergence of spurious, plausible—yet incorrect—perceptual content produced by a system tasked with interpreting, generating, or acting upon sensory input. This phenomenon spans biological, cognitive, and artificial domains, including computational models for vision, language, robotics, speech enhancement, and neuroscience. The concept has evolved to encompass instantiations in both humans (classical hallucinations) and artificial systems (object, relation, or content hallucinations), with each domain offering specialized theoretical and empirical frameworks.
1. Formal Definitions and Domains of Perceptual Hallucination
Perceptual hallucination is characterized by the generation or assertion of observations (visual, auditory, tactile, etc.) unsupported by the input. Across the domains surveyed here, this typically manifests as:
- Vision–Language Models (VLMs): Outputting objects, relations, or attributes not present in the input image, or confusing the spatial arrangement after perturbations (Shin et al., 6 May 2026, Park et al., 10 Jun 2025, Liu et al., 23 May 2025, Liu et al., 19 May 2026).
- World Models and Generative Models: Reconstructing plausible yet incorrect frames or scene layouts under out-of-distribution conditions (Hansen et al., 25 Jun 2026, Kim et al., 3 Dec 2025).
- Speech Enhancement: Generating speech-like artifacts, phonemes, or inflections absent in the clean reference, often due to insufficient input cues or overfit perceptual metrics (Shetu et al., 1 Jun 2026, Close et al., 2024, Rong et al., 10 Mar 2026).
- Robotics: Injecting synthetic obstacles into sensory representations to induce advantageous but "false" local environment perception (Park et al., 2022).
- Theoretical Neuroscience: The brain's over-weighting of priors or under-weighting of sensory evidence, resulting in subjective experience untethered from the environment (Barros, 4 Mar 2025, Benrimoh et al., 2023, Deistler et al., 2019, 1706.03619).
Typically, perceptual hallucination is formalized as:
- The presence of a predicted percept (object, relation, phoneme, etc.) that is absent from the true input.
- The probability that the output generated for a given image and prompt contains ungrounded content (Park et al., 10 Jun 2025).
- Hallucination rates: the fraction of predicted objects or relations absent from the input, reported for relations (Shin et al., 6 May 2026) and for object hallucinations (Chen et al., 1 May 2026).
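The rate-based formalization above can be illustrated with a minimal sketch. It assumes that predictions and ground truth are available as sets of hashable items; the function name `hallucination_rate`, the triple encoding of relations, and the convention of returning 0.0 for an empty prediction set are illustrative assumptions, not definitions taken from the cited works.

```python
def hallucination_rate(predicted, ground_truth):
    """Fraction of predicted items that are absent from the ground-truth set."""
    predicted = set(predicted)
    if not predicted:
        # Assumed convention: no predictions means nothing was hallucinated.
        return 0.0
    hallucinated = predicted - set(ground_truth)
    return len(hallucinated) / len(predicted)


# Hypothetical relations encoded as (subject, predicate, object) triples.
gt = {("cup", "on", "table"), ("cat", "under", "table")}
pred = [("cup", "on", "table"), ("cat", "on", "table"), ("dog", "near", "cat")]

rate = hallucination_rate(pred, gt)  # two of the three predictions are unsupported
```

The same function applies to object hallucination by passing sets of object labels instead of relation triples.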
2. Taxonomy, Causes, and Characteristic Modes
Perceptual hallucinations are decomposed into distinct causes and manifestations depending on the modality and the architecture:
Vision–Language and Multimodal Models
- Object Hallucination: Asserting objects not visible in the input; probed by benchmarks such as POPE, MMStar, and MMBench (Park et al., 10 Jun 2025).
- Relation Hallucination: Predicting incorrect inter-object relationships, especially vulnerable to image rotation, noise, and geometric perturbations (Shin et al., 6 May 2026, Zhou et al., 28 May 2026).
- Reasoning Drift/Chain-of-Thought Overreach: Long reasoning chains lead models to depend increasingly on language priors, drifting from visual evidence; measurable by RH-AUC (Liu et al., 23 May 2025).
- Biases: Over-reliance on co-occurrence statistics, language priors, or cross-image comparison failures (e.g., missing micro-edits) (Zhou et al., 28 May 2026).
Generative and World Models
- Perceptual Hallucination: The vision tokenizer reconstructs out-of-distribution (OOD) inputs as the nearest known prototype; a large round-trip residual indicates off-manifold predictions (Hansen et al., 25 Jun 2026).
- Action-Marginalization/Scene-Divergence: Errors stemming from ignoring input actions or the accumulation of inaccuracies across autoregressive rollouts.
Speech Enhancement
- Linguistic Artifacts/Infilling: Generative models (GANs, flows, diffusion) may introduce spurious phonemes/words when the conditioning is weak, observable via elevated WER/CER and reduced phoneme similarity (Shetu et al., 1 Jun 2026, Close et al., 2024, Rong et al., 10 Mar 2026).
Robotics
- Synthetic Sensing: The deliberate injection of virtual obstacles into sensor data to induce safer, more efficient behaviors—a beneficial "hallucination" used to improve navigation in constrained environments (Park et al., 2022).
Biological and Theoretical Models
- Predictive Coding Imbalance: Hallucinations arise when internal priors dominate unreliable or ambiguous sensory input, modeled as an over-strong prior or reduced sensory precision in hierarchical Bayesian inference frameworks (Barros, 4 Mar 2025, Benrimoh et al., 2023).
- Homeostatic Dynamics: Generative neural networks under sensory deprivation and homeostatic drive begin to sample from prior distributions, producing internally consistent but externally untethered content (Deistler et al., 2019).
- Quantum-Theoretic Models: Hallucinations are modeled as misaligned phase evolution in a "Hilbertian" account of consciousness; the model is not testable but illustrates an alternative formalization (1706.03619).
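The predictive-coding imbalance listed above can be made concrete with a single-level Gaussian example. This is a deliberately simplified sketch: it uses the standard conjugate update for one Gaussian prior and one Gaussian observation, not the hierarchical models of the cited works, and the function name and numeric values are illustrative assumptions.

```python
def gaussian_posterior(prior_mean, prior_precision, observation, sensory_precision):
    """Combine a Gaussian prior with one Gaussian observation (conjugate update)."""
    posterior_precision = prior_precision + sensory_precision
    posterior_mean = (
        prior_precision * prior_mean + sensory_precision * observation
    ) / posterior_precision
    return posterior_mean, posterior_precision


# The prior expects a stimulus (1.0); the senses report none (0.0).
balanced_mean, _ = gaussian_posterior(1.0, 1.0, 0.0, 1.0)     # equal weighting
prior_heavy_mean, _ = gaussian_posterior(1.0, 9.0, 0.0, 1.0)  # over-strong prior
weak_senses_mean, _ = gaussian_posterior(1.0, 1.0, 0.0, 0.1)  # reduced sensory precision
```

In both imbalanced cases the posterior mean moves toward the prior expectation even though the observation reports no stimulus, which is the qualitative signature of a prior-driven percept.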
3. Quantification, Metrics, and Benchmarks
Evaluation of perceptual hallucination is domain- and task-specific, with metrics engineered to isolate genuinely spurious content:
- Vision–Language:
  - Object-level: Object hallucination rates, POPE F1, and Hal/Cover/Cog from AMBER-Gen (Chen et al., 1 May 2026, Park et al., 10 Jun 2025).
  - Relation-level: Hallucination rate and precision/recall computed over the ground-truth and predicted relation sets, with robustness profiled under families of input transformations (Shin et al., 6 May 2026).
  - Cause-driven: Sub-cause breakdowns in tasks such as relational erasure, counterfactual attribute assertion, alteration tracing, and dense counting (ReactBench) (Zhou et al., 28 May 2026).
  - Robustness profiling: Comparison of accuracy and hallucination rates under families of perturbations (rotation, noise), summarized as "robustness curves" (Shin et al., 6 May 2026, Park et al., 10 Jun 2025).
- World Models:
  - Normalized round-trip residuals and their AUROC/Spearman correlation with PSNR-based error to detect OOD scenes (Hansen et al., 25 Jun 2026).
- Speech Enhancement:
  - ASR-based WER and CER; Levenshtein phoneme similarity (LPS); human MOS; hallucination power in silent segments (Shetu et al., 1 Jun 2026, Close et al., 2024, Rong et al., 10 Mar 2026).
- Image Restoration:
  - SHAFE (Semantic Hallucination Assessment via Feature Evaluation): Patch-wise, temperature-pooled cosine feature distances compared to PSNR, SSIM, LPIPS (Kim et al., 3 Dec 2025).
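As an illustration of the ASR-based speech metrics listed above, the following sketch computes word error rate (WER) as the word-level Levenshtein distance divided by the reference length. The two strings stand in for ASR transcripts of the clean reference and the enhanced signal; the example sentences and function names are hypothetical, and a real evaluation would typically also normalize casing and punctuation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance normalized by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


reference = "turn the lights off"
enhanced = "turn all the lights on"   # one inserted word, one substituted word
wer = word_error_rate(reference, enhanced)
```

Character error rate (CER) follows from the same `edit_distance` applied to character sequences instead of word lists.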
4. Mechanisms, Vulnerabilities, and Analysis Across Modalities
Vision-Language and Multimodal Systems
Perceptual hallucination emerges predominantly from a combination of:
- Insufficient visual focus: Attention mechanisms underperform when cross-modal fusion layers dilute visual signals across many tokens, especially during extended reasoning or when focusing on ambiguous, occluded, or small objects (Lu et al., 11 Oct 2025, Liu et al., 23 May 2025).
- Failure of geometric invariance: Perturbations such as rotation or noise can severely impair relational reasoning by shifting the perceived spatial arrangement, a failure mode not mirrored in human perception, which mentally "corrects" for such changes (Shin et al., 6 May 2026).
- Dominance of language priors: Pretrained LLMs or multimodal decoders fill missing or ambiguous data with statistically frequent associations from text, making the model invent plausible but visually unsupported content (Zhou et al., 28 May 2026).
- Model capacity and data coverage: Larger models and more uniform coverage of data (both at pretraining and fine-tuning) consistently reduce but do not eliminate hallucination (Liu et al., 23 May 2025, Hansen et al., 25 Jun 2026).
Generative/World Models
- The pixel-level autoencoder or tokenizer, when exposed to unseen (OOD) layouts, maps the input to the closest in-distribution representation, which is detectable as a large round-trip residual (Hansen et al., 25 Jun 2026).
- Hallucination rates are high in state-action regions insufficiently covered by training data; this is addressable via task-uniform sampling or targeted collection of trajectories maximizing the internal residual signal.
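The round-trip residual signal described above can be sketched with a toy tokenizer that snaps every input to its nearest stored prototype. Everything here is an illustrative assumption: the prototypes, the choice to normalize the squared error by the squared norm of the input, and the pairwise AUROC estimate are stand-ins, since the source does not give the exact formulas.

```python
PROTOTYPES = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical in-distribution codebook


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def round_trip(x):
    """Encode then decode: the toy tokenizer returns the nearest prototype."""
    return min(PROTOTYPES, key=lambda p: squared_distance(x, p))


def normalized_residual(x):
    """Squared round-trip error divided by the squared norm of the input."""
    norm = sum(v ** 2 for v in x)
    return squared_distance(x, round_trip(x)) / norm if norm > 0 else 0.0


def auroc(scores_ood, scores_in):
    """Probability that a random OOD score exceeds a random in-distribution score."""
    wins = 0.0
    for s_ood in scores_ood:
        for s_in in scores_in:
            if s_ood > s_in:
                wins += 1.0
            elif s_ood == s_in:
                wins += 0.5
    return wins / (len(scores_ood) * len(scores_in))


in_dist = [[0.9, 0.1], [0.1, 1.1]]   # close to a prototype
ood = [[0.7, 0.8], [-0.6, -0.5]]     # far from every prototype
detector_auroc = auroc([normalized_residual(x) for x in ood],
                       [normalized_residual(x) for x in in_dist])
```

An AUROC near 1 means the residual ranks OOD inputs above in-distribution ones, which is the property that would let it guide targeted data collection.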
Speech Enhancement
- Hallucination risk increases as SNR drops, forcing the model to rely on its generative prior. GANs and diffusion models differ in their hallucination/complexity trade-off: GANs achieve low hallucination with less data, while diffusion models offer robustness but may hallucinate more when training data are limited or conditioning is weak (Shetu et al., 1 Jun 2026).
- Non-intrusive perceptual-metric optimization can, perversely, lead to "tricked" predictors and spurious, hallucinated artifacts if not counterbalanced by reference-based losses (Close et al., 2024).
- Flow-matching modules (StuPASE) and strong semantic anchors (DeWavLM phonetic distillation) demonstrably reduce content and identity hallucinations, yielding studio-grade output with low dWER (Rong et al., 10 Mar 2026).
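The counterbalancing of perceptual-metric optimization with reference-based losses, noted above, can be sketched as a weighted sum of two terms. This is a minimal illustration under stated assumptions: `predicted_quality` is taken to be a score in [0, 1] from a hypothetical learned metric predictor, the reference term is a plain mean squared error, and the weights and signals are invented for the example.

```python
def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def composite_loss(enhanced, clean, predicted_quality,
                   ref_weight=1.0, metric_weight=0.5):
    """Weighted sum of a reference-based term and a predicted-metric term."""
    reference_term = mse(enhanced, clean)
    metric_term = 1.0 - predicted_quality  # lower when the predictor is pleased
    return ref_weight * reference_term + metric_weight * metric_term


clean = [0.0, 0.5, -0.5, 0.0]
faithful = [0.0, 0.4, -0.4, 0.0]   # close to the reference, predictor score 0.80
tricked = [0.5, -0.5, 0.5, -0.5]   # far from the reference, predictor score 0.95

metric_only = (composite_loss(faithful, clean, 0.80, ref_weight=0.0),
               composite_loss(tricked, clean, 0.95, ref_weight=0.0))
combined = (composite_loss(faithful, clean, 0.80),
            composite_loss(tricked, clean, 0.95))
```

With the reference term switched off, the output that fools the predictor receives the lower loss; with both terms active, the faithful output is preferred.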
5. Mitigation Strategies and System Design Principles
A cross-modality synthesis reveals convergent principles for suppressing perceptual hallucination:
- Grounding outputs in perceptual evidence: Architectures must explicitly couple generated outputs to localized, robustly attended visual or sensory regions (e.g., multi-scale selective decoding, calibrated preference optimization, functional head rescaling) (Park et al., 10 Jun 2025, Zhang et al., 2 Jun 2026, Lu et al., 11 Oct 2025).
- On-policy, vision-grounded, or self-generated supervision: Mitigation is most effective when preference pairs or correction signals are generated by directly modifying the input, not merely the output text or labels (P²-DPO, OSCAR) (Zhang et al., 2 Jun 2026, Chen et al., 1 May 2026).
- Contrastive and multi-step reasoning alignment: Mechanisms such as contrastive decoding, dynamic chain-of-thought control, and adaptive attention balance the need for reasoning depth with the preservation of anchored perception (Liu et al., 23 May 2025, Lu et al., 11 Oct 2025).
- Coverage-aware and data-centric training: Ensuring uniform visitation of state-action regions or synthetic perturbations (e.g., coverage-aware sampling, data augmentation with counterfactuals) reduces OOD collapse and improves out-of-distribution faithfulness (Hansen et al., 25 Jun 2026, Zhou et al., 28 May 2026).
- Loss engineering and calibration: Composite or multi-task loss functions, combining reference-based and perceptual-metric-driven terms, guard against perceptual metric "tricks" and overfitting (Close et al., 2024, Rong et al., 10 Mar 2026).
- Explicit geometry/topology encoding: Relational and spatial predictions become robust when grounded in explicit coordinate frames, graph-invariant modules, or topological descriptors (Shin et al., 6 May 2026).
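Among the principles above, contrastive decoding lends itself to a compact sketch. The version below follows one commonly used formulation, in which next-token logits computed from the intact image are contrasted with logits computed from a degraded image, and tokens that are implausible under the intact image are masked out. The cited works may use different variants; the vocabulary, logits, and the parameters `alpha` and `beta` are illustrative assumptions.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_logits(logits_full, logits_degraded, alpha=1.0, beta=0.1):
    """Amplify evidence from the intact image relative to a degraded copy.

    Tokens whose probability under the intact image is below beta times the
    top probability are masked, so the contrast cannot promote unlikely tokens.
    """
    probs_full = softmax(logits_full)
    cutoff = beta * max(probs_full)
    adjusted = []
    for lf, ld, pf in zip(logits_full, logits_degraded, probs_full):
        if pf < cutoff:
            adjusted.append(float("-inf"))
        else:
            adjusted.append((1 + alpha) * lf - alpha * ld)
    return adjusted


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


vocab = ["dog", "frisbee", "bench"]   # hypothetical next-token candidates
logits_full = [2.0, 2.2, 0.1]         # intact image: "frisbee" narrowly leads
logits_degraded = [1.0, 2.5, 0.1]     # degraded image: the language prior dominates

greedy_token = vocab[argmax(logits_full)]
contrastive_token = vocab[argmax(contrastive_logits(logits_full, logits_degraded))]
```

In this toy case the token whose score rises when visual evidence is removed ("frisbee") is demoted, and the token that depends on the image ("dog") is selected instead.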
6. Benchmarks and Open Challenges
A new class of benchmarks provides robust diagnosis and catalyzes system improvements:
- Task-Dissected Benchmarks: ReactBench exposes sub-causes (co-occurrence bias, language prior, comparison blindness, counting deficiency) in exam-style tasks (Zhou et al., 28 May 2026).
- Reference Worlds: HalluWorld offers generator-independent, fully specified reference worlds for precise, automatable hallucination labeling; on this benchmark, frontier models are reported to be essentially free of perceptual hallucinations on directly observable facts, while failures persist in memory and causality tasks (Liu et al., 19 May 2026).
- Synthetic Hallucination Generation: HalluGen enables the principled study of hallucination detectors and metrics, challenging conventional pixel-based quality measures (e.g. PSNR, LPIPS) with adversarially synthesized artifacts (Kim et al., 3 Dec 2025).
- Dynamic Metrics: RH-AUC captures the trade-off between reasoning chain length and perceptual fidelity, providing a single metric for reasoning-enabled multimodal models (Liu et al., 23 May 2025).
Notably, addressing perceptual hallucination is increasingly viewed as a solvable subproblem: frontier models approach ceiling performance on controlled perceptual probes, shifting research emphasis toward memory, multi-step reasoning, and causal inference (Liu et al., 19 May 2026). Nevertheless, in more brittle domains (e.g., medical restoration, open-world navigation, very low-SNR speech), substantial risk remains.
7. Theoretical and Cognitive Implications
The convergence of biological, cognitive, and artificial frameworks underlines shared architectures of inference and prediction:
- Predictive Coding and Bayesian Inference: Perceptual hallucinations in humans result from strong priors outweighing unreliable evidence, captured by hierarchical Gaussian filters and variational free-energy minimization (Barros, 4 Mar 2025, Benrimoh et al., 2023).
- Homeostatic Generative Dynamics: Biological and artificial neural networks, when deprived of bottom-up input, sample from learned priors, elucidating the conditions under which hallucinations arise as a feature of generative architectures (Deistler et al., 2019).
- Artificial Systems: Hallucinations emerge naturally from any predictive system—human or machine—that must interpolate or extrapolate from incomplete input, balancing creativity and reliability. Systematic mitigation (e.g., retrieval grounding, confidence calibration, externally supervised feedback) aims to tether generative output to external evidence without extinguishing adaptive flexibility (Barros, 4 Mar 2025, Zhang et al., 2 Jun 2026, Chen et al., 1 May 2026).
A plausible implication is that further advances in artificial perception may benefit not just from architectural scaling but from deeper integration with uncertainty quantification, explicit world-modeling, and attention calibration—mirroring principles that underlie the flexible but bounded cognition observed in biological systems.