No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models (2410.19217v1)
Abstract: Generative models have shown impressive capabilities in synthesizing high-quality outputs across various domains. However, a persistent challenge is the occurrence of "hallucinations", where the model produces outputs that are plausible but invalid. While empirical strategies have been explored to mitigate this issue, a rigorous theoretical understanding remains elusive. In this paper, we develop a theoretical framework to analyze the learnability of non-hallucinating generative models from a learning-theoretic perspective. Our results reveal that non-hallucinating learning is statistically impossible when relying solely on the training dataset, even for a hypothesis class of size two and when the entire training set is truthful. To overcome these limitations, we show that incorporating inductive biases aligned with the actual facts into the learning process is essential. We provide a systematic approach to achieve this by restricting the facts set to a concept class of finite VC-dimension and demonstrate its effectiveness under various learning paradigms. Although our findings are primarily conceptual, they represent a first step towards a principled approach to addressing hallucinations in learning generative models.
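The abstract's proposed remedy hinges on the facts set having finite VC dimension, i.e., there is a largest set of points that the concept class can label in all possible ways. As a minimal illustrative sketch (not code from the paper), the brute-force check below computes the VC dimension of a toy concept class of threshold functions on a small integer domain; the function names and the threshold class are illustrative assumptions.

```python
from itertools import combinations

def shatters(concepts, points):
    """True if the concept class realizes every binary labeling of `points`
    (brute-force enumeration of the labelings induced by each concept)."""
    labelings = {tuple(c(p) for p in points) for c in concepts}
    return len(labelings) == 2 ** len(points)

def vc_dimension(concepts, domain):
    """Largest size of a subset of `domain` shattered by `concepts`."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(concepts, s) for s in combinations(domain, k)):
            d = k
    return d

# Toy concept class: threshold functions x >= t on a small integer domain.
thresholds = [lambda x, t=t: x >= t for t in range(6)]
print(vc_dimension(thresholds, range(5)))  # prints 1
```

Thresholds shatter any single point but no pair (the labeling "left point positive, right point negative" is unrealizable by a monotone threshold), so their VC dimension is 1, mirroring the kind of finite-complexity restriction the paper places on the facts set.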