
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models (2403.06448v2)

Published 11 Mar 2024 in cs.CL and cs.AI

Abstract: Hallucinations in LLMs refer to the phenomenon of LLMs producing responses that are coherent yet factually inaccurate. This issue undermines the effectiveness of LLMs in practical applications, necessitating research into detecting and mitigating them. Previous studies have mainly concentrated on post-processing techniques for hallucination detection, which tend to be computationally intensive and limited in effectiveness because they are separated from the LLM's inference process. To overcome these limitations, we introduce MIND, an unsupervised training framework that leverages the internal states of LLMs for real-time hallucination detection without requiring manual annotations. Additionally, we present HELM, a new benchmark for evaluating hallucination detection across multiple LLMs, featuring diverse LLM outputs together with the internal states captured during inference. Our experiments demonstrate that MIND outperforms existing state-of-the-art methods in hallucination detection.
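The core idea of the abstract can be sketched as a probe over hidden states: train a lightweight classifier on the model's internal activations so that each new state can be scored for hallucination risk during generation, with no post-hoc fact checking. The sketch below is not the paper's exact MIND recipe; it substitutes synthetic vectors for real transformer hidden states (which in practice would be extracted from the LLM's layers during inference) and uses a plain logistic-regression probe as a stand-in detector.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # stand-in for the LLM's hidden-state dimension

# Synthetic "hidden states": we assume truthful and hallucinated
# continuations occupy slightly shifted regions of activation space.
truthful = rng.normal(0.0, 1.0, size=(500, DIM))
hallucinated = rng.normal(0.6, 1.0, size=(500, DIM))
X = np.vstack([truthful, hallucinated])
y = np.concatenate([np.zeros(500), np.ones(500)])

# Logistic-regression probe trained with plain gradient descent.
w = np.zeros(DIM)
b = 0.0
lr = 0.1
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    w -= lr * (X.T @ (p - y)) / len(y)      # gradient of log-loss w.r.t. w
    b -= lr * float(np.mean(p - y))         # gradient w.r.t. bias

def hallucination_score(h: np.ndarray) -> float:
    """Score a single hidden state in [0, 1]; higher = more suspect.

    In real-time use this would be called on each hidden state as the
    model generates, flagging spans whose score exceeds a threshold.
    """
    return float(1.0 / (1.0 + np.exp(-(h @ w + b))))

acc = np.mean((1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5) == y)
print(f"probe training accuracy: {acc:.2f}")
```

Because the probe is a single inner product per token, scoring adds negligible latency to inference, which is what makes the "real-time" framing plausible; the paper's contribution is obtaining the training labels without manual annotation, which this synthetic sketch sidesteps.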

Authors (7)
  1. Weihang Su
  2. Changyue Wang
  3. Qingyao Ai
  4. Zhijing Wu
  5. Yujia Zhou
  6. Yiqun Liu
  7. Yiran Hu
Citations (19)