Knowledge Verification to Nip Hallucination in the Bud

Published 19 Jan 2024 in cs.CL (arXiv:2401.10768v5)

Abstract: While LLMs demonstrate exceptional performance across various tasks after human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination. In this paper, we demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between the external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs. Specifically, we propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on the external knowledge and thereby probe the knowledge boundaries of foundation LLMs. KCA then applies several dedicated strategies to the alignment instances in which knowledge inconsistency is detected. We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, using foundation LLMs of varying backbones and scales, which confirms that hallucinations can be mitigated by reducing knowledge inconsistency. Our code, model weights, and data are openly available at https://github.com/fanqiwan/KCA.
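
To make the verification loop concrete, below is a minimal Python sketch of the consistency check the abstract describes, under stated assumptions: `ask` stands in for any prompt-to-reply wrapper around a model, and the prompt wording, the 0.7 consistency threshold, and the strategy names ("open-book", "discard") are illustrative choices, not the paper's exact implementation (see the linked repository for that).

```python
# Hedged sketch of the KCA pipeline: a well-aligned LLM writes an exam over
# the external knowledge, the foundation LLM sits it, and the score decides
# how the alignment example is handled. Prompts, the threshold, and strategy
# names are assumptions for illustration, not the paper's implementation.
from dataclasses import dataclass
from typing import Callable, Optional

Ask = Callable[[str], str]  # prompt in, model reply out

@dataclass
class Example:
    instruction: str
    response: str
    knowledge: str  # external knowledge the gold response relies on

def formulate_quiz(aligned_llm: Ask, knowledge: str, n: int = 3) -> list[tuple[str, str]]:
    """Have a well-aligned LLM write n multiple-choice questions over `knowledge`.

    Expects each question block to end with a line 'Answer: <letter>';
    blocks that do not parse are skipped.
    """
    raw = aligned_llm(
        f"Write {n} multiple-choice questions about the passage below. "
        "After each question, give the correct option letter on its own "
        f"line, prefixed with 'Answer:'.\n\n{knowledge}"
    )
    quiz = []
    for block in raw.split("\n\n"):
        question, sep, letter = block.rpartition("Answer:")
        if sep and question.strip() and letter.strip():
            quiz.append((question.strip(), letter.strip()[0].upper()))
    return quiz

def consistency(foundation_llm: Ask, quiz: list[tuple[str, str]]) -> float:
    """Fraction of quiz questions the foundation LLM answers correctly,
    i.e. how much of the external knowledge it already holds."""
    if not quiz:
        return 0.0
    hits = sum(
        foundation_llm(f"{q}\nReply with a single letter.").strip()[:1].upper() == gold
        for q, gold in quiz
    )
    return hits / len(quiz)

def apply_kca(ex: Example, aligned_llm: Ask, foundation_llm: Ask,
              threshold: float = 0.7, strategy: str = "open-book") -> Optional[Example]:
    """Keep knowledge-consistent examples; otherwise apply one strategy.

    'open-book' prepends the knowledge to the instruction so tuning does not
    have to teach the model unseen facts; 'discard' drops the example.
    """
    quiz = formulate_quiz(aligned_llm, ex.knowledge)
    if consistency(foundation_llm, quiz) >= threshold:
        return ex  # intrinsic knowledge already covers the external knowledge
    if strategy == "open-book":
        return Example(f"{ex.knowledge}\n\n{ex.instruction}", ex.response, ex.knowledge)
    return None  # 'discard': filter the inconsistent example out
```

In this sketch the assessment step and the handling step are decoupled, so the same consistency score could drive other strategies (such as the refusal-style rewrites the paper also explores) without changing the probing logic.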
