Decoding by Contrasting Knowledge: Enhancing LLMs' Confidence on Edited Facts (2405.11613v2)

Published 19 May 2024 in cs.CL

Abstract: The knowledge within LLMs may become outdated quickly. While in-context editing (ICE) is currently the most effective method for knowledge editing (KE), it is constrained by the black-box modeling of LLMs and thus lacks interpretability. Our work aims to elucidate the superior performance of ICE on KE by analyzing the impact of in-context new knowledge on token-wise distributions. We observe that despite a significant boost in the logits of the new knowledge, the performance of ICE is still hindered by stubborn knowledge: facts that have gained excessive confidence during pretraining, making them hard to edit effectively. To address this issue and further enhance the performance of ICE, we propose a novel approach termed $\textbf{De}$coding by $\textbf{C}$ontrasting $\textbf{K}$nowledge (DeCK). DeCK derives the distribution of the next token by contrasting the logits obtained from the newly edited knowledge guided by ICE with those from the unedited parametric knowledge. Our experiments consistently demonstrate that DeCK enhances the confidence of LLMs in edited facts. For instance, it improves the performance of LLaMA3-8B-Instruct on MQuAKE by up to 219%, demonstrating its capability to strengthen ICE in the editing of stubborn knowledge. Our work paves the way to developing both effective and accountable KE methods for LLMs. (The source code is available at https://deck-LLM.meirtz.com)
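
To make the mechanism concrete, below is a minimal sketch of one decoding step in the spirit of DeCK: score the next token twice, once with the edited fact in context (ICE) and once without it, then contrast the two logit vectors so that tokens boosted by the edit dominate. The prompt format, the contrast weight `alpha`, and the plausibility cutoff `tau` are illustrative assumptions, not the paper's exact formulation; the filtering heuristic is borrowed from standard contrastive decoding.

```python
# Sketch of a DeCK-style contrastive decoding step (assumptions noted above).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

@torch.no_grad()
def next_token_logits(prompt: str) -> torch.Tensor:
    """Logits over the vocabulary for the token following `prompt`."""
    ids = tok(prompt, return_tensors="pt").input_ids
    return model(ids).logits[0, -1]

def deck_step(question: str, edit: str, alpha: float = 1.0, tau: float = 0.1) -> int:
    # Edited run: the new fact is prepended in context (ICE).
    edited = next_token_logits(f"New fact: {edit}\nQuestion: {question}\nAnswer:")
    # Parametric run: the model answers from its pretrained knowledge alone.
    base = next_token_logits(f"Question: {question}\nAnswer:")
    # Keep only tokens the edited run itself finds plausible, so the
    # contrast cannot promote implausible low-probability tokens.
    p_edited = F.softmax(edited, dim=-1)
    mask = p_edited >= tau * p_edited.max()
    # Amplify what the edit adds relative to parametric knowledge
    # (illustrative contrast: (1 + alpha) * edited - alpha * base).
    contrast = (1 + alpha) * edited - alpha * base
    contrast[~mask] = float("-inf")
    return int(contrast.argmax())  # greedy pick of the contrasted distribution
```

The design intent is that a "stubborn" parametric answer scores highly in both runs and is therefore suppressed by the subtraction, while the edited answer scores highly only in the ICE run and survives the contrast.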

Authors (6)
  1. Baolong Bi (23 papers)
  2. Shenghua Liu (33 papers)
  3. Lingrui Mei (20 papers)
  4. Yiwei Wang (119 papers)
  5. Pengliang Ji (14 papers)
  6. Xueqi Cheng (274 papers)
Citations (18)