DeepEdit: Knowledge Editing as Decoding with Constraints (2401.10471v5)

Published 19 Jan 2024 in cs.CL and cs.AI

Abstract: How to edit knowledge that is used in multi-step reasoning has become a major challenge in the knowledge editing (KE) of LLMs. The difficulty arises because hallucinations during multi-step reasoning often lead LLMs to use new knowledge incorrectly and to produce wrong answers. To address this issue, we design decoding constraints to "regulate" LLMs' reasoning, enhancing logical coherence when new knowledge is incorporated. We propose a new KE framework, DEEPEDIT (Depth-first Search-based Constrained Decoding for Knowledge Editing), which strengthens LLMs' ability to generate coherent reasoning chains with new knowledge through depth-first search. At each step, the search selects the most important piece of knowledge that satisfies the constraints as the next reasoning step, efficiently increasing the reasoning depth. In addition to DEEPEDIT, we propose two new KE benchmarks, MQUAKE-2002 and MQUAKE-HARD, which provide more precise and more challenging assessments of KE approaches. Qualitatively, DEEPEDIT enables LLMs to produce succinct and coherent reasoning chains involving new knowledge. Quantitatively, it yields significant improvements on multiple KE benchmarks.
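As a concrete illustration of the idea sketched in the abstract, the snippet below shows one possible way to organize a depth-first, constraint-filtered search over candidate facts: at each depth, candidates that violate any constraint are pruned and the highest-scoring survivor is expanded first. This is a minimal sketch under assumed interfaces; the triple representation and the `deepedit_dfs`, `chain_links`, and `relevance` helpers are hypothetical placeholders for exposition, not the paper's actual decoding procedure, which applies its constraints during LLM decoding.

```python
# Minimal sketch of depth-first, constraint-filtered selection of reasoning
# steps, in the spirit of DEEPEDIT. All names and the toy example are
# illustrative assumptions, not the paper's exact formulation.

from typing import Callable, List, Optional, Tuple

Fact = Tuple[str, str, str]                      # (subject, relation, object)
Constraint = Callable[[List[Fact], Fact], bool]  # (chain so far, candidate) -> ok?


def deepedit_dfs(
    edited_facts: List[Fact],
    constraints: List[Constraint],
    score: Callable[[List[Fact], Fact], float],
    is_answer: Callable[[List[Fact]], bool],
    max_depth: int = 4,
) -> Optional[List[Fact]]:
    """Depth-first search for a coherent chain of reasoning steps that
    incorporates the edited (new) knowledge. Candidates violating any
    constraint are pruned; the highest-scoring survivor is expanded first."""

    def dfs(chain: List[Fact]) -> Optional[List[Fact]]:
        if is_answer(chain):
            return chain
        if len(chain) >= max_depth:
            return None
        # Keep only facts consistent with every constraint, given the chain so far.
        candidates = [
            f for f in edited_facts
            if f not in chain and all(c(chain, f) for c in constraints)
        ]
        # Expand the most important candidate first (depth-first, best-first order).
        for fact in sorted(candidates, key=lambda f: score(chain, f), reverse=True):
            result = dfs(chain + [fact])
            if result is not None:
                return result
        return None

    return dfs([])


# Toy usage: two edited facts forming a 2-hop chain.
if __name__ == "__main__":
    facts = [("UK", "head of government", "Rishi Sunak"),
             ("Rishi Sunak", "spouse", "Akshata Murty")]
    chain_links = lambda chain, f: not chain or chain[-1][2] == f[0]  # next subject must match last object
    relevance = lambda chain, f: 1.0                                  # uniform relevance placeholder
    done = lambda chain: len(chain) == 2
    print(deepedit_dfs(facts, [chain_links], relevance, done))
```

Running the toy example prints a two-hop chain in which the second fact's subject matches the first fact's object; that subject–object chaining check stands in, purely for illustration, for the coherence-style decoding constraints the paper describes.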

Authors (4)
  1. Yiwei Wang (119 papers)
  2. Muhao Chen (159 papers)
  3. Nanyun Peng (205 papers)
  4. Kai-Wei Chang (292 papers)
Citations (19)