WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models (2405.14768v3)

Published 23 May 2024 in cs.CL, cs.AI, cs.CV, cs.IR, and cs.LG

Abstract: LLMs need knowledge updates to keep pace with ever-growing world facts and to correct hallucinated responses, motivating methods for lifelong model editing. Where the updated knowledge resides in memory is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowledge held in neural network activations/representations via retrieval) results in an impossible triangle: reliability, generalization, and locality cannot be realized together in the lifelong editing setting. For long-term memory, directly editing the parameters causes conflicts with irrelevant pretrained knowledge or previous edits (poor reliability and locality). For working memory, retrieval-based activations can hardly make the model understand the edits and generalize (poor generalization). Therefore, we propose WISE to bridge the gap between memories. In WISE, we design a dual parametric memory scheme, consisting of a main memory for the pretrained knowledge and a side memory for the edited knowledge. We edit only the knowledge in the side memory and train a router to decide which memory to go through when given a query. For continual editing, we devise a knowledge-sharding mechanism in which different sets of edits reside in distinct subspaces of parameters and are subsequently merged into a shared memory without conflicts. Extensive experiments show that WISE outperforms previous model editing methods and overcomes the impossible triangle under lifelong model editing for question answering, hallucination, and out-of-distribution settings across trending LLM architectures, e.g., GPT, LLaMA, and Mistral. Code is available at https://github.com/zjunlp/EasyEdit.
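
The dual-memory routing and knowledge-sharding ideas in the abstract can be made concrete with a small sketch. The snippet below is an illustrative PyTorch reconstruction, not the paper's implementation (see the EasyEdit repository for that): the class name SideMemoryFFN, the threshold epsilon, the per-query deviation score, and the merge_shards helper are assumptions introduced here to show how a side-memory copy of one FFN layer, an activation-based router, and a mask-based shard merge could fit together.

```python
import copy
import torch
import torch.nn as nn


class SideMemoryFFN(nn.Module):
    """Illustrative dual-memory wrapper around one FFN layer.

    The main FFN holds the pretrained (long-term) knowledge and stays frozen;
    the side FFN is a trainable copy that absorbs the edits. A simple
    activation-deviation router decides which memory answers a given query.
    """

    def __init__(self, main_ffn: nn.Module, epsilon: float = 1.0):
        super().__init__()
        self.main_ffn = main_ffn                 # frozen pretrained memory
        self.side_ffn = copy.deepcopy(main_ffn)  # editable side memory
        for p in self.main_ffn.parameters():
            p.requires_grad_(False)
        self.epsilon = epsilon                   # routing threshold (hypothetical value)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: [batch, seq_len, d_model]
        main_out = self.main_ffn(hidden)
        side_out = self.side_ffn(hidden)
        # Route per example: if the side memory's output deviates strongly
        # from the main memory's on this query, treat the query as hitting
        # edited knowledge and return the side output; otherwise fall back
        # to the pretrained main memory.
        score = (side_out - main_out).norm(dim=-1).mean(dim=-1)  # [batch]
        use_side = (score > self.epsilon).view(-1, 1, 1)
        return torch.where(use_side, side_out, main_out)


def merge_shards(shard_deltas, masks):
    """Toy knowledge-sharding merge: each shard edits a distinct (randomly
    masked) subspace of the side memory's weights, so the weight deltas can
    be combined into one shared side memory without overwriting each other."""
    merged = torch.zeros_like(shard_deltas[0])
    for delta, mask in zip(shard_deltas, masks):
        merged += delta * mask
    return merged
```

In this toy routing rule, queries whose side-memory output deviates from the main memory beyond epsilon are assumed to involve edited knowledge; the paper's actual routing activation, threshold selection, and shard-merging strategy differ in detail.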

Authors (9)
  1. Peng Wang (831 papers)
  2. Zexi Li (26 papers)
  3. Ningyu Zhang (148 papers)
  4. Ziwen Xu (16 papers)
  5. Yunzhi Yao (27 papers)
  6. Yong Jiang (194 papers)
  7. Pengjun Xie (85 papers)
  8. Fei Huang (408 papers)
  9. Huajun Chen (198 papers)
Citations (11)