Event-level Knowledge Editing (2402.13093v2)
Abstract: Knowledge editing aims to update the knowledge of LLMs to prevent them from becoming outdated. Existing work edits LLMs at the level of factual knowledge triplets. However, natural knowledge updates in the real world come from the occurrence of new events rather than direct changes to factual triplets. In this paper, we propose a new task setting: event-level knowledge editing, which directly edits new events into LLMs and improves over conventional triplet-level editing in two respects. (1) Efficiency: a single event edit leads to updates in multiple entailed knowledge triplets. (2) Completeness: beyond updating factual knowledge, event-level editing also requires considering the influences of events and updating LLMs' knowledge about future trends. We construct a high-quality event-level editing benchmark, ELKEN, consisting of 1,515 event edits, 6,449 questions about factual knowledge, and 10,150 questions about future tendencies. We systematically evaluate the performance of various knowledge editing methods and LLMs on this benchmark and find that ELKEN poses significant challenges to existing knowledge editing approaches. Our code and dataset are publicly released to facilitate further research.
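The abstract frames each event edit as entailing several factual-triplet updates plus additional questions about future tendencies. As a rough illustration only (this is a minimal sketch, not the released ELKEN format; the field names `event`, `fact_questions`, and `tendency_questions` and the baseline `in_context_edit` are assumptions for exposition), a single benchmark instance and a naive in-context editing baseline might look like this:

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical, simplified representation of one event-level edit.
# Field names are illustrative assumptions, not the actual ELKEN schema.
@dataclass
class EventEdit:
    event: str                                                     # natural-language description of the new event
    fact_questions: List[dict] = field(default_factory=list)       # entailed factual-knowledge QA pairs
    tendency_questions: List[dict] = field(default_factory=list)   # questions about future tendencies

example = EventEdit(
    event="Company X announced on 2024-03-01 that its CEO resigned and was succeeded by Y.",
    fact_questions=[
        {"question": "Who is the CEO of Company X?", "answer": "Y"},
    ],
    tendency_questions=[
        {"question": "How is Company X's leadership likely to change in the near term?",
         "answer": "A transition period under the new CEO Y is expected."},
    ],
)

def in_context_edit(llm: Callable[[str], str], edit: EventEdit, question: str) -> str:
    """Naive in-context editing baseline: prepend the event edit to the question."""
    prompt = f"New event: {edit.event}\nQuestion: {question}\nAnswer:"
    return llm(prompt)
```

Under this framing, efficiency corresponds to a single `event` entry driving multiple entries in `fact_questions`, and completeness to the extra `tendency_questions` that probe influences of the event beyond directly stated facts.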
Authors: Hao Peng, Xiaozhi Wang, Chunyang Li, Kaisheng Zeng, Jiangshan Duo, Yixin Cao, Lei Hou, Juanzi Li