PMET: Precise Model Editing in a Transformer (2308.08742v6)
Abstract: Model editing techniques modify a small proportion of the knowledge in LLMs at relatively low cost and have demonstrated notable success. Existing methods assume that Transformer Layer (TL) hidden states are the values of the key-value memories of the Feed-Forward Network (FFN). They usually optimize the TL hidden states to memorize target knowledge and use them to update the FFN weights in LLMs. However, the information in TL hidden states flows from three sources: Multi-Head Self-Attention (MHSA), the FFN, and residual connections. Existing methods neglect the fact that TL hidden states contain information not specifically required by the FFN, and editing performance suffers as a result. To achieve more precise model editing, we analyze the hidden states of the MHSA and FFN and find that the MHSA encodes certain general knowledge-extraction patterns. This implies that the MHSA weights do not require updating when new knowledge is introduced. Based on these findings, we introduce PMET, which simultaneously optimizes the Transformer Component (TC, namely MHSA and FFN) hidden states, while using only the optimized FFN hidden states to precisely update the FFN weights. Our experiments demonstrate that PMET achieves state-of-the-art performance on both the COUNTERFACT and zsRE datasets. Our ablation experiments substantiate the effectiveness of our enhancements, further reinforcing the finding that the MHSA encodes certain general knowledge-extraction patterns and indicating that it stores only a small amount of factual knowledge. Our code is available at https://github.com/xpq-tech/PMET.
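The sketch below is a minimal, toy illustration (not the authors' implementation) of the two-step idea described in the abstract: first jointly optimize perturbations to both TC hidden states (MHSA and FFN) so the residual stream encodes the target knowledge, then discard the MHSA perturbation and use only the optimized FFN hidden state as the target value in a MEMIT-style closed-form least-squares update of the FFN down-projection weights. The tensor shapes, the surrogate MSE objective, and names such as `W_out` and `v_star` are assumptions for illustration; the real objective scores the target tokens under the full language model.

```python
# Toy sketch of the PMET editing idea, assuming a MEMIT-style rank-one update.
import torch

torch.manual_seed(0)
d_model = 16

# Frozen pre-edit hidden states at the subject's last token (toy stand-ins).
h_mhsa = torch.randn(d_model)      # MHSA output at the edited layer
h_ffn = torch.randn(d_model)       # FFN output at the edited layer
residual = torch.randn(d_model)    # incoming residual stream
target = torch.randn(d_model)      # residual state that encodes the new fact

# Step 1: jointly optimize additive perturbations on both TC hidden states.
delta_mhsa = torch.zeros(d_model, requires_grad=True)
delta_ffn = torch.zeros(d_model, requires_grad=True)
opt = torch.optim.Adam([delta_mhsa, delta_ffn], lr=5e-2)
for _ in range(500):
    opt.zero_grad()
    out = residual + (h_mhsa + delta_mhsa) + (h_ffn + delta_ffn)
    loss = torch.nn.functional.mse_loss(out, target)  # surrogate objective (assumption)
    loss.backward()
    opt.step()

# Step 2: discard delta_mhsa; use only the optimized FFN value to update the
# FFN down-projection so that W_out_new @ k == v_star (rank-one least squares).
v_star = (h_ffn + delta_ffn).detach()   # target FFN "value"
k = torch.randn(d_model)                # FFN "key" for the subject (toy)
W_out = torch.randn(d_model, d_model)   # FFN down-projection weights (toy)
C = torch.eye(d_model)                  # key covariance (identity here)

resid_v = v_star - W_out @ k
c_inv_k = torch.linalg.solve(C, k)
W_out_new = W_out + torch.outer(resid_v, c_inv_k) / (k @ c_inv_k)
assert torch.allclose(W_out_new @ k, v_star, atol=1e-4)
```

Only `W_out` changes; the MHSA weights are left untouched, matching the paper's finding that the MHSA encodes reusable knowledge-extraction patterns rather than the edited fact itself.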
Authors: Xiaopeng Li, Shasha Li, Shezheng Song, Jing Yang, Jun Ma, Jie Yu