
Modifying Memories in Transformer Models (2012.00363v1)

Published 1 Dec 2020 in cs.CL and cs.LG

Abstract: Large Transformer models have achieved impressive performance in many natural language tasks. In particular, Transformer based LLMs have been shown to have great capabilities in encoding factual knowledge in their vast amount of parameters. While the tasks of improving the memorization and generalization of Transformers have been widely studied, it is not well known how to make transformers forget specific old facts and memorize new ones. In this paper, we propose a new task of \emph{explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts}. This task is useful in many scenarios, such as updating stale knowledge, protecting privacy, and eliminating unintended biases stored in the models. We benchmarked several approaches that provide natural baseline performances on this task. This leads to the discovery of key components of a Transformer model that are especially effective for knowledge modifications. The work also provides insights into the role that different training phases (such as pretraining and fine-tuning) play towards memorization and knowledge modification.

Introduction

Transformers, a class of models introduced by Vaswani et al. (2017), have shown remarkable success in various NLP tasks and real-world applications. Their ability to implicitly memorize a vast repository of factual knowledge within their parameters is of particular interest to researchers and practitioners. The capacity of Transformers to encode, and subsequently recall, factual knowledge opens up a spectrum of applications, from question answering to potentially replacing traditional knowledge bases.

Knowledge Modification in Transformers

Despite the effectiveness of Transformers in learning and storing facts, there is a distinct lack of methodologies for updating or altering their stored knowledge. This gap is significant because information frequently changes over time or requires correction. Models should be able to unlearn outdated facts and learn updated ones without losing performance on the rest of the retained knowledge. The paper frames this modification process as a constrained optimization problem that keeps the model's loss on the unmodified facts bounded while the desired changes are made.
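Concretely, if $\theta_0$ denotes the original model weights and $\mathcal{D}_M$ the set of facts to be modified, the setup can be written roughly as follows (the notation here paraphrases the paper's formulation; the choice of the $\ell_\infty$ norm is one of the options the authors consider):

```latex
\min_{\theta} \; \frac{1}{|\mathcal{D}_M|} \sum_{x \in \mathcal{D}_M} L(x;\theta)
\quad \text{subject to} \quad \|\theta - \theta_0\|_{\infty} \le \delta
```

Keeping $\delta$ small limits how far the fine-tuned weights can drift from the pretrained ones, which is what protects performance on the unmodified facts.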

Constrained Fine-tuning for Knowledge Update

The paper investigates a constrained fine-tuning technique in which knowledge modification is treated as a constrained optimization problem. The method adjusts the model's parameters to learn new facts while minimizing interference with existing knowledge. The findings indicate that fine-tuning only specific layers of the model, particularly the first and last Transformer blocks, can lead to better generalization and adaptation to the updated facts. This insight is consistent with previous studies suggesting that different layers of Transformers capture different aspects of language representations.
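As a concrete illustration, this style of constrained fine-tuning can be approximated with projected gradient steps that clip each parameter back into a small L-infinity ball around the pretrained weights. The sketch below is a toy NumPy demonstration on a single weight vector, not the paper's training code: the quadratic loss, learning rate, and the radius delta = 0.2 are invented purely for the example.

```python
import numpy as np

def project_linf(theta, theta0, delta):
    # Project the current weights back onto the L-infinity ball of
    # radius delta centred at the pretrained weights theta0.
    return np.clip(theta, theta0 - delta, theta0 + delta)

def constrained_finetune_step(theta, theta0, grad, lr, delta):
    # One projected-gradient step: descend on the new facts' loss,
    # then re-enforce ||theta - theta0||_inf <= delta.
    return project_linf(theta - lr * grad, theta0, delta)

# Toy setup: "pretrained" weights at the origin, and a fine-tuning
# loss 0.5 * ||theta - target||^2 pulling toward a new fact's optimum.
theta0 = np.zeros(4)
target = np.array([1.0, -1.0, 0.5, 0.0])
theta = theta0.copy()
for _ in range(100):
    grad = theta - target          # gradient of the toy quadratic loss
    theta = constrained_finetune_step(theta, theta0, grad, lr=0.1, delta=0.2)

# Each coordinate converges to the target clipped into [-0.2, 0.2]:
print(theta)  # [ 0.2 -0.2  0.2  0. ]
```

The projection is what realizes the constraint from the optimization problem: coordinates whose unconstrained optimum lies outside the ball settle on its boundary, while the rest reach their targets exactly.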

Empirical Results and Implications

The authors propose benchmarks based on the T-REx and zsRE datasets to evaluate how effectively various models can modify knowledge. The experiments provide evidence that constrained fine-tuning is a successful strategy for updating specific facts while preventing catastrophic forgetting. These findings are significant: they imply that Transformer models can be adapted to learn new or altered facts while retaining accuracy on unmodified knowledge, an ability that is crucial for models to stay relevant and accurate in dynamic and evolving data landscapes.

Conclusion

The discussed research makes a compelling case for the ability to fine-tune and modify knowledge in Transformer models. The results underline the effectiveness of constrained optimization in enforcing minimal changes to the model's weights. This controlled approach preserves unaltered knowledge while effectively updating or correcting specific facts. Future work is expected to explore the broader implications of modifying knowledge in neural models and to seek more efficient mechanisms for doing so.

Authors (7)
  1. Chen Zhu (103 papers)
  2. Ankit Singh Rawat (64 papers)
  3. Manzil Zaheer (89 papers)
  4. Srinadh Bhojanapalli (44 papers)
  5. Daliang Li (28 papers)
  6. Felix Yu (62 papers)
  7. Sanjiv Kumar (123 papers)
Citations (165)