Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons (2308.13198v2)

Published 25 Aug 2023 in cs.CL

Abstract: Pre-trained language models (PLMs) contain vast amounts of factual knowledge, but how the knowledge is stored in the parameters remains unclear. This paper delves into the complex task of understanding how factual knowledge is stored in multilingual PLMs, and introduces the Architecture-adapted Multilingual Integrated Gradients method, which successfully localizes knowledge neurons more precisely compared to current methods, and is more universal across various architectures and languages. Moreover, we conduct an in-depth exploration of knowledge neurons, leading to the following two important discoveries: (1) The discovery of Language-Independent Knowledge Neurons, which store factual knowledge in a form that transcends language. We design cross-lingual knowledge editing experiments, demonstrating that the PLMs can accomplish this task based on language-independent neurons; (2) The discovery of Degenerate Knowledge Neurons, a novel type of neuron showing that different knowledge neurons can store the same fact. Its property of functional overlap endows the PLMs with a robust mastery of factual knowledge. We design fact-checking experiments, proving that the degenerate knowledge neurons can help the PLMs to detect wrong facts. Experiments corroborate these findings, shedding light on the mechanisms of factual knowledge storage in multilingual PLMs, and contribute valuable insights to the field. The code is available at https://github.com/heng840/AMIG.

Understanding the Role of Knowledge Neurons in Multilingual PLMs

In the paper titled "Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons," the authors investigate the mechanisms underlying the storage of factual knowledge within multilingual pre-trained language models (PLMs). The paper introduces a more precise method for localizing knowledge neurons and provides substantial insight into how these neurons are organized and how they function.

The authors propose the Architecture-adapted Multilingual Integrated Gradients (AMIG) method, addressing two central challenges in knowledge localization: creating a universal method compatible with diverse PLM architectures, and extending the exploration to multilingual contexts. AMIG builds on the traditional integrated gradients approach, introducing new baseline vectors to improve compatibility with different architectures, such as auto-encoding models (e.g., BERT) and auto-regressive models (e.g., GPT). Adapting these baseline vectors is pivotal to identifying knowledge neurons effectively across multiple languages, overcoming the limitations of existing methods.
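
To make the localization step concrete, below is a minimal sketch of integrated-gradients attribution over a vector of FFN intermediate activations, in the spirit of the approach AMIG builds on. The scoring function, the zero baseline, and the relative threshold are illustrative assumptions for a toy example; the paper's method uses the PLM's probability of the correct answer and architecture-adapted baseline vectors.

```python
import torch

def integrated_gradients(activations, baseline, score_fn, steps=20):
    """Riemann-sum approximation of integrated gradients over neuron activations.

    activations: FFN intermediate activations observed for one prompt
    baseline:    reference activations (here simply zeros; AMIG instead
                 proposes architecture-adapted baseline vectors)
    score_fn:    differentiable score, e.g. the model's probability of the
                 correct answer as a function of these activations
    """
    total_grads = torch.zeros_like(activations)
    for k in range(1, steps + 1):
        # Interpolate between the baseline and the observed activations.
        point = (baseline + (k / steps) * (activations - baseline)).detach().requires_grad_(True)
        score_fn(point).backward()
        total_grads += point.grad
    # Scale the averaged gradients by the activation difference.
    return (activations - baseline) * total_grads / steps

# Toy usage with a hypothetical stand-in scoring function, not the paper's PLM.
acts, baseline = torch.randn(8), torch.zeros(8)
toy_score = lambda a: torch.sigmoid(a @ torch.linspace(0.1, 0.8, 8))
attributions = integrated_gradients(acts, baseline, toy_score)
# Neurons whose attribution exceeds a relative threshold are treated as
# knowledge neurons (the 0.3 factor is an illustrative convention).
knowledge_neurons = (attributions > 0.3 * attributions.max()).nonzero().flatten()
print(knowledge_neurons)
```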

Key Discoveries

  1. Language-Independent Knowledge Neurons (LIKN): The paper identifies neurons that store factual knowledge in a form that transcends language barriers. These neurons are found by intersecting the knowledge neurons localized for the same fact in different languages, indicating a shared storage mechanism within PLMs (see the sketch after this list). The discovery is validated through cross-lingual knowledge editing experiments, which show that editing these language-independent neurons manipulates the corresponding fact across multiple languages simultaneously, improving the effectiveness of cross-lingual editing.
  2. Degenerate Knowledge Neurons (DKN): The authors introduce the notion of degenerate knowledge neurons, which exhibit functional overlap, meaning multiple neurons can store identical factual knowledge. This principle mirrors the degeneracy observed in biological systems. The presence of DKNs contributes to the robust mastery of factual knowledge in PLMs, as evaluated by fact-checking experiments. These experiments demonstrate DKNs' utility in enhancing PLMs' stability and precision in recognizing and correcting erroneous facts.
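
As a concrete illustration of the intersection idea behind LIKN, the following sketch intersects per-language sets of localized knowledge neurons for a single fact; the (layer, index) coordinates are made up for illustration and are not results from the paper.

```python
from functools import reduce

# Knowledge neurons localized for the same fact, expressed in each language,
# as (layer, neuron index) pairs. These coordinates are made up for illustration.
neurons_by_language = {
    "en": {(9, 1823), (10, 77), (11, 305)},
    "zh": {(9, 1823), (10, 512), (11, 305)},
    "fr": {(9, 1823), (11, 305), (11, 990)},
}

# Neurons attributed to the fact in every language are candidates for
# language-independent knowledge neurons (LIKN).
likn = reduce(set.intersection, neurons_by_language.values())
print(sorted(likn))  # [(9, 1823), (11, 305)]
```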

Experimental Validation and Implications

Empirical analysis is conducted using multilingual versions of BERT and GPT models, revealing notable differences in the localization of knowledge neurons and their distributions across layers. Results indicate that LIKN and DKN enhance cross-lingual knowledge editing and fact-checking abilities, offering a framework for improved model reliability and accuracy.

  • The AMIG method localizes knowledge neurons more precisely than prior approaches, with a substantial increase in success rates observed on Chinese datasets.
  • The discovered LIKNs contribute to significant advances in cross-lingual tasks, outperforming existing methods by reducing redundant per-language computation.
  • The DKN detection process, which applies even to monolingual models, highlights the inherent robustness of factual knowledge storage in PLMs and underpins practical applications such as autonomous fact-checking without reliance on external databases (see the sketch below).
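
The sketch below illustrates, under assumed data, how degenerate knowledge neurons could drive fact checking: a claim is accepted only if every functionally overlapping neuron group attributed to the fact contains at least one strongly activated neuron. The activations, groupings, helper names, and threshold are hypothetical and simplify the paper's actual procedure.

```python
from typing import Dict, List, Tuple

Neuron = Tuple[int, int]  # (layer, neuron index)

def check_fact(
    activations: Dict[Neuron, float],
    degenerate_groups: List[List[Neuron]],
    threshold: float = 0.5,
) -> bool:
    """Accept a claim only if every degenerate group contains at least one
    strongly activated neuron; degeneracy means any member of a group can
    carry the fact on its own."""
    for group in degenerate_groups:
        if max(activations.get(n, 0.0) for n in group) < threshold:
            return False
    return True

# Hypothetical activations for a prompt expressing a (possibly wrong) claim,
# and hypothetical degenerate groups attributed to the corresponding fact.
acts = {(9, 1823): 0.82, (10, 512): 0.61, (11, 305): 0.12}
groups = [[(9, 1823), (10, 512)], [(11, 305), (11, 990)]]
print(check_fact(acts, groups))  # False: the second group never fires strongly
```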

Future Prospects

This research may shift how PLMs are understood and improved by exposing the intricate structure of their knowledge storage mechanisms. Future work could build on these findings to enhance model interpretability and reduce biases. The methodologies could also serve as a foundation for training schemes that target specific architectural facets of PLMs, advancing their capabilities across diverse applications.

In conclusion, the discoveries outlined in this paper make significant theoretical contributions and point to practical applications by providing a more nuanced understanding of how knowledge neurons function; they merit further exploration and development within the AI research community.

Authors (5)
  1. Yuheng Chen (16 papers)
  2. Pengfei Cao (39 papers)
  3. Yubo Chen (58 papers)
  4. Kang Liu (207 papers)
  5. Jun Zhao (469 papers)
Citations (34)