- The paper introduces a Shapley-based method to quantify the contribution of each attention head to morphosyntactic tasks.
- The paper demonstrates that attention heads cluster into consistent linguistic subnetworks, with six out of ten clusters robust across BERT and RoBERTa.
- The paper's pruning experiments reveal that removing key attention heads significantly impairs accuracy, underscoring their localized importance.
Linguistically Grounded Analysis of LLMs using Shapley Head Values
The study "Linguistically Grounded Analysis of LLMs using Shapley Head Values" presents a methodical investigation into how morphosyntactic phenomena are encoded within LLMs. It focuses on two widely used models, BERT and RoBERTa, employing Shapley Head Values (SHVs) to probe the roles of individual attention heads in processing linguistic information.
Objectives and Methods
The researchers use the BLiMP dataset, focusing on morphosyntactic paradigms such as anaphor agreement and filler-gap dependencies. The main contribution is the use of SHVs, a head-importance attribution method adapted from cooperative game theory, to quantify how individual attention heads contribute to linguistic tasks. SHVs are computed for each head and then used to cluster heads by their relative contributions, supplemented by qualitative linguistic analysis and quantitative pruning experiments.
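To make the attribution idea concrete, here is a minimal sketch of Monte Carlo Shapley estimation over attention heads. The head names (`h0`–`h3`) and the `accuracy` scoring function are hypothetical stand-ins for evaluating a real BERT/RoBERTa model on a BLiMP paradigm; they are not from the paper.

```python
import random

# Toy "model": task accuracy as a function of which attention heads are
# active. In the real setting this would be minimal-pair accuracy of
# BERT/RoBERTa with the remaining heads masked. (Hypothetical example.)
HEADS = ["h0", "h1", "h2", "h3"]

def accuracy(active):
    """Hypothetical accuracy given a set of active heads.
    h1 and h3 carry the signal, with an interaction between them."""
    score = 0.5
    if "h1" in active:
        score += 0.2
    if "h3" in active:
        score += 0.1
    if "h1" in active and "h3" in active:
        score += 0.1  # interaction term
    return score

def shapley_head_values(heads, score, n_samples=2000, seed=0):
    """Monte Carlo Shapley estimate: each head's value is its average
    marginal contribution to the score over random head orderings."""
    rng = random.Random(seed)
    shv = {h: 0.0 for h in heads}
    for _ in range(n_samples):
        order = heads[:]
        rng.shuffle(order)
        active = set()
        prev = score(active)
        for h in order:
            active.add(h)
            cur = score(active)
            shv[h] += cur - prev  # marginal contribution of h
            prev = cur
    return {h: v / n_samples for h, v in shv.items()}

shv = shapley_head_values(HEADS, accuracy)
```

By the efficiency property, the estimated values sum exactly to `accuracy(all heads) - accuracy(no heads)` in every sampled permutation, and heads that never change the score (here `h0`, `h2`) receive a value of zero.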
Key Findings
- Linguistic Subnetworks: The results indicate that attention heads responsible for similar linguistic phenomena tend to cluster, supporting the hypothesis of underlying subnetworks aligned with linguistic theories.
- Cluster Consistency: Six of ten clusters were consistent across models, indicating that both models rely on similar subnetworks for related linguistic phenomena. For instance, clusters associated with NPI licensing and binding paradigms aligned closely across the two models.
- Localized Head Importance: Pruning experiments revealed that certain vital attention heads, when removed, substantially impacted accuracy, demonstrating their localized importance within the models' subnetworks for specific phenomena.
- Model Sensitivity: RoBERTa's larger training corpus and capacity yield sharper cluster patterns than BERT's, suggesting that the quantity and quality of training data influence how well morphosyntactic knowledge generalizes within a model.
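The pruning finding above can be sketched with a toy comparison: removing a high-importance head should cost far more accuracy than removing a peripheral one. The head names and the `accuracy` function are hypothetical stand-ins for masking heads in a real model and re-running the minimal-pair evaluation.

```python
# Toy pruning experiment (hypothetical heads and scores, not from the paper).
def accuracy(active):
    """Hypothetical minimal-pair accuracy given the set of active heads."""
    score = 0.5
    if "h1" in active:  # key head for this phenomenon
        score += 0.25
    if "h3" in active:  # secondary contributor
        score += 0.15
    return score

all_heads = {"h0", "h1", "h2", "h3"}
full = accuracy(all_heads)

# Accuracy drop from pruning a key head vs. a peripheral head.
drop_key = full - accuracy(all_heads - {"h1"})
drop_peripheral = full - accuracy(all_heads - {"h0"})
```

In the paper's actual experiments this contrast is what demonstrates localized importance: pruning heads with high SHVs for a phenomenon substantially degrades accuracy on that phenomenon, while pruning low-SHV heads does not.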
Implications and Future Research Directions
The analysis offers a granular view of how LLMs are organized, highlighting how models may systematically encode linguistic knowledge through subnetworks aligned with specific morphosyntactic tasks. The implications for interpretability in NLP are substantial, and the approach also facilitates cross-linguistic model analysis.
Future research could extend this methodology to multilingual models and datasets, exploring differences across languages and testing whether similar subnetwork structures emerge universally. Additionally, a deeper dive into neuron-level attributions could refine our understanding of subnetwork granularity and its influence on linguistic processing.
In sum, this paper contributes a precise methodology for localizing linguistic knowledge within LLMs, offering a pathway toward more interpretable NLP systems.