Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models (2106.06087v3)
Abstract: Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes -- notably, that larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.
- Matthew Finlayson
- Aaron Mueller
- Sebastian Gehrmann
- Stuart Shieber
- Tal Linzen
- Yonatan Belinkov