
Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models (2210.14328v1)

Published 25 Oct 2022 in cs.CL

Abstract: Structural probing work has found evidence for latent syntactic information in pre-trained LLMs. However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks. In this study, we causally probe multilingual LLMs (XGLM and multilingual BERT) as well as monolingual BERT-based models across various languages; we do this by performing counterfactual perturbations on neuron activations and observing the effect on models' subject-verb agreement probabilities. We observe where in the model and to what extent syntactic agreement is encoded in each language. We find significant neuron overlap across languages in autoregressive multilingual LLMs, but not masked LLMs. We also find two distinct layer-wise effect patterns and two distinct sets of neurons used for syntactic agreement, depending on whether the subject and verb are separated by other tokens. Finally, we find that behavioral analyses of LLMs are likely underestimating how sensitive masked LLMs are to syntactic information.
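The core method described in the abstract, patching a neuron's activation with its value under a counterfactual input and measuring the shift in the model's subject-verb agreement probability, can be sketched as follows. This is a minimal toy illustration, not the paper's actual setup: the two-neuron network, its weights, and the base/counterfactual encodings are all invented for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Toy weights (assumed, purely illustrative): input -> 2 hidden neurons -> 2 logits
# over {singular verb form, plural verb form}.
W1 = [[1.5, -0.5], [-1.0, 2.0]]   # input -> hidden
W2 = [[2.0, -2.0], [-2.0, 2.0]]   # hidden -> logits

def hidden(x):
    """ReLU hidden activations for input x."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]

def forward(x, patch=None):
    """Run the toy net; `patch` maps neuron index -> value, overriding that
    neuron's activation (the counterfactual intervention)."""
    h = hidden(x)
    if patch:
        for i, v in patch.items():
            h[i] = v
    logits = [sum(hi * wi for hi, wi in zip(h, col)) for col in zip(*W2)]
    return softmax(logits)

# Base input encodes a singular subject; counterfactual encodes a plural one.
base, counterfactual = [1.0, 0.0], [0.0, 1.0]

p_base = forward(base)[0]                 # P(singular verb), no intervention
h_cf = hidden(counterfactual)             # activations under the counterfactual
p_patched = forward(base, patch={0: h_cf[0]})[0]

# A large drop indicates neuron 0 causally mediates agreement in this toy net.
effect = p_base - p_patched
```

In the paper's actual experiments the same interchange logic is applied to individual neurons of XGLM and (m)BERT, aggregating effects per layer and per language; the toy network here only demonstrates the intervention mechanics.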

Authors (3)
  1. Aaron Mueller (35 papers)
  2. Yu Xia (65 papers)
  3. Tal Linzen (73 papers)
Citations (8)