- The paper shows that LLMs can extract thematic role information, but their overall sentence embeddings prioritize syntax over semantic roles, the opposite of what is seen in human comprehension.
- The paper utilizes internal probing techniques to reveal that specific attention heads capture thematic roles more accurately than the aggregated hidden states.
- The study implies that focusing on attention mechanisms rather than standard embeddings may improve semantic tasks like information extraction and relation analysis.
This paper (2504.16884) investigates whether LLMs trained solely on the word prediction objective are capable of understanding "who did what to whom" in a sentence, also known as thematic role assignment. It contrasts LLMs' internal representations with human judgments, focusing on how syntax and thematic roles influence sentence representations. The paper employs internal probing techniques rather than direct prompting, arguing that probing reveals underlying competence without relying on meta-linguistic abilities or potential prompt-based heuristics.
The core finding is that while LLMs can extract thematic role information, their overall sentence representations are more strongly influenced by syntax than by thematic roles, which is the opposite of human behavior. However, the paper also reveals that thematic role information is robustly captured in specific attention heads, suggesting a difference in how this information is represented compared to humans and where it might be best accessed within the model architecture.
Here's a breakdown of the experiments and their practical implications:
Experiment 1: Representational Similarity of Simple Reversible Sentences
- Objective: To compare how thematic roles and syntax influence the overall sentence representations (distributed activity patterns across hidden units) in pre-trained LLMs (BERT, GPT2-Small, Llama2-7B, Persimmon-8B) compared to human similarity judgments.
- Methodology: Researchers created sets of simple, reversible sentences (e.g., "the tiger punched the panther") and variations that either maintained semantics (thematic roles) while changing syntax (passive voice) or changed semantics (swapped agent/patient) while maintaining syntax. They extracted sentence representations from the final hidden layer ([CLS] token for BERT, '.' token for others) and computed cosine similarities between these representations. These similarities were compared across conditions (same semantics/same syntax, same semantics/different syntax, different semantics/same syntax, different semantics/different syntax). Human participants provided similarity ratings for the same sentence pairs.
- Key Findings:
- For all tested LLMs, sentence pairs with the same syntax but different thematic roles (e.g., "the tiger punched the panther" vs. "the panther punched the tiger") were represented as more similar than sentence pairs with the same thematic roles but different syntax (e.g., "the tiger punched the panther" vs. "the panther was punched by the tiger").
- This pattern indicates that syntax had a stronger influence on the overall sentence representations in these LLMs than thematic roles.
- Human similarity judgments showed the opposite pattern: pairs with the same thematic roles were rated as more similar, regardless of syntax, demonstrating a stronger influence of thematic roles on human comprehension.
- Practical Implications: If you are using the overall sentence embedding (e.g., from the [CLS] token or average pooling) from a standard pre-trained transformer model in an application where understanding "who did what to whom" is critical (e.g., information extraction, semantic search, question answering), be aware that these embeddings might prioritize syntactic structure over the correct assignment of agent and patient, potentially leading to errors on sentences with different grammatical structures but similar meaning (like active/passive pairs). Simply comparing cosine similarity of pooled sentence embeddings might not reliably capture semantic similarity related to thematic roles in a human-like way.
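As a concrete illustration of the kind of comparison involved, here is a minimal sketch, assuming the Hugging Face Transformers library and the bert-base-uncased checkpoint (the exact checkpoints and pooling details may differ from the paper's setup). It extracts the final-layer [CLS] vector as the sentence representation and compares cosine similarities for an active/passive pair versus a role-swapped pair:

```python
# Minimal sketch: compare pooled-embedding cosine similarity for an
# active/passive pair vs. a role-swapped pair. Checkpoint and pooling
# choices here are illustrative assumptions, not the paper's exact setup.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def cls_embedding(sentence: str) -> torch.Tensor:
    """Return the final-layer [CLS] vector as the sentence representation."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[0, 0]  # [CLS] token, last layer

base = cls_embedding("the tiger punched the panther")
same_roles_diff_syntax = cls_embedding("the panther was punched by the tiger")
swapped_roles_same_syntax = cls_embedding("the panther punched the tiger")

cos = torch.nn.functional.cosine_similarity
print("same roles / passive  :", cos(base, same_roles_diff_syntax, dim=0).item())
print("swapped roles / active:", cos(base, swapped_roles_same_syntax, dim=0).item())
# Per the paper's Experiment 1, the second similarity tends to come out higher,
# i.e. syntax dominates the pooled representation.
```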
Experiment 2: Localizing Thematic Role Information in Complex Structures
- Objective: To determine if thematic role information is present anywhere within the LLM (even in a subset of hidden units or in attention heads), particularly using more syntactically complex ditransitive sentences.
- Methodology:
- Stimuli: More challenging stimuli using ditransitive verbs and various structures (active/passive, direct/prepositional object, simple/cleft sentences) where simple syntactic heuristics don't suffice to determine thematic roles. Pairs had either shared or opposite thematic roles.
- Hidden Units: Trained linear Support Vector Machine (SVM) classifiers on the difference (or concatenation) between the hidden unit activations of sentence pairs to predict whether they shared or had opposite thematic roles. Accuracy was tested against chance (0.5) across different layers and compared to human "implicit classification accuracy" derived from similarity judgments.
- Attention Heads: For BERT (due to its bidirectional attention), SVM classifiers were trained on attention weights between content words (subject, object, verb) to predict shared vs. opposite thematic roles. High-performing heads were then analyzed to see how they attended to different thematic roles.
- Key Findings:
- Hidden Units: While linear classifiers could often perform significantly above chance (0.5) in predicting thematic roles from hidden unit activations, accuracy was generally low (mostly < 0.6). The best-performing case, GPT2 layer 5, reached roughly 0.6 to 0.65, still below the human benchmark of 0.703. This suggests thematic role information is present but not strongly or easily linearly accessible in the aggregated hidden state.
- Attention Heads: In BERT, many attention heads showed high accuracy in classifying thematic roles, with some exceeding 0.79 accuracy, surpassing the human benchmark. Analysis of the highest-performing head (layer 11, head 5) revealed that it consistently assigned more attention from the verb and direct object towards the agent, and from the patient towards the agent, regardless of their grammatical position in the sentence.
- Practical Implications:
- Although the overall sentence representation (pooled hidden state) is syntax-dominant, thematic role information is encoded internally within the model, particularly within the attention mechanisms.
- If your application requires precise thematic role identification, analyzing or utilizing the outputs of specific attention heads might be a more effective strategy than relying on the final sentence embedding. This suggests potential architectures for semantic role labeling or relation extraction that could explicitly leverage attention weights from models like BERT.
- The finding that attention heads prioritize connections towards the agent ("who did it") suggests a potential internal representation related to causality or action initiation, which could be valuable for applications analyzing events.
- Implementing probes or extracting attention patterns requires deeper access to the model's internal structure than just using its final output embeddings or text generation capabilities. This would involve using libraries like Hugging Face Transformers to access intermediate layer outputs and attention weights.
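For readers who want to inspect attention directly, here is a minimal sketch, assuming Hugging Face Transformers and bert-base-uncased; the (layer, head) indices are stand-ins chosen to echo the paper's reported head, not a verified reproduction of its analysis pipeline:

```python
# Minimal sketch: inspect BERT attention weights between content words, in the
# spirit of the paper's attention-head analysis. The checkpoint and the
# specific (layer, head) indices are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "the tiger punched the panther"
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one (batch, heads, seq, seq) tensor per layer.
layer, head = 10, 4  # 0-indexed stand-ins for the paper's "layer 11, head 5"
attn = outputs.attentions[layer][0, head]  # (seq_len, seq_len)

# Map each word to its first subword token via the fast tokenizer's word_ids().
word_ids = inputs.word_ids(0)
first_token = {w: word_ids.index(i) for i, w in enumerate(sentence.split())}

agent, verb, patient = first_token["tiger"], first_token["punched"], first_token["panther"]
print("verb -> agent  :", attn[verb, agent].item())
print("verb -> patient:", attn[verb, patient].item())
# Differences in such weights across sentence pairs (shared vs. opposite roles)
# are the kind of features the paper feeds to its attention-head classifiers.
```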
Overall Implementation Takeaways:
- Standard embeddings (e.g., [CLS], mean pooling): Be cautious using these for tasks heavily reliant on distinguishing agent/patient, especially across different syntactic structures (active/passive). Their similarity seems to reflect syntax more than semantic roles.
- Accessing Thematic Roles: Thematic role information is available, but potentially resides more strongly in attention mechanisms. If your task requires robust thematic role understanding, consider:
- Training specific probing classifiers (like SVMs) on the hidden states or attention weights for your task, rather than assuming the default embeddings capture this well (a minimal probing sketch follows this list).
- Potentially designing architectures that explicitly utilize attention patterns, or fine-tuning on Semantic Role Labeling (SRL) data (though the paper focuses on models without such fine-tuning).
- Focusing analysis or extraction on content word interactions (verb-noun, noun-noun) within the attention mechanisms, especially in bidirectional models like BERT.
- Model Choice: The findings were broadly consistent across different base models tested, suggesting this is a general characteristic of word-prediction trained transformers. However, model size and architecture details might influence the degree to which this information is present and accessible. Models with different training objectives (e.g., those fine-tuned with RLHF) might exhibit different behavior.
- Complexity: The discrepancy between syntax and thematic roles seems to persist even with more complex sentence structures, though the signal becomes weaker in the overall hidden states.
- Probing vs. Prompting: The paper advocates for probing internal states for fundamental linguistic capabilities, arguing that prompting might mask true competence due to reliance on task-specific knowledge or heuristics. This suggests that while an LLM might answer a "who is the agent?" prompt correctly, its internal representation of the sentence might still be syntax-dominant.
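As a starting point for the probing approach suggested above, here is a minimal sketch, assuming scikit-learn and Hugging Face Transformers; the sentence pairs, labels, and [CLS] pooling are placeholders standing in for the paper's ditransitive stimuli and exact feature construction:

```python
# Minimal probing sketch (assumed setup, not the paper's exact pipeline):
# train a linear SVM to predict whether a sentence pair shares or reverses
# thematic roles, using the difference of pooled final-layer hidden states.
import numpy as np
import torch
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(sentence: str) -> np.ndarray:
    """Final-layer [CLS] vector as a numpy array."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[0, 0].numpy()

# Placeholder pairs; the paper uses many ditransitive/cleft structures.
pairs = [
    ("the tiger punched the panther", "the panther was punched by the tiger", 1),  # shared roles
    ("the tiger punched the panther", "the panther punched the tiger", 0),         # opposite roles
    ("the dog chased the cat", "the cat was chased by the dog", 1),
    ("the dog chased the cat", "the cat chased the dog", 0),
]

X = np.stack([embed(a) - embed(b) for a, b, _ in pairs])
y = np.array([label for *_, label in pairs])

probe = LinearSVC(C=1.0, max_iter=10000)
# With a realistically sized dataset, cross-validated accuracy vs. chance (0.5),
# computed per layer, is the quantity the paper reports.
print("probe accuracy:", cross_val_score(probe, X, y, cv=2).mean())
```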
In essence, while pre-trained LLMs trained on word prediction learn complex linguistic patterns including syntax, their inherent representation of semantic roles like agent and patient, as reflected in their overall sentence embeddings, is weaker and less human-like than their representation of syntactic structure. However, the capacity to extract this semantic role information exists within the network, particularly visible in how attention mechanisms link words within a sentence. Accessing this specific type of meaning might require methods that look beyond the standard aggregated sentence vector.