
Do LLMs Use Compositional Mechanisms to Solve Compositional Tasks?

Determine whether large language models solve compositionally structured tasks by invoking genuinely compositional processing mechanisms, rather than non-compositional or heuristic shortcuts.


Background

The paper examines how LLMs handle, within a single forward pass, two-hop factual retrieval tasks that can be formalized as y = g(f(x)). Although such tasks are compositional by construction, prior work and the authors' experiments reveal a persistent compositionality gap: success on each individual hop does not guarantee success on the composed mapping.
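To make the gap concrete, here is a minimal sketch of one standard way to quantify it, following the usual multi-hop definition: among examples where both hops succeed in isolation, the fraction where the composed query still fails. The function name `compositionality_gap` and the toy outcomes are illustrative placeholders, not data from the paper.

```python
# Sketch: compositionality gap = among examples where both hops are correct
# in isolation, the share where the composed query x -> g(f(x)) still fails.
def compositionality_gap(hop1_correct, hop2_correct, composed_correct):
    """Share of examples with both hops correct but the composition wrong."""
    both = [h1 and h2 for h1, h2 in zip(hop1_correct, hop2_correct)]
    failed = [b and not c for b, c in zip(both, composed_correct)]
    return sum(failed) / max(sum(both), 1)

# Fabricated outcomes for x -> f(x) (e.g., "Eiffel Tower" -> "France"),
# f(x) -> y (e.g., "France" -> "Paris"), and the composed query x -> y.
hop1 = [True, True, True, False]
hop2 = [True, True, False, True]
composed = [True, False, False, False]
print(compositionality_gap(hop1, hop2, composed))  # 0.5
```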

Using logit-lens analyses, the authors identify two distinct processing modes: a compositional mechanism that surfaces the intermediate variable f(x) in the model's hidden states, and a direct mechanism that maps x to y with no detectable intermediate representation. They further report that the choice of mechanism correlates with linearity properties of the model's embedding and unembedding spaces, suggesting that representational geometry shapes which strategy is used.
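The logit-lens diagnostic can be sketched concretely: project each layer's hidden state at the final position through the model's unembedding and check whether the intermediate entity f(x) ranks highly at some layer. The sketch below is illustrative, not the paper's exact setup: the model choice ("gpt2"), the prompt, and the intermediate token " France" are assumptions, and it follows the common convention of applying the final layer norm before unembedding.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM with an accessible unembedding works similarly
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Composed two-hop query: f(x) = France (intermediate), g(f(x)) = Paris (answer).
prompt = "The capital of the country where the Eiffel Tower stands is"
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Project each layer's hidden state at the last position through the
# unembedding (after the final layer norm, per the standard logit lens),
# and track the rank of the intermediate entity's first token.
final_ln = model.transformer.ln_f
unembed = model.lm_head
intermediate_id = tok(" France", add_special_tokens=False).input_ids[0]

for layer, h in enumerate(out.hidden_states):
    logits = unembed(final_ln(h[0, -1]))
    rank = (logits > logits[intermediate_id]).sum().item() + 1
    print(f"layer {layer:2d}: rank of ' France' = {rank}")
```

A high rank for the intermediate token at some middle layer would be evidence for the compositional mechanism; its absence, with the final answer appearing directly, would suggest the direct mechanism. A small model like gpt2 may show neither cleanly; the point is the measurement, not the result.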

Despite these empirical insights and correlations, the overarching question remains unresolved in general: whether LLMs' apparent compositional behavior is underpinned by genuinely compositional mechanisms across tasks, models, and settings.

References

While LLMs appear to be increasingly capable of solving compositional tasks, it is an open question whether they do so using compositional mechanisms.

How Do Language Models Compose Functions? (Khandelwal et al., 2 Oct 2025, arXiv:2510.01685), Abstract