Extent to which LLMs embody a theory for counterfactual and latent causal reasoning

Ascertain the extent to which large language models contain a theory capable of considering counterfactuals and latent causal factors in real-world settings.

Background

The paper evaluates whether LLMs can act as repositories of causal knowledge by recalling expert-identified confounders from the Coronary Drug Project (CDP) rather than deriving them through causal reasoning. The authors note that causal inference requires modeling counterfactuals and latent causal structures, whereas LLMs are trained to predict observed text, which raises doubts about their capacity for counterfactual reasoning.
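
To make the requirement concrete, here is the standard potential-outcomes statement of it, in conventional notation rather than anything taken from the paper: for a binary treatment $T$, each unit has two counterfactual outcomes $Y(1)$ and $Y(0)$, only one of which is ever observed, and the average effect $\tau$ is identified from observed data only if all confounders $X$ are conditioned on:

$$
\tau = \mathbb{E}[Y(1) - Y(0)], \qquad
\{Y(1), Y(0)\} \perp T \mid X
\;\Rightarrow\;
\tau = \mathbb{E}_X\big[\,\mathbb{E}[Y \mid T=1, X] - \mathbb{E}[Y \mid T=0, X]\,\big].
$$

Next-token prediction over observed text offers no obvious mechanism for representing the unobserved outcome $Y(1-t)$ or the latent set $X$, which is the gap the paper probes.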

This uncertainty motivates the paper’s focus on recall-based performance in a well-studied setting, the CDP, where expert-identified confounders are documented and likely present in LLM training data. This lets the authors test whether LLMs can reproduce expert causal assertions even without possessing a robust causal theory.
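
A minimal sketch of what such a recall-based evaluation could look like, not the authors' actual protocol: prompt a model for confounders in a documented study and score the answer against an expert list by simple string overlap. The `query_llm` stub and the confounder names below are illustrative placeholders, not the CDP's actual expert covariate list.

```python
def query_llm(prompt: str) -> str:
    """Stand-in for a real chat-completion call; returns a canned
    answer so the sketch runs end to end."""
    return "Adjust for baseline cholesterol and smoking status, among others."


def recall_score(llm_answer: str, expert_confounders: set[str]) -> float:
    """Fraction of expert-identified confounders mentioned in the answer."""
    answer = llm_answer.lower()
    mentioned = {c for c in expert_confounders if c.lower() in answer}
    return len(mentioned) / len(expert_confounders)


if __name__ == "__main__":
    # Placeholder names; the real expert list is documented in the
    # epidemiology literature on the Coronary Drug Project.
    experts = {"baseline cholesterol", "smoking status", "prior myocardial infarction"}
    prompt = (
        "In the Coronary Drug Project, which confounders should be adjusted "
        "for when estimating the effect of adherence on mortality?"
    )
    print(f"Recall of expert confounders: {recall_score(query_llm(prompt), experts):.2f}")
```

Because the CDP is heavily discussed in published text, a high recall score here would be consistent with memorization of expert assertions and would not, by itself, demonstrate counterfactual reasoning.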

References

Properly performing causal inference relies on theoretical understanding of real-world counterfactuals, while LLMs are instead trained to replicate actual observed text, and it is unclear the extent to which these models contain anything like a theory capable of considering counterfactuals or latent causal factors in real-world settings.

Huntington-Klein et al., "Do LLMs Act as Repositories of Causal Knowledge?" (arXiv:2412.10635, 14 Dec 2024), Introduction.