Use What You Know: Causal Foundation Models with Partial Graphs

Published 16 Feb 2026 in cs.LG | (2602.14972v1)

Abstract: Estimating causal quantities traditionally relies on bespoke estimators tailored to specific assumptions. Recently proposed Causal Foundation Models (CFMs) promise a more unified approach by amortising causal discovery and inference in a single step. However, in their current state, they do not allow for the incorporation of any domain knowledge, which can lead to suboptimal predictions. We bridge this gap by introducing methods to condition CFMs on causal information, such as the causal graph or more readily available ancestral information. When access to complete causal graph information is too strict a requirement, our approach also effectively leverages partial causal information. We systematically evaluate conditioning strategies and find that injecting learnable biases into the attention mechanism is the most effective method to utilise full and partial causal information. Our experiments show that this conditioning allows a general-purpose CFM to match the performance of specialised models trained on specific causal structures. Overall, our approach addresses a central hurdle on the path towards all-in-one causal foundation models: the capability to answer causal queries in a data-driven manner while effectively leveraging any amount of domain expertise.

Abstract PDF Upgrade to Chat

Authors (9)

Summary

The paper introduces a novel approach that integrates partial causal graphs into CFMs using learnable attention biases.
It demonstrates significant improvements in prediction accuracy and reduced uncertainty across various semi-synthetic benchmarks.
The methodology enables CFMs to leverage domain expertise for robust all-in-one causal inference, bridging bespoke and general-purpose models.

Overview of "Use What You Know: Causal Foundation Models with Partial Graphs" (2602.14972)

The paper "Use What You Know: Causal Foundation Models with Partial Graphs" (2602.14972) addresses the significant challenge of integrating domain knowledge into Causal Foundation Models (CFMs). Traditional estimators for causal inference have been specifically designed for particular assumptions, but CFMs offer a unified approach by amortizing causal discovery and inference in one step. However, current CFMs fail to incorporate domain knowledge, potentially leading to suboptimal predictions. This paper introduces methods to condition CFMs on partial causal information, improving their performance by injecting learnable biases into the attention mechanism.

Methodological Innovations

The authors propose a methodology for conditioning CFMs on available causal information like causal graphs or ancestral information. The proposed approach effectively leverages partial causal information when complete graph data is unavailable, narrowing down the set of plausible causal structures and thus enhancing prediction precision. This is visualized in Figure 1, where the model uses partial graph information to achieve a more concentrated posterior predictive distribution compared to using only observational data.

Figure 1: CFMs with partial graph information significantly reduce posterior uncertainty, enabling more precise predictions.

The paper explores multiple strategies for incorporating such information, with an emphasis on using learnable attention biases. Among the tested strategies, learnable attention biases were found to be most effective in both full and partial information scenarios, allowing CFMs to perform comparably with specialized models optimized for specific causal structures.

Numerical Results

The experimental results demonstrate a significant performance uplift when CFMs are conditioned with partial causal information. The results were robust across various semi-synthetic causal benchmarks and complex simulation environments, showing that CFMs with graph conditioning can match or exceed the performance of models tailored for particular causal structures.

Implications and Future Directions

This work resolves a crucial issue in the pathway to developing all-in-one CFMs: integrating domain expertise and observational data for answering causal queries. By effectively utilizing any available causal information, the proposed method makes CFMs more flexible and broadly applicable. The paper suggests that future research could focus on expanding the types of causal information utilized by CFMs, such as conditioning on aspects of functional mechanisms or noise distributions.

By tackling the challenge of conditioning with partial information, this research lays the groundwork for the next generation of CFMs. These models promise to provide robust all-in-one causal effect estimation, further blurring the lines between bespoke model design and general-purpose application in diverse causal inference scenarios.

Markdown Report Issue