- The paper introduces Do-PFN, a Prior-data Fitted Network trained on data from synthetic structural causal models to perform in-context learning for predicting interventional outcomes.
- It adapts the TabPFN transformer architecture to distinguish the treatment variable from covariates, yielding accurate estimates of conditional interventional distributions.
- Experiments demonstrate that Do-PFN outperforms baselines in CATE and CID estimation across both synthetic and hybrid real-world datasets.
Do-PFN: In-Context Learning for Causal Effect Estimation
The paper "Do-PFN: In-Context Learning for Causal Effect Estimation" introduces Do-PFN, a Prior-data Fitted Network designed to estimate causal effects from observational data by employing in-context learning. This method is trained on synthetic data deriving from structural causal models (SCMs), which helps Do-PFN to predict interventional outcomes without requiring explicit knowledge of causal graphs.
Methodology: Causal Inference with PFNs
Modeling Assumptions
Do-PFN assumes a prior distribution over SCMs, from which both observational and interventional datasets can be simulated by drawing parameters and noise from predefined distributions. The methodology targets settings where the assumptions behind traditional causal effect estimation (e.g., unconfoundedness) cannot be guaranteed.
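To make the data-generating setup concrete, here is a minimal sketch of sampling a toy SCM prior and drawing paired observational and interventional datasets. The linear-Gaussian structure and the names `sample_scm` and `sample_dataset` are illustrative assumptions; the paper's actual prior over SCMs is considerably richer.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_scm(n_covariates=3):
    """Draw a toy linear-Gaussian SCM in which covariates X cause both
    the treatment T and the outcome Y, so X confounds the T -> Y effect."""
    w_t = rng.normal(size=n_covariates)   # X -> T weights
    w_y = rng.normal(size=n_covariates)   # X -> Y weights (confounding path)
    tau = rng.normal()                    # T -> Y causal effect
    return w_t, w_y, tau

def sample_dataset(scm, n=500, do_t=None):
    """Simulate a dataset from the SCM. Passing do_t applies the
    intervention do(T=do_t), severing the X -> T edge."""
    w_t, w_y, tau = scm
    X = rng.normal(size=(n, len(w_t)))
    if do_t is None:
        T = (X @ w_t + rng.normal(size=n) > 0).astype(float)  # observational
    else:
        T = np.full(n, float(do_t))                           # interventional
    Y = X @ w_y + tau * T + rng.normal(size=n)
    return X, T, Y

scm = sample_scm()
X_obs, T_obs, Y_obs = sample_dataset(scm)            # context for ICL
X_int, T_int, Y_int = sample_dataset(scm, do_t=1.0)  # interventional targets
```

Each training task pairs an observational context with interventional targets drawn from the same SCM, which is what lets the network learn to map observational evidence to interventional predictions.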
Architecture and Training Details
Do-PFN adapts the transformer architecture used in TabPFN, with small modifications to its input representation that let the model distinguish the treatment variable from the covariates. Training minimizes the negative log-likelihood of interventional outcomes, conditioned on an observational dataset supplied in-context.
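A heavily simplified, hypothetical sketch of this objective follows; the names `DoPFNSketch` and `nll_step` are ours, and the real Do-PFN builds on TabPFN with PFN-style attention masking and a richer output distribution. The point is the shape of the computation: context rows carry (covariates, treatment, outcome), query rows flag the treatment to intervene on, and the loss is a Gaussian negative log-likelihood on interventional outcomes.

```python
import torch
import torch.nn as nn

class DoPFNSketch(nn.Module):
    """Hypothetical stand-in for Do-PFN. Each input row appends the treatment
    value and an is-query flag, so attention can tell treatment from covariates
    and context rows from query rows."""
    def __init__(self, n_cov, d=32):
        super().__init__()
        # row = [covariates..., treatment, outcome (0 for queries), is_query]
        self.embed = nn.Linear(n_cov + 3, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, 2)  # mean and log-variance of y | do(t), x

    def forward(self, rows):  # rows: (batch, seq, n_cov + 3)
        return self.head(self.encoder(self.embed(rows)))

def nll_step(model, X_obs, t_obs, y_obs, X_q, t_q, y_q):
    """NLL of interventional outcomes y_q given the observational context.
    (Omits the PFN attention mask that keeps queries independent.)"""
    ctx = torch.cat([X_obs, t_obs, y_obs, torch.zeros_like(t_obs)], dim=-1)
    qry = torch.cat([X_q, t_q, torch.zeros_like(t_q), torch.ones_like(t_q)], dim=-1)
    out = model(torch.cat([ctx, qry], dim=0).unsqueeze(0)).squeeze(0)
    mean, log_var = out[len(ctx):].unbind(-1)
    dist = torch.distributions.Normal(mean, torch.exp(0.5 * log_var))
    return -dist.log_prob(y_q.squeeze(-1)).mean()

# Toy usage: random tensors stand in for one sampled SCM's datasets.
n_cov, n_ctx, n_qry = 3, 50, 10
model = DoPFNSketch(n_cov)
loss = nll_step(model,
                torch.randn(n_ctx, n_cov), torch.randn(n_ctx, 1), torch.randn(n_ctx, 1),
                torch.randn(n_qry, n_cov), torch.ones(n_qry, 1), torch.randn(n_qry, 1))
loss.backward()  # an optimizer step over many sampled SCMs completes meta-training
```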
Figure 1: Do-PFN overview: Do-PFN conducts in-context learning (ICL) for causal effect estimation, predicting conditional interventional distributions (CIDs) based purely on observational data.
Experiments
Predicting Conditional Interventional Distributions (CIDs)
Do-PFN was tested against various baselines, including Random Forests and a variant called Dont-PFN that is trained to predict observational rather than interventional outcomes. Results show that Do-PFN outperforms these methods across diverse synthetic case studies, especially in scenarios where the causal effect is not identifiable from observational data alone. Its ability to implicitly account for the underlying causal graph structure underpins its interventional predictions across these scenarios.
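In notation (with $D_{\text{obs}}$ denoting the observational context dataset), the two variants target different predictive distributions:

$$
\text{Do-PFN:}\quad p\big(y \mid \mathrm{do}(T=t),\, x,\, D_{\text{obs}}\big)
\qquad\qquad
\text{Dont-PFN:}\quad p\big(y \mid T=t,\, x,\, D_{\text{obs}}\big)
$$

The gap between the two reflects the confounding that purely observational prediction ignores.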
Figure 2: Case studies: Graph structures in our six causal case studies necessitating automatic adjustments based on causal criteria.
Estimating Conditional Average Treatment Effects (CATEs)
In CATE estimation tasks, Do-PFN outperformed meta-learners and double machine learning methods. This can be attributed to its prediction bias being similar under both treatment arms: while such bias affects individual CID predictions, it largely cancels when the two arms are subtracted to form a CATE, leading to more precise effect estimates.
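A one-line version of this cancellation argument, under the simplifying assumption that Do-PFN's predicted interventional means $\hat{\mu}_t(x)$ carry an approximately treatment-independent bias $b(x)$ around the true means $\mu_t(x)$:

$$
\widehat{\mathrm{CATE}}(x) \;=\; \hat{\mu}_1(x) - \hat{\mu}_0(x)
\;=\; \big(\mu_1(x) + b(x)\big) - \big(\mu_0(x) + b(x)\big)
\;=\; \mu_1(x) - \mu_0(x).
$$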
Figure 3: Results on synthetic data: Do-PFN's performance in estimating CIDs, CATEs, and ATEs across tasks.
Hybrid Synthetic-Real-World Data Evaluations
Do-PFN was evaluated on hybrid synthetic-real-world datasets such as Amazon Sales, performing comparably to gold-standard DoWhy models that are given the ground-truth causal graph. For both CID prediction and CATE estimation, Do-PFN remained competitive, highlighting its robustness in practical applications.
Figure 4: Real-world case studies: Agreed-upon causal graphs for the Amazon Sales and Law School Admissions datasets.
Ablation Studies
The paper includes extensive ablation studies showing that Do-PFN is robust across varying dataset sizes and graph complexities: additional context data mitigates the influence of noise, and performance remains consistent as graph structures grow more complex.
Figure 5: In-distribution analysis: Ablation across base rates of ATE, dataset sizes, and graph complexities.
Discussion
Real-World Benchmarking
Generalization to real-world data depends on careful design of the SCM prior, since discrepancies between the modeled prior and real-world causal relationships can degrade performance. Further validation on real-world benchmarks is needed to establish robustness.
Trust and Interpretability
Because Do-PFN's causal reasoning is learned implicitly rather than expressed through an explicit causal graph, interpretability studies are needed to build trust in its automated causal inferences.
Extensions to Further Causal Tasks
Do-PFN's framework lends itself to extensions toward broader intervention scenarios and additional types of observational data. Its current iteration forms a basis for exploring more advanced causal inference tasks.
Conclusion
Do-PFN emerges as a promising tool that leverages a pre-trained network for causal effect estimation from observational data. The paper concludes with optimism about Do-PFN's integration into standard machine learning practice, given its ability to handle complex causal inference settings without requiring a causal graph as input.