Effective Bayesian Causal Inference via Structural Marginalisation and Autoregressive Orders (2402.14781v2)
Abstract: Bayesian causal inference (BCI) naturally incorporates epistemic uncertainty about the true causal model into downstream causal reasoning tasks by posterior averaging over causal models. However, this poses a tremendously hard computational problem due to the intractable number of causal structures to marginalise over. In this work, we decompose the structure learning problem into inferring (i) a causal order and (ii) a parent set for each variable given a causal order. By limiting the number of parents per variable, we can exactly marginalise over the parent sets in polynomial time, which leaves only the causal order to be marginalised. To this end, we propose a novel autoregressive model over causal orders (ARCO) that is learnable with gradient-based methods. Our method achieves state-of-the-art structure learning performance on simulated non-linear additive noise benchmarks with scale-free and Erdős-Rényi graph structures, and competitive results on real-world data. Moreover, we illustrate that our method accurately infers interventional distributions, which allows us to estimate posterior average causal effects and many other causal quantities of interest.
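The decomposition in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows why bounding the number of parents makes the sum over parent sets tractable once a causal order is fixed. The names `local_log_score`, `log_marginal_given_order`, and `num_parent_sets` are hypothetical, and the uniform prior over admissible parent sets and the node-wise factorisation of the score are assumptions made for illustration. The autoregressive model over orders (ARCO) is not shown.

```python
import math
from itertools import combinations


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def candidate_parent_sets(predecessors, max_parents):
    """Yield every subset of `predecessors` with at most `max_parents` elements."""
    for k in range(min(max_parents, len(predecessors)) + 1):
        yield from combinations(predecessors, k)


def log_marginal_given_order(order, local_log_score, max_parents):
    """Return log p(D | order), summing exactly over bounded parent sets.

    Assumptions (illustrative only): a uniform prior over each node's
    admissible parent sets, and a score that factorises over nodes, so the
    sum over whole graphs becomes a product of per-node sums.
    `local_log_score(node, parents)` is a hypothetical callable returning
    the log marginal likelihood of `node` given the tuple `parents`.
    """
    total = 0.0
    for position, node in enumerate(order):
        # Only variables earlier in the order may act as parents.
        predecessors = order[:position]
        parent_sets = list(candidate_parent_sets(predecessors, max_parents))
        log_prior = -math.log(len(parent_sets))
        total += logsumexp(
            [log_prior + local_log_score(node, pa) for pa in parent_sets]
        )
    return total


def num_parent_sets(num_vars, max_parents):
    """Count the local scores evaluated for one order (polynomial in num_vars)."""
    return sum(
        math.comb(position, k)
        for position in range(num_vars)
        for k in range(min(max_parents, position) + 1)
    )
```

For a fixed parent limit K, the number of local scores per order grows as O(d^(K+1)) in the number of variables d, whereas the number of DAGs grows super-exponentially. Under these assumptions, only the distribution over orders remains to be handled approximately.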