A Survey on Causal Discovery: Theory and Practice (2305.10032v1)
Abstract: Understanding the laws that govern a phenomenon is the core of scientific progress. This is especially true when the goal is to model the interplay between different aspects in a causal fashion. Indeed, causal inference itself is specifically designed to quantify the underlying relationships that connect a cause to its effect. Causal discovery is a branch of the broader field of causality in which causal graphs are recovered from data (whenever possible), enabling the identification and estimation of causal effects. In this paper, we explore recent advancements in a unified manner, provide a consistent overview of existing algorithms developed under different settings, report useful tools and data, and present real-world applications to show why and how these methods can be fruitfully exploited.