Data-Driven Causal Effect Estimation Based on Graphical Causal Modelling: A Survey (2208.09590v2)
Abstract: In many fields of scientific research and real-world applications, unbiased estimation of causal effects from non-experimental data is crucial for understanding the mechanism underlying the data and for decision-making on effective responses or interventions. A great deal of research has addressed this challenging problem from different angles. For estimating causal effects from observational data, assumptions such as the Markov condition, faithfulness and causal sufficiency are always made. Under these assumptions, full knowledge, such as a set of covariates or an underlying causal graph, is typically required. A practical challenge is that in many applications no such full knowledge, or only partial knowledge, is available. In recent years, research has emerged that uses search strategies based on graphical causal modelling to discover useful knowledge from data for causal effect estimation under some mild assumptions, and it has shown promise in tackling this practical challenge. In this survey, we review these data-driven methods for causal effect estimation for a single treatment with a single outcome of interest and focus on the challenges faced by data-driven causal effect estimation. We concisely summarise the basic concepts and theories that are essential for data-driven causal effect estimation using graphical causal modelling but are scattered around the literature. We identify and discuss the challenges faced by data-driven causal effect estimation and characterise the existing methods by their assumptions and their approaches to tackling the challenges. We analyse the strengths and limitations of the different types of methods and present an empirical evaluation to support the discussions. We hope this review will motivate more researchers to design better data-driven methods based on graphical causal modelling for the challenging problem of causal effect estimation.