
Post-selection inference for causal effects after causal discovery (2405.06763v2)

Published 10 May 2024 in stat.ME

Abstract: Algorithms for constraint-based causal discovery select graphical causal models among a space of possible candidates (e.g., all directed acyclic graphs) by executing a sequence of conditional independence tests. These may be used to inform the estimation of causal effects (e.g., average treatment effects) when there is uncertainty about which covariates ought to be adjusted for, or which variables act as confounders versus mediators. However, naively using the data twice, for model selection and estimation, would lead to invalid confidence intervals. Moreover, if the selected graph is incorrect, the inferential claims may apply to a selected functional that is distinct from the actual causal effect. We propose an approach to post-selection inference that is based on a resampling and screening procedure, which essentially performs causal discovery multiple times with randomly varying intermediate test statistics. Then, an estimate of the target causal effect and corresponding confidence sets are constructed from a union of individual graph-based estimates and intervals. We show that this construction has asymptotically correct coverage for the true causal effect parameter. Importantly, the guarantee holds for a fixed population-level effect, not a data-dependent or selection-dependent quantity. Most of our exposition focuses on the PC-algorithm for learning directed acyclic graphs and the multivariate Gaussian case for simplicity, but the approach is general and modular, so it may be used with other conditional independence based discovery algorithms and distributional families.
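The final step described in the abstract, combining individual graph-based estimates and intervals into a union confidence set, can be sketched in a few lines. This is an illustrative simplification, not the paper's exact construction: it assumes each retained graph already yields a point estimate and a standard error (e.g., from a Wald-type interval for an adjustment-based effect estimate), and the function name `union_confidence_set` is invented for this sketch.

```python
def union_confidence_set(estimates, std_errs, alpha=0.05):
    """Union of per-graph Wald intervals (illustrative sketch).

    estimates : per-graph point estimates of the target causal effect,
                one for each graph retained by the screening step.
    std_errs  : corresponding standard errors.
    Returns the smallest interval containing the union of the
    individual (1 - alpha) Wald intervals; covering with the union
    is what gives the asymptotic guarantee described in the abstract.
    """
    if alpha != 0.05:
        raise NotImplementedError("sketch hard-codes the 95% normal quantile")
    z = 1.959964  # standard normal quantile for a two-sided 95% interval
    intervals = [(e - z * s, e + z * s) for e, s in zip(estimates, std_errs)]
    lo = min(a for a, _ in intervals)
    hi = max(b for _, b in intervals)
    return lo, hi
```

In practice the union need not be an interval when the per-graph intervals are disjoint; the sketch above reports its convex hull, which is conservative but simple.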

