
Learning Cyclic Causal Models from Incomplete Data (2402.15625v1)

Published 23 Feb 2024 in stat.ML, cs.AI, and cs.LG

Abstract: Causal learning is a fundamental problem in statistics and science, offering insights into predicting the effects of unseen treatments on a system. Despite recent advances in this topic, most existing causal discovery algorithms operate under two key assumptions: (i) the underlying graph is acyclic, and (ii) the available data is complete. These assumptions can be problematic as many real-world systems contain feedback loops (e.g., biological systems), and practical scenarios frequently involve missing data. In this work, we propose a novel framework, named MissNODAGS, for learning cyclic causal graphs from partially missing data. Under the additive noise model, MissNODAGS learns the causal graph by alternating between imputing the missing data and maximizing the expected log-likelihood of the visible part of the data in each training step, following the principles of the expectation-maximization (EM) framework. Through synthetic experiments and real-world single-cell perturbation data, we demonstrate improved performance when compared to using state-of-the-art imputation techniques followed by causal learning on partially missing interventional data.
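The abstract's alternation between an imputation (E) step and a likelihood-maximization (M) step can be illustrated with a small sketch. The snippet below is not the MissNODAGS implementation: it assumes a linear Gaussian structural equation model x = Wᵀx + e rather than the nonlinear residual-flow models the paper builds on, fills missing entries with their conditional Gaussian mean, and takes a single gradient step on the log-likelihood, whose log-determinant term is what keeps the model well-defined for cyclic graphs. All function names, the linear parameterization, and the gradient M-step are illustrative assumptions.

```python
# Toy EM-style sketch (assumed, not the authors' code): linear cyclic SEM
# x = W^T x + e,  e ~ N(0, sigma^2 I),  so x = (I - W^T)^{-1} e.
import numpy as np

def complete_data_loglik(X, W, sigma=1.0):
    """Log-likelihood of fully observed rows; the log-det term handles cycles."""
    d = W.shape[0]
    A = np.eye(d) - W.T                 # maps x to its implied noise, e = A x
    E = X @ A.T
    _, logdet = np.linalg.slogdet(A)
    quad = -0.5 * np.sum(E ** 2) / sigma ** 2
    return quad + X.shape[0] * (logdet - 0.5 * d * np.log(2 * np.pi * sigma ** 2))

def impute_missing(X, observed, W, sigma=1.0):
    """E-step: replace missing entries by their conditional Gaussian mean."""
    d = W.shape[0]
    A = np.eye(d) - W.T
    precision = (A.T @ A) / sigma ** 2  # x is zero-mean Gaussian with this precision
    X_imp = np.nan_to_num(X)
    for i in range(X.shape[0]):
        o, m = observed[i], ~observed[i]
        if m.any() and o.any():
            X_imp[i, m] = -np.linalg.solve(precision[np.ix_(m, m)],
                                           precision[np.ix_(m, o)] @ X_imp[i, o])
    return X_imp

def em_step(X, observed, W, lr=0.05, sigma=1.0):
    """One iteration: impute, then one gradient ascent step on the likelihood."""
    X_imp = impute_missing(X, observed, W, sigma)
    n, d = X_imp.shape
    A = np.eye(d) - W.T
    E = X_imp @ A.T
    # per-sample gradient of [quadratic term + log|det(I - W^T)|] w.r.t. W
    grad = (X_imp.T @ E) / (n * sigma ** 2) - np.linalg.inv(A)
    return W + lr * grad, X_imp

# Toy usage: 3-node cyclic graph, ~30% of entries missing completely at random.
rng = np.random.default_rng(0)
W_true = np.array([[0.0, 0.6, 0.0], [0.0, 0.0, 0.5], [0.4, 0.0, 0.0]])
E = rng.normal(size=(500, 3))
X = E @ np.linalg.inv(np.eye(3) - W_true.T).T
observed = rng.random(X.shape) > 0.3
X_missing = np.where(observed, X, np.nan)

W = np.zeros((3, 3))
for _ in range(300):
    W, X_imp = em_step(X_missing, observed, W)
# Reaches a stationary point of the likelihood; note that linear Gaussian cyclic
# models are generally not identifiable from observational data alone.
print("imputed-data log-likelihood:", round(complete_data_loglik(X_imp, W), 1))
```

In the paper's setting the M-step instead maximizes the expected log-likelihood of the observed portion of interventional data under a nonlinear additive noise model, but the impute-then-maximize loop above mirrors the same EM structure.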
