Coordinated Multi-Neighborhood Learning on a Directed Acyclic Graph (2405.15358v1)
Abstract: Learning the structure of causal directed acyclic graphs (DAGs) is useful in many areas of machine learning and artificial intelligence. However, in the high-dimensional setting, it is challenging to obtain good empirical and theoretical results without strong and often restrictive assumptions. Additionally, it is questionable whether all of the variables purported to be included in the network are observable. It is therefore of interest to restrict attention to a subset of the variables in order to obtain relevant and reliable inferences. In fact, researchers in various disciplines can usually select a set of target nodes in the network for causal discovery. This paper develops a new constraint-based method for estimating the local structure around multiple user-specified target nodes, enabling coordination in structure learning between neighborhoods. Our method facilitates causal discovery without learning the entire DAG structure. We establish consistency results for our algorithm with respect to the local neighborhood structure of the target nodes in the true graph. Experimental results on synthetic and real-world data show that our algorithm learns the neighborhood structures more accurately, and at much lower computational cost, than standard methods that estimate the entire DAG. An R package implementing our methods is available at https://github.com/stephenvsmith/CML.