Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration (2404.11922v1)
Abstract: Effective causal discovery is essential for learning the causal graph from observational data. The linear non-Gaussian acyclic model (LiNGAM) determines the causal graph under the assumption of a linear data-generating process with non-Gaussian noise. Its assumption that unmeasured confounders are absent, however, poses practical limitations. In response, empirical research has shown that reformulating LiNGAM as a shortest path problem (LiNGAM-SPP) addresses this limitation. LiNGAM-SPP uses mutual information as its measure of independence, which introduces a new challenge: because it relies on kNN mutual information estimators, parameter tuning is required. The paper proposes a threefold enhancement to the LiNGAM-SPP framework. First, the need for parameter tuning is eliminated by using the pairwise likelihood ratio in lieu of kNN-based mutual information. This substitution is validated on a general data-generating process and on benchmark real-world data sets, where it outperforms existing methods, especially when given a larger set of features. Second, prior knowledge is incorporated through a node-skipping strategy applied to the graph representation of all causal orderings, which eliminates orderings that violate the relative orderings provided as input; this offers more flexibility than existing approaches. Third, the distribution of paths in the graph representation of all causal orderings is used to infer crucial properties of the true causal graph, such as the presence of unmeasured confounders and sparsity. To some extent, the expected performance of the causal discovery algorithm may also be predicted from this distribution. Together, these refinements advance the practicality and performance of LiNGAM-SPP, showcasing the potential of graph-search-based methodologies in causal discovery.
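To make the first enhancement concrete, the following is a minimal sketch of a pairwise likelihood ratio between two variables. It assumes the entropy-based form described by Hyvärinen and Smith (2013), with the maximum-entropy approximation of differential entropy from Hyvärinen (1998). The function names (`standardize`, `approx_entropy`, `pairwise_likelihood_ratio`) are illustrative choices, and the abstract does not confirm that the paper uses this exact estimator. Note that, unlike a kNN mutual information estimator, nothing here needs a neighborhood size or other tuning parameter.

```python
import numpy as np

# Constants of the maximum-entropy approximation (Hyvärinen, 1998).
# GAMMA is approximately E[log cosh(z)] for a standard Gaussian z.
K1, K2, GAMMA = 79.047, 7.4129, 0.37457


def standardize(u):
    """Return u rescaled to zero mean and unit variance."""
    u = np.asarray(u, dtype=float)
    return (u - u.mean()) / u.std()


def approx_entropy(u):
    """Approximate differential entropy of a standardized sample u."""
    gauss_entropy = (1.0 + np.log(2.0 * np.pi)) / 2.0
    term1 = (np.mean(np.log(np.cosh(u))) - GAMMA) ** 2
    term2 = np.mean(u * np.exp(-(u ** 2) / 2.0)) ** 2
    return gauss_entropy - K1 * term1 - K2 * term2


def pairwise_likelihood_ratio(x, y):
    """Entropy-based likelihood ratio for x -> y versus y -> x.

    A positive value favors x -> y, a negative value favors y -> x.
    """
    x, y = standardize(x), standardize(y)
    rho = np.mean(x * y)              # correlation of the standardized pair
    res_y = standardize(y - rho * x)  # residual of y regressed on x
    res_x = standardize(x - rho * y)  # residual of x regressed on y
    return (approx_entropy(y) + approx_entropy(res_x)) - (
        approx_entropy(x) + approx_entropy(res_y)
    )
```

Called on a pair where `y` is a linear function of `x` plus independent non-Gaussian noise, the ratio is expected to be positive. Swapping the arguments flips its sign, so one evaluation per pair is enough to score both directions.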