
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes (2402.18477v3)

Published 28 Feb 2024 in cs.LG, cs.AI, and stat.ML

Abstract: Inferring the causal structure underlying stochastic dynamical systems from observational data holds great promise in domains ranging from science and health to finance. Such processes can often be accurately modeled via stochastic differential equations (SDEs), which naturally imply causal relationships via "which variables enter the differential of which other variables". In this paper, we develop conditional independence (CI) constraints on coordinate processes over selected intervals that are Markov with respect to the acyclic dependence graph (allowing self-loops) induced by a general SDE model. We then provide a sound and complete causal discovery algorithm, capable of handling both fully and partially observed data, and uniquely recovering the underlying or induced ancestral graph by exploiting time directionality assuming a CI oracle. Finally, to make our algorithm practically usable, we also propose a flexible, consistent signature kernel-based CI test to infer these constraints from data. We extensively benchmark the CI test in isolation and as part of our causal discovery algorithms, outperforming existing approaches in SDE models and beyond.
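The phrase "which variables enter the differential of which other variables" can be made concrete with a toy example. The sketch below is not the paper's algorithm or CI test. It assumes a hypothetical linear SDE dX_t = A X_t dt + sigma dW_t with independent noise per coordinate, reads the dependence graph (including self-loops) off the nonzero entries of the drift matrix A, and simulates one path with the Euler-Maruyama scheme. The names `dependence_graph`, `euler_maruyama`, `A`, and `sigma` are illustrative assumptions.

```python
import math
import random


def dependence_graph(drift):
    """Return edges (j, i) meaning X_j enters the differential of X_i.

    Assumption: linear drift, so X_j enters dX_i exactly when drift[i][j] != 0.
    Self-loops (i, i) are allowed.
    """
    d = len(drift)
    return {(j, i) for i in range(d) for j in range(d) if drift[i][j] != 0.0}


def euler_maruyama(drift, sigma, x0, dt, n_steps, seed=0):
    """Simulate dX = drift @ X dt + sigma dW with the Euler-Maruyama scheme.

    Each coordinate gets its own independent Brownian increment
    (diagonal, additive noise; an assumption of this sketch).
    """
    rng = random.Random(seed)
    d = len(x0)
    path = [list(x0)]
    for _ in range(n_steps):
        x = path[-1]
        nxt = []
        for i in range(d):
            mu = sum(drift[i][j] * x[j] for j in range(d))  # drift of coordinate i
            dw = rng.gauss(0.0, math.sqrt(dt))              # Brownian increment
            nxt.append(x[i] + mu * dt + sigma * dw)
        path.append(nxt)
    return path


# Hypothetical 3-variable chain X0 -> X1 -> X2, each with a self-loop.
A = [
    [-1.0, 0.0, 0.0],
    [0.8, -1.0, 0.0],
    [0.0, 0.8, -1.0],
]
edges = dependence_graph(A)
path = euler_maruyama(A, sigma=0.5, x0=[0.0, 0.0, 0.0], dt=0.01, n_steps=1000)
print(sorted(edges))
```

In this toy setting the graph is known by construction. The setting described in the abstract is the reverse problem: recovering such a graph from observed paths through conditional independence tests.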
