CUTS+: High-dimensional Causal Discovery from Irregular Time-series (2305.05890v2)

Published 10 May 2023 in cs.LG and stat.ME

Abstract: Causal discovery in time-series is a fundamental problem in the machine learning community, enabling causal reasoning and decision-making in complex scenarios. Recently, researchers have successfully discovered causality by combining neural networks with Granger causality, but the performance of these methods degrades considerably on high-dimensional data because of the highly redundant network design and huge causal graphs. Moreover, the missing entries in the observations further hamper the causal structural learning. To overcome these limitations, we propose CUTS+, which is built on the Granger-causality-based causal discovery method CUTS and raises the scalability by introducing a technique called Coarse-to-fine-discovery (C2FD) and leveraging a message-passing-based graph neural network (MPGNN). Compared to previous methods on simulated, quasi-real, and real datasets, we show that CUTS+ largely improves the causal discovery performance on high-dimensional data with different types of irregular sampling.

Authors (7)

Yuxiao Cheng (13 papers)
Lianglong Li (1 paper)
Tingxiong Xiao (8 papers)
Zongren Li (3 papers)
Qin Zhong (4 papers)
Jinli Suo (40 papers)
Kunlun He (15 papers)

Citations (12)

Summary

High-Dimensional Causal Discovery in Irregular Time-Series with CUTS+

The paper introduces CUTS+, a method for causal discovery in time-series that targets high-dimensional data with irregular sampling. Earlier approaches that combine neural networks with Granger causality perform poorly in this setting because their networks are highly redundant and the causal graphs they must learn are very large. CUTS+ improves scalability and performance with two components: Coarse-to-Fine Discovery (C2FD), and a Message-Passing-based Graph Neural Network (MPGNN) used for data imputation and causal graph learning.
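For orientation, the Granger-causality idea that such neural methods build on can be written roughly as follows. This is a generic sketch rather than the paper's exact formulation; the symbols $x_{t,j}$, $\tau$, $m_{ij}$, $f_j$, and $N$ are notation assumed here for illustration.

```latex
% Assumed notation (not taken from the summary):
%   x_{t,j}  : value of series j at time t
%   \tau     : maximum time lag considered
%   m_{ij}   : 1 if series i is allowed as an input when predicting series j, else 0
%   f_j      : a learned (neural) predictor for series j
\hat{x}_{t,j} = f_j\bigl( m_{1j}\, x_{t-\tau:t-1,\,1},\; \dots,\; m_{Nj}\, x_{t-\tau:t-1,\,N} \bigr)

% Series i is said to Granger-cause series j when masking it out
% (setting m_{ij} = 0) makes the prediction of x_{t,j} worse.
```

With $N$ series there are $N^2$ candidate entries $m_{ij}$, which is why the causal graph becomes hard to optimize as the dimension grows.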

Key Contributions

Coarse-to-Fine Discovery (C2FD): This technique addresses the very large adjacency matrix that high-dimensional data entails. The time-series are first allocated to a small number of groups, so early optimization works on a much smaller group-level graph and is computationally cheaper. The grouping is then progressively refined so that the causal graph is resolved with greater precision. This hierarchical approach keeps computational complexity under control without assuming a low-rank graph or imposing other structural constraints on the data.
Message-Passing-based Graph Neural Network (MPGNN): MPGNN addresses the parameter redundancy of the component-wise MLPs and LSTMs common in current methods, which train a separate network for every series. It instead shares weights across the nodes of the graph, so parameters are used efficiently while message passing still follows the causal dependencies encoded in the graph. The model stays expressive enough for complex input series while avoiding the performance loss caused by over-parameterization.
Empirical Validation and Performance: CUTS+ is evaluated on simulated, quasi-real, and real datasets under several missing-value scenarios, where it achieves better causal discovery accuracy and computational efficiency than the compared methods. It performs particularly well in settings with complex dependencies and high-dimensional series. The experiments indicate robustness to varying degrees of irregular sampling, with the improvements attributed primarily to C2FD and MPGNN.
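The two components above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`contiguous_membership`, `compose_graph`, `message_passing_step`), the contiguous grouping rule, the factorization of the full graph as a membership matrix times a group-level graph, and the schedule of 2, 4, then 8 groups are all assumptions chosen for illustration.

```python
import numpy as np

def contiguous_membership(n_series, n_groups):
    """Hypothetical grouping rule: assign consecutive series to the same group.
    Returns a 0/1 matrix of shape (n_series, n_groups)."""
    group_ids = np.arange(n_series) * n_groups // n_series
    membership = np.zeros((n_series, n_groups))
    membership[np.arange(n_series), group_ids] = 1.0
    return membership

def compose_graph(membership, group_graph):
    """Expand a group-level graph into a full series-level graph.
    group_graph[g, j] is the (assumed) strength of 'group g causes series j';
    the result[i, j] is the strength of 'series i causes series j'."""
    return membership @ group_graph

def message_passing_step(x, graph, w_msg, w_upd):
    """One message-passing step with weights shared by every node.
    x: (n_series, d) node features; graph[i, j] weights the edge i -> j."""
    messages = np.tanh(x @ w_msg)        # same w_msg for all nodes
    aggregated = graph.T @ messages      # node j sums messages from its parents
    return np.tanh(np.concatenate([x, aggregated], axis=1) @ w_upd)

n_series, d, hidden = 8, 3, 4
rng = np.random.default_rng(0)

# Coarse-to-fine schedule (assumed): the number of learnable graph entries
# grows only as the grouping is refined.
sizes = []
for n_groups in (2, 4, 8):
    membership = contiguous_membership(n_series, n_groups)
    group_graph = rng.uniform(size=(n_groups, n_series))  # stand-in for learned values
    full_graph = compose_graph(membership, group_graph)   # always (n_series, n_series)
    sizes.append(group_graph.size)

# Shared weights: their size depends on feature widths, not on n_series.
w_msg = rng.normal(size=(d, hidden))
w_upd = rng.normal(size=(d + hidden, d))
x = rng.normal(size=(n_series, d))
out = message_passing_step(x, full_graph, w_msg, w_upd)
```

In this sketch the coarse stage learns only `n_groups * n_series` graph entries instead of `n_series ** 2`, and the message-passing weights `w_msg` and `w_upd` do not grow with the number of series, which is the contrast with component-wise networks that the summary describes.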

Theoretical and Practical Implications

On a theoretical level, CUTS+ suggests that scalable causal discovery on high-dimensional time-series need not rely on conditional independence tests or low-rank assumptions. Practically, it opens up application areas such as genomics and atmospheric science, where data complexity previously hindered causal graph optimization. Future work building on CUTS+ may address latent variables, integration with distributed computing frameworks for even larger datasets, and real-time applications in predictive maintenance and cognitive computing.

In conclusion, CUTS+ advances both the methodology and the application scope of causal discovery in time-series, offering a way to handle the scalability and irregular-observation challenges of high-dimensional data. This can improve both understanding and forecasting in scientific and engineering problems that involve such data.

GitHub

GitHub - jarrycyx/UNN: Causal Neural Nerwork (116 stars)