Convolutive Non-negative Matrix Factorization (cNMF)

Updated 21 November 2025
  • cNMF is a decomposition technique that extends traditional NMF by integrating convolution, enabling analysis of sequential data.
  • It models temporal or spatial patterns effectively, making it suitable for applications like audio processing and time series analysis.
  • The method enforces nonnegativity and convolutional constraints, which lead to improved interpretability and precise feature localization.

Spatiotemporal causal graphical models are mathematical and algorithmic frameworks for representing, inferring, and exploiting the directed dependencies among variables indexed by both space and time. These models extend classical graphical causal models to systems in which interactions exhibit both temporal precedence and spatial structure, including but not limited to climate, neuroscience, epidemiology, biomedicine, and distributed cyber-physical systems. The core objective is to discover or model the directed acyclic graph (DAG), or in some cases the dynamic Bayesian network or structural causal model (SCM), that encodes cause-effect relationships among multivariate time series observed over gridded or networked spatial domains.

1. Conceptual Framework and Model Classes

Spatiotemporal causal graphical models instantiate a space–time–indexed set of random variables, typically denoted $X_i(s, t)$ for variable $i$ at spatial location $s$ and time $t$. The essential building block is a set of nodes, corresponding to variables or subsystems, organized over a spatial domain $S$ and temporal index set $T$. Directed edges $\left(X_j(s', t-\tau) \to X_i(s, t)\right)$ indicate that the past value of $X_j$ at location $s'$ and temporal lag $\tau$ exerts a direct causal effect on $X_i$ at $(s, t)$ (Supple et al., 30 Oct 2025).

Critical distinctions arise between three classes:

  • Observed-variable graphical models: Each node is an observable physical quantity, and the goal is to directly recover the graph $G$ from observed $X_i(s, t)$ (Supple et al., 30 Oct 2025, Jr. et al., 2010).
  • Latent-variable or factor models: Observed spatiotemporal fields are explained via a set of latent processes $Z_k(t)$ with spatial “footprints,” reducing effective dimensionality and aggregating correlated spatial nodes (Wang et al., 8 Nov 2024).
  • Mechanistic/structural models: Causal relations are encoded in stochastic difference equations, e.g., state-transition rules in SIR epidemics or patient-level latent Markov processes (Lee et al., 11 Jul 2025, Jr. et al., 2010).

The graphical model typically enforces acyclicity within each time slice and forward-time “causal order” for lagged links, sometimes distinguishing between contemporaneous (instantaneous) and lagged causal mechanisms (Supple et al., 30 Oct 2025, Wang et al., 8 Nov 2024).
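The two structural constraints just described (forward-time order for lagged links, acyclicity within a time slice) can be checked mechanically. The following is a minimal sketch, assuming a hypothetical edge encoding `(source, target, lag)` in which `lag = 0` marks a contemporaneous link and node labels may be any hashable value, such as a `(variable, location)` pair. None of these names come from the cited papers.

```python
from collections import defaultdict

def is_valid_temporal_graph(edges):
    """Check a lagged causal graph given as (source, target, lag) triples.

    Returns True when every lag is non-negative (no effect precedes its
    cause) and the lag-0 (contemporaneous) subgraph contains no cycle.
    Lagged feedback loops such as A(t-1) -> B(t) and B(t-1) -> A(t)
    are allowed, because unrolling them over time yields a DAG.
    """
    if any(lag < 0 for _, _, lag in edges):
        return False

    children = defaultdict(list)   # adjacency of the lag-0 subgraph only
    nodes = set()
    for src, dst, lag in edges:
        nodes.update((src, dst))
        if lag == 0:
            children[src].append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current DFS path, finished
    color = {n: WHITE for n in nodes}

    def has_cycle(n):
        color[n] = GRAY
        for c in children[n]:
            if color[c] == GRAY:               # back edge closes a cycle
                return True
            if color[c] == WHITE and has_cycle(c):
                return True
        color[n] = BLACK
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in nodes)
```

For example, `is_valid_temporal_graph([("A", "B", 1), ("B", "A", 1)])` accepts a lagged feedback loop, while the same pair of edges with `lag = 0` is rejected as a contemporaneous cycle.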

2. Mathematical Formalisms

The underlying model is often formalized as an SCM or SEM over space–time–indexed variables:

$$X_i(s, t) = f_i\left(\left\{X_j(s', t-\tau) : (j, s', t-\tau) \in \text{pa}_G[i, s, t]\right\},\; U(s),\; \varepsilon_i(s, t)\right)$$

where $\text{pa}_G[i, s, t]$ denotes the parent set in $G$, $U(s)$ represents latent spatial confounders (e.g., soil quality), and $\varepsilon_i(s, t)$ are i.i.d. noise innovations (Supple et al., 30 Oct 2025). In latent decomposition models, observations are generated as

$$X(s, t) = g\left(\sum_{k=1}^{K} z_k(t) \cdot f_k(s)\right) + \varepsilon(s, t)$$

where the $z_k(t)$ are latent time series, $f_k(s)$ are spatial factor functions (e.g., RBF kernels), and $g(\cdot)$ is an invertible nonlinearity (Wang et al., 8 Nov 2024).
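To make the latent decomposition concrete, the sketch below simulates data from a model of that form. Everything specific in it is an illustrative assumption rather than a detail of the cited work: a one-dimensional spatial grid on [0, 1], Gaussian RBF footprints with fixed centers and width, latent AR(1) dynamics with a single lagged link from factor 0 to factor 1, and a leaky-ReLU as the invertible nonlinearity.

```python
import numpy as np

def leaky(u, slope=0.2):
    """Invertible elementwise nonlinearity standing in for g (assumption)."""
    return np.where(u > 0, u, slope * u)

def simulate_latent_field(n_locations=50, n_steps=200, n_factors=3,
                          noise_std=0.1, seed=0):
    """Simulate X(s, t) = g(sum_k z_k(t) * f_k(s)) + noise.

    Returns X with shape (T, L), latent series Z with shape (T, K),
    and spatial factors F with shape (K, L). Requires n_factors >= 2.
    """
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, n_locations)        # 1-D grid (assumption)
    centers = np.linspace(0.1, 0.9, n_factors)    # RBF centers (assumption)
    width = 0.1                                   # RBF width (assumption)

    # f_k(s): one Gaussian RBF footprint per latent factor.
    F = np.exp(-0.5 * ((s[None, :] - centers[:, None]) / width) ** 2)

    # z_k(t): AR(1) series plus one lagged causal link z_0(t-1) -> z_1(t).
    Z = np.zeros((n_steps, n_factors))
    for t in range(1, n_steps):
        Z[t] = 0.5 * Z[t - 1] + rng.normal(size=n_factors)
        Z[t, 1] += 0.8 * Z[t - 1, 0]

    mixed = Z @ F                                 # sum_k z_k(t) f_k(s)
    X = leaky(mixed) + noise_std * rng.normal(size=mixed.shape)
    return X, Z, F
```

A call such as `X, Z, F = simulate_latent_field()` yields a 200 × 50 observation matrix. Data generated this way can serve as a ground-truth testbed, since the true latent graph (here a single lagged edge plus autoregressive self-links) is known by construction.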

For discrete-state models (e.g., SIR epidemics), the dependency structure is embedded in the transition probabilities of networked Markov chains (Jr. et al., 2010). In multivariate settings, plate notation may be used to clarify replicated structures over individuals, locations, time, or measurement channels (Lee et al., 11 Jul 2025).
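As an illustration of how a graph can be embedded in Markov transition probabilities, the following sketch implements one step of a discrete-time SIR process on a directed network. The parameterization is an assumption chosen for simplicity (independent per-contact infection probability `beta`, per-step recovery probability `gamma`, synchronous updates) and is not necessarily the model used in the cited work.

```python
import random

S, I, R = 0, 1, 2  # susceptible, infected, recovered

def sir_step(state, parents, beta, gamma, rng):
    """Advance a networked SIR chain by one synchronous time step.

    state   : list of node states (S, I or R) at time t
    parents : parents[v] lists the nodes with a directed edge into v
    beta    : probability that one infected parent infects v in a step
    gamma   : probability that an infected node recovers in a step
    rng     : random.Random instance
    """
    new_state = list(state)
    for node, st in enumerate(state):
        if st == S:
            n_inf = sum(1 for p in parents[node] if state[p] == I)
            # Each infected parent transmits independently, so the node
            # escapes infection with probability (1 - beta) ** n_inf.
            p_infect = 1.0 - (1.0 - beta) ** n_inf
            if rng.random() < p_infect:
                new_state[node] = I
        elif st == I:
            if rng.random() < gamma:
                new_state[node] = R
        # Recovered nodes are absorbing and never change state.
    return new_state
```

Because the infection probability of a node depends only on the states of its parents, the edge set of the network is exactly the dependency structure that a topology-recovery method would try to infer from observed state sequences.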

3. Causal Discovery and Inference Methodologies

A principal challenge in spatiotemporal causal discovery is distinguishing direct cause–effect relationships from confounding due to spatial and temporal autocorrelation. Several classes of algorithms have been developed:

  1. Conditional Independence (CI)-based algorithms:
    • Extended PC and PCMCI+ algorithms conduct CI testing over time-lagged and spatially indexed variables, adjusting for spatial confounders by incorporating spatial coordinates or smoothers (e.g., GAMs) as controls in regression-based tests (Supple et al., 30 Oct 2025).
    • Kernel-based CI tests (KCIT) for adjacency learning in spatiotemporal graphs rely on temporal embedding windows, coupled with statistical controls (e.g., SyPI filtering) to select direct causal parents (Mo et al., 25 Nov 2024).
    • These methods typically enforce maximum lag $T_\text{lag}$, local spatial neighborhoods, and acyclicity constraints.
  2. Variational and latent-factor approaches:
    • Variational autoencoding frameworks (e.g., SPACY) infer causal structure in a low-dimensional latent space, learning spatial factors as kernelized “modes” and a DAG over their time series. The evidence lower bound (ELBO) incorporates KL divergences for Bayesian regularization on latent dynamics, spatial kernels, and the graph itself (Wang et al., 8 Nov 2024).
  3. Penalized likelihood methods:
    • $\ell_1$-regularized maximum likelihood convex programs recover the support of the spatiotemporal dependence matrix directly from discrete event sequences (e.g., SIR transitions), allowing statistically consistent topology selection under suitable sparsity-inducing penalties (Jr. et al., 2010).
  4. Nonparametric and MDL-based regime detection:
    • SpaceTime employs minimum description length (MDL) principles with nonparametric Gaussian process regression to jointly infer causal graph structure, regime (changepoint) locations in time, and context-specific partitions in space. Kernelized Hilbert-Schmidt Independence Criterion (HSIC) is used to segment contexts and regimes (Mameche et al., 17 Jan 2025).
  5. Hybrid and feature-learning methods:
    • Symbolic-dynamics-based pattern networks extract discrete-valued features capturing directed temporal dependencies, summarizing them as feature vectors for unsupervised learning (e.g., Restricted Boltzmann Machines) to encode joint nominal modes and anomaly structure (Liu et al., 2015).
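The regression-based conditional independence tests in the first class can be illustrated with a deliberately simple stand-in. The sketch below uses linear partial correlation with a Fisher z-test; this is a swapped-in technique, not the GAM-based or kernel-based tests of the cited methods. The function names and the toy data (a common lagged driver `d` of two series `x` and `y`) are hypothetical. In practice, the control set could hold lagged parents or spatial coordinates.

```python
import math
import numpy as np

def partial_corr_pvalue(x, y, controls):
    """Test x independent of y given controls via linear partial correlation.

    Residualizes x and y on the controls (plus an intercept) by least
    squares, correlates the residuals, and returns (r, two-sided p-value)
    from the Fisher z-transform. `controls` has shape (n,) or (n, k).
    """
    n = len(x)
    Zc = np.column_stack([np.ones(n), controls])
    rx = x - Zc @ np.linalg.lstsq(Zc, x, rcond=None)[0]
    ry = y - Zc @ np.linalg.lstsq(Zc, y, rcond=None)[0]
    r = float(rx @ ry / math.sqrt(float(rx @ rx) * float(ry @ ry)))
    r = max(min(r, 0.999999), -0.999999)      # keep atanh finite
    k = Zc.shape[1] - 1                       # number of control variables
    z = math.atanh(r) * math.sqrt(n - k - 3)
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal p-value
    return r, p

def common_driver_demo(n=2000, seed=0):
    """x(t) and y(t) are both driven by d(t-1); neither causes the other."""
    rng = np.random.default_rng(seed)
    d = rng.normal(size=n)
    x = np.zeros(n)
    y = np.zeros(n)
    x[1:] = 0.9 * d[:-1] + 0.5 * rng.normal(size=n - 1)
    y[1:] = 0.9 * d[:-1] + 0.5 * rng.normal(size=n - 1)
    no_controls = np.empty((n - 1, 0))
    marginal = partial_corr_pvalue(x[1:], y[1:], no_controls)
    conditional = partial_corr_pvalue(x[1:], y[1:], d[:-1])
    return marginal, conditional
```

In the toy example, `x` and `y` are strongly correlated marginally, but the dependence should vanish once the lagged driver is included in the conditioning set, which is the pattern a CI-based algorithm uses to delete the spurious `x`–`y` edge.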

4. Identifiability, Assumptions, and Theoretical Guarantees

Statistical identifiability of spatiotemporal causal graphs relies on several layers of assumptions:

  • Causal Markov and faithfulness on the joint process $X(s, t)$ (Supple et al., 30 Oct 2025).
  • Causal sufficiency up to spatial confounders, i.e., no unmeasured confounders outside smooth spatial latent fields $U(s)$ or latent regime/context variables (Supple et al., 30 Oct 2025, Mameche et al., 17 Jan 2025).
  • Invertibility and independence in latent factor models, enabling recovery of spatial kernels and time series up to permutation and invertible transform, given linear independence of spatial factors (Wang et al., 8 Nov 2024).
  • Exogeneity of spatial coordinates for spatial adjustment in regression-based CI testing (Supple et al., 30 Oct 2025).
  • Piecewise stationarity, or at least discrete nonstationarity via changepoints or context blocks, with mechanisms persisting within each regime/context (Mameche et al., 17 Jan 2025).

Under these, consistency results assert exact recovery of the true structured DAG (and, where appropriate, spatial and temporal segmentation) as sample size grows, provided kernel, basis, or functional approximations are adequate and minimal regime length is enforced (Wang et al., 8 Nov 2024, Mameche et al., 17 Jan 2025).

5. Computational Considerations and Scalability

Spatiotemporal causal graph learning is inherently high-dimensional, given the combinatorial growth in the number of nodes over locations, measurement channels, and temporal lags. Key algorithmic strategies include:

  • Exploiting time’s arrow to limit the directionality and reduce effective search space, with $O(N^2 k_\text{max})$ scaling for edge searches, where $N$ is the number of nodes and $k_\text{max}$ the maximum conditioning set (Supple et al., 30 Oct 2025).
  • Vectorized and parallel computation for kernel evaluations (e.g., RBF-based spatial factors) in variational frameworks, permitting practical grid sizes up to $L = 10^4$ (Wang et al., 8 Nov 2024).
  • Edge screening and pre-selection by simple correlation or spatial proximity priors to sparsify candidate parent sets and reduce the number of conditional independence tests (Mo et al., 25 Nov 2024).
  • Blockwise or coordinate-descent optimization in penalized likelihood frameworks, with closed-form soft-thresholding or blockwise Newton updates (Jr. et al., 2010).
  • Nonparametric regression accelerations such as hierarchical GAMs and thin-plate splines for faster CI regressions (Supple et al., 30 Oct 2025).
  • MDL-based edge addition/removal heuristics for efficient structure selection, re-using fitted conditional models across greedily proposed modifications (Mameche et al., 17 Jan 2025).
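The closed-form soft-thresholding update mentioned for penalized likelihood frameworks can be sketched on a simpler surrogate problem. The example below applies coordinate descent to an ℓ1-penalized least-squares (lasso) objective rather than to the SIR likelihood of the cited work; the objective, function names, and iteration count are assumptions chosen to keep the sketch short.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * |.|: shrink v toward zero by lam."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def lasso_cd(A, b, lam, n_iters=200):
    """Minimize (1 / 2n) * ||b - A w||^2 + lam * ||w||_1 by coordinate descent.

    Each coordinate update has a closed form: a univariate least-squares
    fit to the partial residual followed by soft-thresholding.
    Assumes no column of A is identically zero.
    """
    n, p = A.shape
    w = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0) / n     # per-column curvature
    r = b - A @ w                         # full residual
    for _ in range(n_iters):
        for j in range(p):
            r += A[:, j] * w[j]           # residual with coordinate j removed
            rho = A[:, j] @ r / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= A[:, j] * w[j]           # restore the full residual
    return w
```

Coefficients whose partial correlation with the residual falls below `lam` are set exactly to zero, which is what makes the estimated support usable as a sparse set of candidate edges.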

6. Empirical Applications and Benchmarks

Spatiotemporal causal graphical models are empirically validated in several domains:

  • Climate science: SPACY discovers latent spatial teleconnection modes (e.g., North Atlantic Oscillation) and infers lagged/instantaneous causal links consistent with literature on El Niño, NAO, and AAO, using global surface temperature fields (Wang et al., 8 Nov 2024). SpaceTime reveals regime and context structure in precipitation-runoff coupling and biosphere-atmosphere fluxes (Mameche et al., 17 Jan 2025).
  • Distributed CPS and anomaly detection: Symbolic-dynamics-based methods coupled to RBMs distinguish nominal from anomalous system-wide patterns in building-integrated systems and synthetic subsector networks (Liu et al., 2015).
  • Epidemiology: $\ell_1$-penalized SIR network recovery achieves high sensitivity and specificity in reconstructing transmission networks from synthetic infection trajectories (Jr. et al., 2010).
  • Biomedical temporal and spatial modeling: Latent Markov and multi-graph approaches encode disease state trajectories and spatially correlated physiological markers, enabling doubly robust treatment effect estimation in clinical data (Lee et al., 11 Jul 2025).
  • Urban mobility and graph-based forecasting: Causal Adjacency Learning reduces out-of-distribution RMSE by up to 50% relative to static adjacency schemes for graph convolutional predictions of human mobility under COVID-induced shifts (Mo et al., 25 Nov 2024).

A tabular summary of representative frameworks and their applications:

| Framework/Method | Domain | Methodological Components |
|---|---|---|
| SPACY (Wang et al., 8 Nov 2024) | Climate, any gridded | Variational inference, spatial kernels, latent SCM, ELBO |
| SpaceTime (Mameche et al., 17 Jan 2025) | Hydroclimate, flux | MDL, nonparametric GPs, HSIC, regime/context segmentation |
| spatial-PCMCI+ (Supple et al., 30 Oct 2025) | Ecology, environment | GAM/CI regression, spatiotemporal PC algorithm |
| Causal Adjacency Learning (Mo et al., 25 Nov 2024) | Mobility | Kernel CI tests (KCIT), pre-selection, SyPI filtering |
| $\ell_1$-SIR (Jr. et al., 2010) | Epidemics | Convex penalized likelihood, blockwise optimization |
| STPN+RBM (Liu et al., 2015) | CPS, signals | Symbolic dynamics, PFSA, RBM free energy detection |

7. Limitations and Ongoing Challenges

Several limitations and open research questions remain. A plausible implication is that future work will need to address scalable automated model selection, richer spatial priors, multivariate spatial–temporal correlation structures, and flexible treatment of missing or irregularly sampled data.

