Causal FM-Based Models: Theory & Applications
- Causal FM-based models represent each variable as a function of its causes and independent noise; under suitable restrictions on the function and noise classes, this allows causal directions to be recovered uniquely.
- They utilize identifiable functional model classes that exploit nonlinearities and non-Gaussian noise to overcome limitations of traditional conditional independence methods.
- Practical algorithms, such as iterative sink node removal and residual independence testing, enable scalable discovery of causal structures in multivariate, temporal, and functional data.
Causal FM-based models—formally, Causal Functional Model-based models—are a family of approaches in causal discovery and inference that explicitly leverage functional relationships between variables to recover underlying causal structure, reason about interventions, and make counterfactual predictions. Unlike purely statistical or conditional independence-based methods, FM-based approaches encode and exploit the asymmetries inherent in the data-generating mechanisms via explicit functional equations, enabling more powerful identification results and practical algorithms—especially in the presence of nonlinearities or non-Gaussianity.
1. Foundations and Classes of Causal Functional Models
At the core of FM-based causal discovery is the representation of each variable as a function of its direct causes (parents) and a noise variable:
$$X_i = f_i(\mathrm{PA}_i, N_i), \quad i = 1, \dots, p,$$
where $f_i$ is a functional mechanism, $\mathrm{PA}_i$ denotes the parent set of $X_i$, and $N_i$ is a noise (exogenous) variable; the noise variables are jointly independent across $i$.
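As a minimal sketch of this representation, the snippet below samples from a small acyclic functional model by evaluating each mechanism in topological order. The function `sample_scm`, the three-variable graph, the mechanisms, and the uniform noise are all hypothetical choices made for illustration; they do not come from the cited papers.

```python
import numpy as np

def sample_scm(parents, mechanisms, n, seed=0):
    """Sample n draws from an acyclic functional model.

    parents:    dict mapping each variable to its list of parents; the dict
                order is assumed to be a topological order of the DAG.
    mechanisms: dict mapping each variable to f(pa, noise), where pa has
                shape (n, number of parents) and noise has shape (n,).
    """
    rng = np.random.default_rng(seed)
    data = {}
    for v in parents:
        noise = rng.uniform(-1.0, 1.0, n)  # N_v, drawn independently per variable
        if parents[v]:
            pa = np.column_stack([data[p] for p in parents[v]])
        else:
            pa = np.empty((n, 0))          # root node: no parents
        data[v] = mechanisms[v](pa, noise)
    return data

# Hypothetical example graph: X1 -> X2, X1 -> X3, X2 -> X3
parents = {"X1": [], "X2": ["X1"], "X3": ["X1", "X2"]}
mechanisms = {
    "X1": lambda pa, noise: noise,
    "X2": lambda pa, noise: np.tanh(2.0 * pa[:, 0]) + 0.3 * noise,
    "X3": lambda pa, noise: pa[:, 0] * pa[:, 1] + 0.3 * noise,
}
data = sample_scm(parents, mechanisms, n=1000)
```

Each variable is computed only after its parents, so the sampling order mirrors the causal order of the graph.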
A central advance is the introduction of Identifiable Functional Model Classes (IFMOCs). An IFMOC is a model class in which, for each variable, the data-generating functional relationship exhibits bivariate identifiability: if $Y = f(X, N_Y)$ with $N_Y \perp X$, there is no $g$ in the class such that $X = g(Y, N_X)$ with $N_X \perp Y$. This asymmetry means that the causal direction can be uniquely recovered from the observed distribution given the restrictions on the function and noise classes, an identifiability result that extends far beyond traditional linear or additive models and is not restricted to Markov equivalence classes (Peters et al., 2012).
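A short worked example, using illustrative symbols $a$, $b$, $\sigma_X$, $\sigma_N$ that are not taken from the cited paper, shows why such restrictions are needed: the linear Gaussian model admits a backward representation with independent noise and is therefore not bivariate identifiable. Assume $X \sim \mathcal{N}(0, \sigma_X^2)$ and a forward model with independent Gaussian noise:

$$
\begin{aligned}
Y &= aX + N_Y, \qquad N_Y \sim \mathcal{N}(0, \sigma_N^2), \quad N_Y \perp X, \\
b &= \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)} = \frac{a\sigma_X^2}{a^2\sigma_X^2 + \sigma_N^2}, \qquad N_X := X - bY, \\
\operatorname{Cov}(N_X, Y) &= \operatorname{Cov}(X, Y) - b\operatorname{Var}(Y) = 0 .
\end{aligned}
$$

Because $(N_X, Y)$ is jointly Gaussian, zero covariance implies independence, so $X = bY + N_X$ with $N_X \perp Y$ is an equally valid model in the reverse direction. Nonlinear mechanisms or non-Gaussian noise generically break this symmetry.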
Functional Causal Models encompass multiple structures and data regimes, including:
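- Additive Noise Models (ANMs): $X_i = f_i(\mathrm{PA}_i) + N_i$, where identifiability holds generically for nonlinear $f_i$ and, in the linear case, for non-Gaussian $N_i$.
- Post-Nonlinear Models: $X_i = g_i(f_i(\mathrm{PA}_i) + N_i)$ with an invertible outer function $g_i$, further generalizing ANMs.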
- Nonparametric Gaussian Process Convolution Models (GPCM/CGPCM): Time series generated by convolving past latent noise with nonparametric filters, enforcing causality by allowing only past-to-future influence (Bruinsma et al., 2018).
- Multivariate and Functional Data Models: Extensions to random vectors or infinite-dimensional functions in Hilbert spaces, using orthonormal basis expansions and linear/non-Gaussianity to obtain identifiability (Yang et al., 17 Jan 2024, Zhou et al., 2022, Roy et al., 2023).
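The relationship between the first two model classes in the list above can be sketched in a few lines: a post-nonlinear model is an additive noise model passed through an invertible distortion, so undoing the distortion recovers the additive form. The inner mechanism `f`, the distortion `g`, and the Laplace noise are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-2.0, 2.0, n)        # cause
noise = rng.laplace(0.0, 0.2, n)     # non-Gaussian noise, independent of x

def f(v):
    """Inner nonlinear mechanism (hypothetical)."""
    return v + np.sin(2.0 * v)

def g(z):
    """Invertible outer distortion (hypothetical)."""
    return np.sinh(z)

def g_inv(y):
    """Inverse of g."""
    return np.arcsinh(y)

y_anm = f(x) + noise                 # additive noise model
y_pnl = g(f(x) + noise)              # post-nonlinear model

# Applying g_inv to the post-nonlinear effect recovers the additive form.
recovered = g_inv(y_pnl)
```

In practice `g` is unknown and must be estimated jointly with `f`, which is what makes the post-nonlinear class harder to fit than plain ANMs.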
2. Identifiability: Theoretical Guarantees and Key Results
A defining feature of FM-based models is the capacity—under appropriate assumptions—to uniquely identify the full causal Directed Acyclic Graph (DAG) from observational data. The main identifiability theorem (as formalized for IFMOCs) states:
If the joint distribution of $X_1, \dots, X_p$ is induced by a functional model in an IFMOC with causal graph $G$, then no functional model in the same class with a different DAG $G' \neq G$ can induce the same distribution (Peters et al., 2012).
Key mathematical elements for identifiability:
- Bivariate identifiability: The absence of a backward (effect-to-cause) functional representation with independent noise within the same model class.
- Local identifiability: Restrictions on function classes after conditioning on parent sets (Definition 4 in Peters et al., 2012).
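- Non-independence in the effect variable: The noise term is independent of the cause but generally dependent on the effect, and this asymmetry implies directionality.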
Proofs are structured by contradiction: assuming two distinct DAGs produce the same joint distribution, iterative application of the Markov property and residual independence tests on sink nodes reveals inconsistencies.
Functional extensions—such as Func-LiNGAM—generalize these results to infinite-dimensional spaces, proving identifiability for acyclic models over random functions (Hilbert spaces) under non-Gaussianity (Yang et al., 17 Jan 2024).
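To illustrate how random functions are reduced to finite coefficient vectors, the sketch below projects discretized curves onto an orthonormal basis of $L^2[0, 1]$. The cosine basis, the trapezoid quadrature, and the function names are assumptions made for this example; the cited methods may use other bases, such as functional principal components.

```python
import numpy as np

def cosine_basis(t, k):
    """First k functions of the orthonormal cosine basis of L2[0, 1] on grid t."""
    rows = [np.ones_like(t)]
    rows += [np.sqrt(2.0) * np.cos(j * np.pi * t) for j in range(1, k)]
    return np.vstack(rows)                      # shape (k, len(t))

def basis_coefficients(curves, t, k):
    """Project curves of shape (n, len(t)) onto the basis; returns shape (n, k)."""
    w = np.full(t.size, t[1] - t[0])            # trapezoid quadrature weights
    w[[0, -1]] *= 0.5
    return (curves * w) @ cosine_basis(t, k).T  # inner products <curve, phi_j>

t = np.linspace(0.0, 1.0, 201)
phi = cosine_basis(t, 5)
curve = 1.5 * phi[1] - 0.7 * phi[3]             # a curve lying in the basis span
coefs = basis_coefficients(curve[None, :], t, 5)
```

Once every functional variable is summarized by such a coefficient vector, structure learning can operate on the coefficients instead of the raw curves.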
3. Practical Algorithms for Causal FM-Based Structure Discovery
Algorithmic strategies within FM-based models exploit both the functional structure and the independence of noise terms. Major components include:
- Iterative Sink Node Removal: Repeatedly identify a sink node (a variable with no children among those remaining) using independence tests, remove it, and recurse on the remaining variables, leveraging the functional model class and bivariate identifiability.
- Residual Independence Testing: Fit a candidate functional model for each variable given its putative parents and test whether the residuals are independent of those parents (e.g., with the Hilbert–Schmidt Independence Criterion, HSIC) to validate the functional model class assumption.
- Model Selection: Employ search strategies (often depth-first) to construct the full DAG by checking candidate structures for residual independence, terminating at “I do not know” if no model in the class fits adequately.
- Bayesian Approaches for Functional Data: For multivariate functional data, use Markov chain Monte Carlo (MCMC) inference on basis coefficients, graph structure, and hyperparameters under spike-and-slab priors, including uncertainty quantification via posterior samples (Zhou et al., 2022).
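The first two components above can be combined into a simple ordering procedure, sketched here under stated assumptions: kernel ridge regression stands in for the functional model fit, a biased HSIC estimate with Gaussian kernels and median-heuristic bandwidth stands in for the independence test, and the variable whose residuals look most independent of the rest is treated as the sink. All function names, the regularization value, and the toy data are hypothetical; this is not the exact algorithm of the cited papers.

```python
import numpy as np

def rbf_gram(a):
    """Gaussian Gram matrix of the rows of a, bandwidth = median pairwise distance."""
    a = a.reshape(len(a), -1)
    sq = np.sum((a[:, None, :] - a[None, :, :]) ** 2, axis=-1)
    bandwidth = np.sqrt(np.median(sq[sq > 0]))
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def hsic(a, b):
    """Biased HSIC estimate: larger values indicate stronger dependence."""
    n = len(a)
    h = np.eye(n) - 1.0 / n                      # centering matrix
    return np.trace(rbf_gram(a) @ h @ rbf_gram(b) @ h) / n ** 2

def krr_residuals(y, x, lam=0.5):
    """Residuals of a kernel ridge regression of y on x (y assumed centered)."""
    k = rbf_gram(x)
    return y - k @ np.linalg.solve(k + lam * np.eye(len(y)), y)

def causal_order(data, names):
    """Iteratively remove the variable that looks most like a sink node."""
    remaining = list(range(data.shape[1]))
    order = []
    while len(remaining) > 1:
        scores = []
        for j in remaining:
            others = [i for i in remaining if i != j]
            res = krr_residuals(data[:, j], data[:, others])
            scores.append(hsic(res, data[:, others]))
        sink = remaining[int(np.argmin(scores))]  # weakest residual dependence
        order.insert(0, names[sink])              # sinks go to the end of the order
        remaining.remove(sink)
    order.insert(0, names[remaining[0]])
    return order

# Toy data (hypothetical): X1 and X2 are independent roots, X3 is their common effect.
rng = np.random.default_rng(0)
n = 400
x1 = rng.uniform(-1.0, 1.0, n)
x2 = rng.uniform(-1.0, 1.0, n)
x3 = x1 ** 2 + x2 ** 2 + 0.2 * rng.uniform(-1.0, 1.0, n)
data = np.column_stack([x1, x2, x3])
data = (data - data.mean(axis=0)) / data.std(axis=0)   # standardize columns
order = causal_order(data, ["X1", "X2", "X3"])
```

Because X1 and X2 are both roots here, only the position of the sink X3 is determined; their relative order is arbitrary. A full procedure would follow the ordering with a parent-pruning step and a formal test instead of a raw score comparison.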
For time series, methods such as the Causal Gaussian Process Convolution Model (CGPCM) incorporate causal filtering, nonparametric kernel learning, and sophisticated variational inference schemes to capture and infer the underlying stationary or non-stationary causal mechanisms (Bruinsma et al., 2018, Rahmani et al., 20 Jun 2025).
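The causality constraint in convolution models can be illustrated with a discrete-time sketch in which the output at time $t$ depends only on present and past noise. The fixed exponential filter `h` is a hypothetical stand-in: the GPCM family places a nonparametric prior on the filter and infers it, which is not shown here.

```python
import numpy as np

def causal_convolve(noise, h):
    """y[t] = sum over k >= 0 of h[k] * noise[t - k]; no future noise enters."""
    return np.convolve(noise, h)[: len(noise)]

rng = np.random.default_rng(0)
noise = rng.standard_normal(200)
h = np.exp(-0.3 * np.arange(20))        # hypothetical decaying causal filter
y = causal_convolve(noise, h)

# Perturb the noise from t = 100 onward and recompute the output.
perturbed = noise.copy()
perturbed[100:] += rng.standard_normal(100)
y_perturbed = causal_convolve(perturbed, h)
```

Outputs before the perturbation time are unchanged, which is the past-to-future property the causal filter enforces.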
4. Comparison with Conventional Causal Discovery Paradigms
Traditional algorithms (e.g., PC, GES) depend on the Markov condition and faithfulness, which permit identification of only the Markov equivalence class rather than the full DAG; they therefore cannot, in general, orient edges on which Markov-equivalent models differ. These conditional independence-based methods do not exploit the asymmetries available in nonlinear or non-Gaussian regimes, and the faithfulness assumption they rely on cannot be directly tested from the data (Peters et al., 2012).
In contrast, FM-based models—by leveraging identifiability from functional and noise class asymmetries—uniquely recover the DAG and facilitate practical testing (by regressing and checking noise independence). Testing for IFMOC membership or model fit is typically more data-driven and transparent: if independence in residuals cannot be recovered under any causal ordering in the functional class, the method abstains from making inferences, thus controlling false discoveries.
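The regress-and-test logic with abstention can be sketched for two variables as follows. Polynomial regression, a permutation-based HSIC test, and the level `alpha = 0.05` are assumptions chosen for brevity, and all function names are hypothetical.

```python
import numpy as np

def centered_gram(v):
    """Doubly centered Gaussian Gram matrix (2 * sigma^2 = median squared distance)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    k = np.exp(-d2 / np.median(d2[d2 > 0]))
    k -= k.mean(axis=0, keepdims=True)
    k -= k.mean(axis=1, keepdims=True)
    return k

def hsic_pvalue(a, b, rng, n_perm=200):
    """Permutation p-value for the null hypothesis that a and b are independent."""
    ka, kb = centered_gram(a), centered_gram(b)
    stat = np.sum(ka * kb)
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(len(a))
        count += int(np.sum(ka * kb[np.ix_(p, p)]) >= stat)
    return (count + 1) / (n_perm + 1)

def residuals(target, regressor, degree=4):
    """Residuals of a polynomial regression of target on regressor."""
    coefs = np.polyfit(regressor, target, degree)
    return target - np.polyval(coefs, regressor)

def infer_direction(x, y, alpha=0.05, seed=0):
    """Accept a direction only if its residuals pass and the reverse fails."""
    rng = np.random.default_rng(seed)
    p_fwd = hsic_pvalue(residuals(y, x), x, rng)   # candidate model X -> Y
    p_bwd = hsic_pvalue(residuals(x, y), y, rng)   # candidate model Y -> X
    if p_fwd > alpha and p_bwd <= alpha:
        decision = "X -> Y"
    elif p_bwd > alpha and p_fwd <= alpha:
        decision = "Y -> X"
    else:
        decision = "I do not know"                 # abstain
    return decision, p_fwd, p_bwd

# Toy data (hypothetical): X causes Y through a non-invertible mechanism.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 300)
y = x ** 2 + 0.2 * rng.uniform(-1.0, 1.0, 300)
decision, p_fwd, p_bwd = infer_direction(x, y)
```

On data of this kind the backward model should be rejected, since the residual of regressing X on Y still depends on Y. Whether the forward model is accepted depends on the sample; if both directions pass or both fail, the function abstains instead of guessing.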
For functional and high-dimensional data, linear non-Gaussian acyclic models (LiNGAM) and their functional extensions (Func-LiNGAM, FLiNG-BN) provide identifiability where conventional approaches fail, because non-Gaussianity breaks the symmetry that makes causal directions indistinguishable under Gaussianity (Yang et al., 17 Jan 2024, Zhou et al., 2022).
5. Extensions: Temporal, Functional, and Feedback Modeling
Causal FM-based approaches have been extended to a wide variety of domains and data structures:
- Functional Data Analysis (FDA): Representing multivariate functions as expansions in orthonormal bases, applying linear or non-linear functional SEMs, and achieving identifiability under non-Gaussianity (Yang et al., 17 Jan 2024, Roy et al., 2023). Bayesian models for functional data use adaptive basis learning, hierarchical modeling, and explicit error modeling (e.g., mixtures of Gaussians).
- Temporal Causal Inference: FM-based models for stationary and non-stationary time series, such as CGPCM and FANTOM, handle heteroscedasticity, regime changes, and latent confounders by learning functional dynamics, adaptive causal graphs across regimes, and incorporating time segmentation and Bayesian EM techniques (Bruinsma et al., 2018, Rahmani et al., 20 Jun 2025).
- Feedback and Cyclic Models: While classical FMs often assume acyclicity, extensions to directed cyclic graphs (DCG) with functional SEMs have been developed, with identifiability results extending under conditions such as disjoint cycles and non-Gaussian noise (Roy et al., 2023).
- Alternative Representations: Fuzzy Cognitive Maps (FCMs) represent another generalization, modeling feedback systems with fuzzy-valued causal edges and forward-chaining nonlinear updates, optimized for scalability and pattern prediction in highly interconnected domains (Osoba et al., 2019).
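The forward-chaining update of a Fuzzy Cognitive Map, mentioned in the last item above, can be sketched as repeated application of a squashing function to weighted sums of concept activations. The logistic squashing function, the three-concept feedback loop, and the weight values are hypothetical choices for illustration.

```python
import numpy as np

def fcm_step(state, weights, steepness=1.0):
    """One update: each concept squashes the weighted sum of its causes.

    weights[i, j] is the signed causal influence of concept i on concept j.
    """
    return 1.0 / (1.0 + np.exp(-steepness * (state @ weights)))

def fcm_run(state, weights, max_steps=100, tol=1e-6):
    """Iterate the update until the state stops changing or max_steps is hit."""
    for _ in range(max_steps):
        new_state = fcm_step(state, weights)
        if np.max(np.abs(new_state - state)) < tol:
            return new_state
        state = new_state
    return state

# Hypothetical feedback loop: concept 0 -> 1 -> 2 -> 0
weights = np.array([
    [0.0, 0.8, 0.0],
    [0.0, 0.0, -0.6],
    [0.5, 0.0, 0.0],
])
initial = np.array([1.0, 0.0, 0.0])
final = fcm_run(initial, weights)
```

With these small weights the update is a contraction, so the iteration settles at a fixed point; larger weights or steeper squashing can instead produce limit cycles.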
6. Applications and Implications
The FM-based modeling paradigm has broad applicability in scientific domains where causal inference from rich observational data is essential:
- Economics and Finance: Inferring policy effects, feedback in market systems, causal structure in asset returns, and robust feature selection for forecasting using invariant causal prediction frameworks (Oliveira et al., 19 Aug 2024).
- Neuroscience: Mapping brain effective connectivity (fMRI/EEG) using functional LiNGAMs and Bayesian networks with non-Gaussian latent errors, enabling discovery of directional interactions in high-dimensional functional measurements (Yang et al., 17 Jan 2024, Zhou et al., 2022, Roy et al., 2023).
- High-Frequency and Continuous-Time Data: Leveraging causal functional models in continuous time Bayesian networks enables fine-grained modeling of causality in financial tick data and other temporally resolved signals (Hallgren et al., 2016).
A significant implication of the FM-based approach is the capability to ascertain the direction of causality and fully recover the ordering of the system’s variables, even in nonlinear and high-dimensional regimes where statistical conditional independence methods are insufficient. This enables causal queries—e.g., effect of interventions, counterfactual reasoning, and robust uncertainty quantification—directly from the functional representations.
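As a minimal sketch of an interventional query, the snippet below simulates a confounded linear system and compares the observational regression slope with the effect obtained by replacing the mechanism of X with a constant, which is how an intervention acts on a functional model. The graph Z -> X, Z -> Y, X -> Y, the coefficients, and the function `simulate` are hypothetical.

```python
import numpy as np

def simulate(n, rng, do_x=None):
    """Sample (x, y) from the model; do_x replaces the mechanism of X by a constant."""
    z = rng.uniform(-1.0, 1.0, n)                     # unobserved confounder
    if do_x is None:
        x = z + rng.uniform(-1.0, 1.0, n)             # observational mechanism for X
    else:
        x = np.full(n, float(do_x))                   # intervention do(X = do_x)
    y = 2.0 * x + 3.0 * z + rng.uniform(-1.0, 1.0, n)
    return x, y

rng = np.random.default_rng(0)
x_obs, y_obs = simulate(100_000, rng)
obs_slope = np.polyfit(x_obs, y_obs, 1)[0]            # slope of Y on X, confounded

_, y_do1 = simulate(100_000, rng, do_x=1.0)
_, y_do0 = simulate(100_000, rng, do_x=0.0)
causal_effect = y_do1.mean() - y_do0.mean()           # E[Y | do(X=1)] - E[Y | do(X=0)]
```

In this model the causal effect of a unit change in X is 2, while the observational slope is 2 + 3 Cov(Z, X) / Var(X) = 3.5; the gap is the confounding bias that the functional representation lets one remove.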
7. Future Directions and Open Challenges
Promising research directions include:
- Integration with Bayesian Structure Learning: Extending FM-based approaches to partially IFMOC-amenable graphs and hybrid systems, incorporating priors over functional forms and uncertainty about both structure and functional mechanisms.
- Scalability and Efficient Learning: Adapting these practical algorithms to very high-dimensional systems, leveraging advances in normalizing flows, neural approximators, and distributed regressors.
- Nonlinear and Nonstationary Dynamics: Expanding functional modeling frameworks to handle more general classes of time-varying, nonlinear causal mechanisms with latent variables or non-i.i.d. noise.
- Connection to Optimal Transport and Dynamical Systems: Recent advances frame functional causal models as transport maps and link identifiability criteria to properties such as volume preservation under dynamical flow, suggesting novel algorithmic and theoretical insights (Tu et al., 2022).
- Interpretability, Causal Feedback, and Human-in-the-Loop Discovery: Development of frameworks, such as Fuzzy Cognitive Maps, that explicitly incorporate human reasoning, feedback, and expert aggregation in causal modeling (Osoba et al., 2019).
Causal FM-based models thus form a theoretical and algorithmic foundation for advanced causal inference in modern data-rich applications. Their ability to uniquely identify underlying causal structure in challenging data regimes marks them as a key methodology for both computational and applied causal discovery.