Functional Causal Models

Updated 19 September 2025

Functional Causal Models (FCMs) define each variable as a function of its direct causes and an independent noise term, which makes causal relationships explicit and analyzable.
Identifiability and causal discovery in FCMs draw on identifiable functional model classes (IFMOCs), the quantile partial effect (QPE), and functional dependencies, combining numerical and graphical methods.
FCMs extend to infinite-dimensional and dynamic settings through kernel methods, alignment algorithms, and Bayesian models, with applications in neuroscience and biomedical monitoring.

Functional Causal Models (FCMs) formalize complex causal structures by specifying each variable as a deterministic or random function of its direct causes and, typically, an independent noise variable. FCMs encompass a broad landscape that includes structural equation models, nonlinear and non-Gaussian mechanisms, and infinite-dimensional or functional data structures. They underpin identification theory, causal discovery algorithms, and recent high-dimensional and dynamic causal inference procedures.

1. Mathematical Structure of Functional Causal Models

A classical FCM expresses each variable $X_i$ as

$X_i = f_i(\mathsf{PA}_i, N_i)$

where $\mathsf{PA}_i$ are the parent variables of $X_i$ in the causal graph, $N_i$ is an independent noise variable, and $f_i$ is a potentially nonlinear function. The vector of noise variables $(N_i)_i$ is assumed jointly independent, and the system can be represented as a Directed Acyclic Graph (DAG) or, in extensions, as a directed cyclic graph for feedback systems.
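As a minimal sketch of the structural assignments above, the following Python snippet samples from a hypothetical three-variable FCM with graph Z → X, Z → Y, X → Y. The particular functions, noise distributions, and the `do_x` argument are illustrative assumptions and do not come from any cited work. Each variable is computed from its parents and its own independent noise term, and an intervention replaces one assignment with a constant.

```python
import numpy as np

def sample_fcm(n, rng, do_x=None):
    """Sample from a hypothetical FCM with edges Z -> X, Z -> Y, X -> Y."""
    # Jointly independent noise variables N_Z, N_X, N_Y.
    n_z = rng.normal(size=n)
    n_x = rng.uniform(-1.0, 1.0, size=n)
    n_y = rng.normal(scale=0.5, size=n)

    z = n_z                                  # Z has no parents
    if do_x is None:
        x = np.tanh(z) + n_x                 # X = f_X(Z, N_X)
    else:
        x = np.full(n, float(do_x))          # intervention do(X = do_x)
    y = x ** 2 + 0.5 * z + n_y               # Y = f_Y(X, Z, N_Y)
    return z, x, y

rng = np.random.default_rng(0)
z, x, y = sample_fcm(10_000, rng)                  # observational samples
_, x_do, y_do = sample_fcm(10_000, rng, do_x=1.0)  # samples under do(X = 1)
```

Under the intervention, the assignment for X no longer depends on Z, so the mean of `y_do` approximates the interventional mean of Y, which for this toy model is 1.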

In the functional data context—such as in neuroimaging or longitudinal biomedical monitoring— $X_i$ may take values in an infinite-dimensional space (e.g., $L^2$ or a space of registered curves), and the structural assignments become operator-valued functional relations.

Variations include:

Identifiable Functional Model Classes (IFMOCs): Restrictions on $f_i$ ensuring that for any bivariate system, the direction of causation is identifiable from the joint distribution (Peters et al., 2012).
Functional dependencies: Some variables are deterministic functions of their parents, that is, $\operatorname{Pr}(x \mid \mathsf{PA}(X)) \in \{0,1\}$. Such variables can be eliminated or projected out, which helps identifiability (Chen et al., 2024).
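The functional-dependency condition can be illustrated with a small empirical check. The helper below is a hypothetical illustration, not part of the method in Chen et al. (2024). It tests whether a variable takes exactly one value for every observed parent configuration. Finite data can refute determinism this way but can never prove it.

```python
from collections import defaultdict

def is_functionally_determined(rows, child, parents):
    """True if `child` takes a single value for each observed parent configuration."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[p] for p in parents)
        seen[key].add(row[child])
    return all(len(values) == 1 for values in seen.values())

# Toy records: c = a XOR b is deterministic, d is not.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 0, "b": 1, "c": 1, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 0},
    {"a": 1, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0},
]

c_is_functional = is_functionally_determined(rows, "c", ["a", "b"])
d_is_functional = is_functionally_determined(rows, "d", ["a", "b"])
```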

2. Causal Discovery and Identifiability Results

Several foundational results establish conditions under which FCMs support full identifiability of causal structures:

Markov Condition and Faithfulness: Traditional approaches such as the PC algorithm rely on the factorization of the joint distribution according to the DAG (Markov property) and the assumption that (conditional) independence statements in the observed data reflect the graphical structure (faithfulness) (Peters et al., 2012). However, these only identify graphs up to Markov equivalence.
IFMOCs and Full Identifiability: IFMOCs (Peters et al., 2012) require that no bivariate functional relationship in the model can be reversed without violating noise independence, which makes the entire DAG uniquely identifiable from the observational distribution. The key sufficient conditions are stated as bivariate identifiability conditions on the function class, ruling out mechanisms that could be “flipped” into the anti-causal direction.
Quantile Partial Effect (QPE): QPE generalizes FCM identifiability without explicit model assumptions by measuring the effect of covariates at different quantiles of the response distribution. If the QPE lies in a finite linear span, cause and effect are identifiable solely from the observational distribution. This framework generalizes identifiability results for additive, heteroscedastic, and post-nonlinear noise models, and leverages direct features of the observed distributions rather than latent mechanisms (Chen et al., 16 Sep 2025).
Cyclic FCMs and $p$-Separation: For FCMs on graphs with cycles and finite-cardinality variables, $p$-separation generalizes $d$-separation to a sound and complete conditional independence criterion. The unique assignment of distributions is possible even in non-uniquely solvable cyclic systems, provided consistency (existence of solutions) holds (Ferradini et al., 6 Feb 2025).
Functional dependencies and elimination: When some variables are deterministic functions of their parents, elimination via edge-shortening and variable removal—termed "functional elimination"—can unlock identification in previously non-identifiable models and reduce the necessity for measuring every variable (Chen et al., 2024).
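The idea that a mechanism cannot be reversed without breaking noise independence can be sketched with a generic additive-noise heuristic: regress in both directions and compare how strongly the residuals depend on the putative cause. This is a simplified illustration, not the IFMOC procedure of Peters et al. or the QPE criterion. The polynomial regression, the biased HSIC estimate with a median-heuristic Gaussian kernel, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def gram(v):
    """Gaussian kernel matrix with a median-heuristic bandwidth."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / np.median(d2[d2 > 0]))

def hsic(a, b):
    """Biased HSIC estimate; larger values indicate stronger dependence."""
    n = len(a)
    h = np.eye(n) - 1.0 / n                  # centering matrix I - (1/n) 1 1^T
    return np.trace(gram(a) @ h @ gram(b) @ h) / n ** 2

def residual_dependence(cause, effect, degree=5):
    """Fit effect = poly(cause) + residual and measure dependence(cause, residual)."""
    coef = np.polyfit(cause, effect, degree)
    residual = effect - np.polyval(coef, cause)
    return hsic(cause, residual)

# Simulated data whose true direction is x -> y with additive noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=300)
y = x ** 3 + rng.normal(size=300)

forward = residual_dependence(x, y)          # residuals of y given x
backward = residual_dependence(y, x)         # residuals of x given y
direction = "x -> y" if forward < backward else "y -> x"
```

In the causal direction the residuals approximate the independent noise, so their dependence on the regressor should be small. In the reverse direction the residual spread varies with the regressor, which the dependence measure picks up. A practical procedure would replace the raw comparison with a calibrated independence test.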

3. FCMs for Functional Data and High-Dimensional Outcomes

Recent advances have tailored FCMs to the context of functional data—outcomes or covariates that are curves, images, or more general infinite-dimensional objects.

Fréchet Mean-Based Causal Inference

The framework in (Raykov et al., 6 Mar 2025) generalizes the classical potential outcome approach by defining causal effects via Fréchet means in metric spaces that encapsulate the infinite-dimensional structure of the outcomes:

For treatment $x \in \mathcal{X}$ , the functional Fréchet mean is

$F(\bm{Y}^{(x)}) = \arg\min_{f\in \mathcal{F}} \int_{\mathcal{F}} \phi^2(f, g) d\eta_x(g)$

where $\eta_x$ is the potential outcome distribution under $x$, and $\phi$ is an appropriate metric on the outcome space $\mathcal{F}$ (e.g., $L^2$, Fisher–Rao).

The dynamic average treatment effect (dATE) is then

$\varphi^{\mathrm{dATE}} = \phi(F(\bm{Y}^{(1)}), F(\bm{Y}^{(0)}))$

with pointwise difference

$\Delta(t) = F(\bm{Y}^{(1)})(t) - F(\bm{Y}^{(0)})(t)$
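For the special case of the $L^2$ metric, the Fréchet mean of a sample of curves on a common grid is the pointwise average, which makes the definitions above easy to illustrate. The sketch below assumes curves observed on a shared grid and treats the observed treated and control groups as stand-ins for the potential outcomes (as under randomization). The toy curves and function names are hypothetical.

```python
import numpy as np

def frechet_mean_l2(curves):
    """Empirical Fréchet mean under the L2 metric: the pointwise average curve."""
    return curves.mean(axis=0)

def l2_distance(f, g, t):
    """L2 distance between two curves on grid t, using the trapezoid rule."""
    diff2 = (f - g) ** 2
    return np.sqrt(np.sum(0.5 * (diff2[1:] + diff2[:-1]) * np.diff(t)))

t = np.linspace(0.0, 1.0, 101)
base = np.sin(2 * np.pi * t)
offsets = np.array([-0.1, 0.0, 0.1])[:, None]

control = base + offsets                     # three control curves
treated = base + 1.0 + offsets               # three treated curves, shifted up by 1

mean_treated = frechet_mean_l2(treated)
mean_control = frechet_mean_l2(control)

delta = mean_treated - mean_control                  # pointwise difference Delta(t)
date = l2_distance(mean_treated, mean_control, t)    # dATE under the L2 metric
```

Because the treated curves are a constant shift of the control curves, the pointwise difference is 1 everywhere and the $L^2$ distance over $[0, 1]$ is also 1. Other metrics, such as Fisher–Rao, require an iterative mean computation instead of a simple average.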

RKHS and Operator-Valued Kernel Estimation

Kernel ridge regression is adapted to the functional outcome space through operator-valued kernels:

For discretized or vector-valued outcomes,

$\hat{\varphi}(x) = \frac{1}{n} \sum_{i=1}^n \mathbf{K}_{(x, \bm{v}_i) X} (\mathbf{K}_{XX} + \lambda I_{nT})^{-1} \operatorname{vec}(\bm{Y})$

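where $\mathbf{K}_{XX}$ encodes similarities over treatment and covariates among the $n$ training individuals, $\mathbf{K}_{(x, \bm{v}_i) X}$ evaluates the kernel between the query treatment $x$ paired with the covariates $\bm{v}_i$ of individual $i$ and the training inputs, $\lambda$ is the ridge regularization parameter, and $\bm{Y}$ is stacked over all individuals and $T$ time points (Raykov et al., 6 Mar 2025).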
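The estimator for discretized outcomes can be sketched in a few lines. The example assumes a separable kernel of the form $\mathbf{K} \otimes I_T$ (a scalar Gaussian kernel on treatment and covariates, with the identity across time points), a regularizer that is not rescaled by $n$, and simulated data. None of these choices is claimed to match the cited paper. Under the separable assumption the stacked $nT \times nT$ system reduces to an $n \times n$ solve, which the code verifies numerically.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(1)
n, T, lam = 40, 5, 0.1
x = rng.integers(0, 2, size=n).astype(float)     # binary treatment
v = rng.normal(size=(n, 2))                      # covariates
inputs = np.column_stack([x, v])
grid = np.linspace(0.0, 1.0, T)
Y = x[:, None] * grid[None, :] + v[:, :1] + 0.1 * rng.normal(size=(n, T))

K = rbf(inputs, inputs)

# Stacked form: (K kron I_T + lam * I_nT)^{-1} vec(Y), with row-major vec.
alpha_vec = np.linalg.solve(np.kron(K, np.eye(T)) + lam * np.eye(n * T), Y.reshape(-1))
alpha = alpha_vec.reshape(n, T)

# Equivalent n x n solve, valid only for the separable kernel assumed here.
alpha_direct = np.linalg.solve(K + lam * np.eye(n), Y)

def phi_hat(x_new):
    """Average the fitted outcome curve over the observed covariates v_i."""
    query = np.column_stack([np.full(n, x_new), v])
    return (rbf(query, inputs) @ alpha).mean(axis=0)

effect_curve = phi_hat(1.0) - phi_hat(0.0)       # estimated effect at each time point
```

A non-separable operator-valued kernel would couple the time points, in which case the stacked system has to be solved directly or approximated.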

For true functional outcomes, operator-valued kernels map input pairs to bounded operators on the outcome Hilbert space, preserving temporal and spatial correlations across the outcome dimension.

Alignment and Registration

Functional data often display phase variability, meaning that features such as peaks occur at different times across curves. To address this, the square-root slope function (SRSF) transformation is applied, and iterative algorithms align (register) both covariates and outcomes prior to causal effect estimation, so that estimates reflect causal signal rather than temporal misalignment.
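The SRSF of a curve $f$ is commonly defined as $q(t) = \operatorname{sign}(f'(t)) \sqrt{|f'(t)|}$. The sketch below computes only this transform on a grid, using finite differences for the derivative. The iterative alignment step, which searches over time warpings, is omitted, and the function name and test curve are illustrative assumptions.

```python
import numpy as np

def srsf(f, t):
    """Square-root slope function q(t) = sign(f'(t)) * sqrt(|f'(t)|)."""
    df = np.gradient(f, t)                   # finite-difference derivative on grid t
    return np.sign(df) * np.sqrt(np.abs(df))

t = np.linspace(0.0, 1.0, 201)
f = t ** 2                                   # f'(t) = 2t, so q(t) = sqrt(2t)
q = srsf(f, t)
```

Registration methods typically compare curves through the $L^2$ distance between their SRSFs, because that distance is unchanged when both curves are warped by the same time transformation.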

Theoretical Guarantees

Under standard conditions (uniqueness of Fréchet mean, Lipschitz continuity), estimators for dATE and pointwise effects are shown to be consistent. Asymptotic normality is established for finite-dimensional projections, enabling valid inference, and can be extended via functional central limit theorems to fully infinite-dimensional settings (Raykov et al., 6 Mar 2025).

4. Generalizations and Connections

Fuzzy Cognitive Maps (editor's term): In the literature, the abbreviation FCM also often refers to fuzzy cognitive maps, which are cyclic, signed, and fuzzy-weighted directed graphs used for qualitative and dynamical causal modeling (Osoba et al., 2019, Tyrovolas et al., 2024, Panda et al., 2024). These models, while not typically cast in probabilistic or potential outcomes frameworks, leverage similar functional dependencies and lend themselves to computation of causal paths, phantom node inference, and mixture modeling.
Functional Bayesian Networks: By constructing directed acyclic graphs whose nodes are functional random objects (e.g., elements of $L^2$ ), and employing structural models for basis coefficients (with non-Gaussian errors), causal networks can be learned from noisy, high-dimensional functional data. Non-Gaussianity is crucial for identifiability in these models (Zhou et al., 2022, Roy et al., 2023).
Abstraction and Consistency: Aligning graphical and functional causal abstractions (e.g., via cluster DAGs and α-abstractions), and formally transferring identifiability and consistency results across these representations, is central to comparing and learning FCMs at multiple granularities (Schooltink et al., 2024).

5. Empirical and Applied Contexts

Biomedical Monitoring: Functional causal estimators have been applied to biomedical monitoring scenarios (e.g., Parkinson's disease), where functional outcomes exhibit complex dynamics (e.g., tremor probability or gait energy curves). By leveraging operator-valued kernels and registration, these frameworks allow robust, interpretable inference on dynamic treatment effects, outperforming scalar or vector-based methods in both bias and variance (Raykov et al., 6 Mar 2025).
Neural Connectomics: Statistical FCM frameworks for neural connectomics employ directed Markov properties and functional models to distinguish causal from associative connectivity, facilitating simulation of interventions such as neuron ablation (Biswas et al., 2021).
Policy and Social Systems: Fuzzy cognitive maps and their efficient causal effect calculation algorithms (e.g., TCEC-FCM) have enabled the analysis of causal mechanisms and feedback in policy, economics, and large-scale social systems, accommodating cyclic dependencies and qualitative knowledge integration (Osoba et al., 2019, Tyrovolas et al., 2024, Panda et al., 2024).

6. Methodological Extensions and Future Directions

Kernel-Based Methods for Functional Data: The growth of operator-valued and higher-order kernel methods enables nonparametric inference of functional causal effects, accommodating high-dimensionality and nonlinearities not tractable in classical frameworks (Raykov et al., 6 Mar 2025).
Bayesian Foundation Models: Foundation models trained as prior-data fitted transformers enable in-context Bayesian causal inference across varied settings (e.g., back-door, front-door, instrumental variable). By amortizing inference over priors on structural causal models (SCMs), they lower the barrier to applying causal models and quantifying uncertainty (Ma et al., 12 Jun 2025).
Nonparametric and Robust Testing: Advances in federated causal discovery utilize surrogate variables to model heterogeneity and employ nonparametric tests robust to arbitrary functional forms, accommodating the decentralized and complex nature of modern data repositories (Li et al., 2024).
Causal Discovery via Observational Signatures: The shift to distributional properties such as QPE and Fisher information allows causal direction and order identification from observational data alone, relaxing the mechanistic assumptions of traditional FCMs and extending the applicability of causal inference in complex systems (Chen et al., 16 Sep 2025).

7. Tables

Summary of Estimation Techniques for Functional Causal Effects

Method	Key Ingredient	Functional Data Support
Fréchet Mean Estimator	Metric mean in outcome space	Yes; infinite-dimensional outcomes
RKHS Kernel Ridge Regression	Scalar or vector-valued RKHS kernels	High-dimensional discretization of functions
Operator-Valued Kernel Methods	Kernel operator on Hilbert space	Infinite-dimensional curves, complex alignment
Registration (SRSF) Algorithms	Alignment of curves	Robust to phase and timing variation
Bayesian Foundation Models	Transformer pre-trained on SCM priors	Indirect; built for tabular data, extendable to functional data

These methodologies compose a flexible and theoretically grounded toolkit for functional causal inference across finite, infinite-dimensional, nonlinear, and dynamically evolving domains. The field is characterized by ongoing extensions in identifiability, scalable algorithmics, and the accommodation of real-world data complexities.