Centralized Gaussian Linear SCMs
- Centralized Gaussian Linear SCMs (CGL-SCMs) are linear Gaussian causal models in which all exogenous variables are standardized to zero mean and unit variance, reducing parameter complexity.
- They maintain full expressivity and observational equivalence to standard models, allowing for accurate identification and estimation of causal effects using graphical criteria.
- An EM-based algorithm is employed for parameter learning, achieving high-fidelity causal effect estimation from finite-sample data through closed-form interventional computations.
Centralized Gaussian Linear Structural Causal Models (CGL-SCMs) are a subclass of Gaussian Linear Structural Causal Models in which all exogenous variables (i.e., unobserved confounders and noises) are standardized to have zero mean and unit variance. This centralization eliminates the scale and location indeterminacy inherent in standard Gaussian Linear SCMs (GL-SCMs) by reducing the parameterization to a minimal yet fully expressive form. CGL-SCMs retain full expressivity with respect to observational and identifiable interventional distributions, enabling efficient parameter learning and causal effect estimation from finite samples using a specialized expectation–maximization (EM) procedure (Maiti et al., 8 Jan 2026).
1. Formal Specification and Expressivity
A Gaussian Linear SCM (GL-SCM) is specified by confounders $\mathbf{U} = (U_1, \dots, U_p)$, jointly normal with diagonal covariance; independent normal noise terms $\varepsilon_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$; and endogenous variables $\mathbf{V} = (V_1, \dots, V_n)$, each of which evolves via

$$V_i \;=\; \sum_{j:\, V_j \to V_i} B_{ji}\, V_j \;+\; \sum_{k:\, U_k \to V_i} C_{ki}\, U_k \;+\; \varepsilon_i.$$

Edges $V_j \to V_i$ and $U_k \to V_i$ are present whenever $B_{ji} \neq 0$ and $C_{ki} \neq 0$, respectively.

A CGL-SCM is the special case where all exogenous variables are standardized, $U_k \sim \mathcal{N}(0, 1)$ and $\varepsilon_i \sim \mathcal{N}(0, 1)$, with endogenous variable structure

$$V_i \;=\; b_i \;+\; \sum_{j:\, V_j \to V_i} B_{ji}\, V_j \;+\; \sum_{k:\, U_k \to V_i} C_{ki}\, U_k \;+\; \varepsilon_i,$$

where the intercepts $b_i$ take over the location role of the exogenous means. Centralization removes the means and variances of $\mathbf{U}$ and $\boldsymbol{\varepsilon}$, yielding a lower-dimensional parameter space.
Expressivity Theorem: For any GL-SCM $M$ with observed distribution $P_M(\mathbf{V})$, there exists a CGL-SCM $M'$ with the same graph $G$ such that $P_{M'}(\mathbf{V}) = P_M(\mathbf{V})$. Thus, CGL-SCMs and GL-SCMs are observationally indistinguishable and equally expressive in representing Gaussian-linear observational laws (Maiti et al., 8 Jan 2026).
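For concreteness, a minimal sketch of sampling from a CGL-SCM under the formulation above; the graph, weights, and helper name are illustrative, not from the paper:

```python
import numpy as np

def sample_cgl_scm(B, C, b, n_samples, seed=None):
    """Sample V = B^T V + C^T U + b + eps with standardized U ~ N(0, I_p)
    and eps ~ N(0, I_n), solving the linear system once."""
    rng = np.random.default_rng(seed)
    n, p = B.shape[0], C.shape[0]
    A = np.linalg.inv(np.eye(n) - B)           # = I + B + ... + B^L for a DAG
    U = rng.standard_normal((n_samples, p))    # standardized confounders
    eps = rng.standard_normal((n_samples, n))  # standardized noises
    # Row-vector form of V = A^T (C^T U + b + eps):
    V = (U @ C + b + eps) @ A
    return V, U

# Illustrative 3-node chain X -> Z -> Y with one confounder U_1 on X and Y.
B = np.array([[0.0, 0.8, 0.0],    # X -> Z (weight 0.8)
              [0.0, 0.0, 0.5],    # Z -> Y (weight 0.5)
              [0.0, 0.0, 0.0]])
C = np.array([[0.7, 0.0, 0.3]])   # U_1 -> X, U_1 -> Y
b = np.array([0.0, 1.0, -0.5])    # intercepts
V, U = sample_cgl_scm(B, C, b, n_samples=10_000, seed=0)
```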
2. Identifiability of Causal Effects
A query (e.g., $P(Y \mid do(X = x))$) is identifiable in a linear SCM with known graph $G$ if it can be expressed uniquely in terms of the observational distribution $P(\mathbf{V})$. Standard identification procedures such as Pearl's do-calculus and linear criteria (including instrumental sets and the graphical criteria of Brito–Pearl and Tian) extend directly to CGL-SCMs, since these depend only on the graph topology and Gaussianity.
Identification Theorem: For a GL-SCM $M$ and corresponding CGL-SCM $M'$ with $P_M(\mathbf{V}) = P_{M'}(\mathbf{V})$, every identifiable query satisfies $P_M(Y \mid do(X = x)) = P_{M'}(Y \mid do(X = x))$. This permits working entirely in the lower-dimensional, centralized parameterization without loss for identifiable causal effect estimation (Maiti et al., 8 Jan 2026).
An illustrative example: in the simple chain $X \to Z \to Y$ (no confounders), the CGL-SCM yields $Z = b_Z + b_{XZ} X + \varepsilon_Z$ and $Y = b_Y + b_{ZY} Z + \varepsilon_Y$, and the total effect of $X$ on $Y$ is the product of the edge weights, $B_{X \to Y} = b_{XZ}\, b_{ZY}$.
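A quick numeric sanity check of this product-of-weights identity (a minimal sketch with hypothetical edge weights, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
b_xz, b_zy = 0.8, 0.5                      # hypothetical edge weights
n = 100_000

x = rng.standard_normal(n)                 # X has no parents: X = eps_X
z = b_xz * x + rng.standard_normal(n)      # Z = b_XZ X + eps_Z
y = b_zy * z + rng.standard_normal(n)      # Y = b_ZY Z + eps_Y

# Without confounding, the OLS slope of Y on X equals the causal effect.
slope = np.cov(x, y)[0, 1] / x.var()
print(round(slope, 2), b_xz * b_zy)        # both ~ 0.40
```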
3. EM-Based Parameter Learning Algorithm
To estimate model parameters from data, the CGL-SCM admits a vectorized formulation. Let $\mathbf{V} = (V_1, \dots, V_n)^\top$, $\mathbf{U} = (U_1, \dots, U_p)^\top$, $B$ the weighted adjacency matrix of $G$ (with $B_{ij} \neq 0$ iff $V_i \to V_j$), and $L$ the length of the longest directed path in $G$. Define

$$A \;=\; \sum_{l=0}^{L} B^{l} \;=\; (I - B)^{-1},$$

with $A_{ij}$ the total sum of path weights from $V_i$ to $V_j$, $C$ the matrix of edge weights $U_k \to V_i$, and $\mathbf{b}$ the intercepts. The stacked equations are

$$\mathbf{V} \;=\; B^\top \mathbf{V} + C^\top \mathbf{U} + \mathbf{b} + \boldsymbol{\varepsilon}, \qquad\text{equivalently}\qquad \mathbf{V} \;=\; A^\top\!\left(C^\top \mathbf{U} + \mathbf{b} + \boldsymbol{\varepsilon}\right).$$
The joint $(\mathbf{V}, \mathbf{U})$ is jointly Gaussian with explicitly computable mean and covariance: $\mu_V = A^\top \mathbf{b}$, $\Sigma_{VV} = A^\top (C^\top C + I)\, A$, and $\Sigma_{VU} = A^\top C^\top$.
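These blocks can be assembled in a few lines; a minimal sketch, assuming the shapes above ($B$ is $n \times n$, $C$ is $p \times n$, $\mathbf{b}$ has length $n$):

```python
import numpy as np

def joint_moments(B, C, b):
    """Closed-form mean/covariance blocks of the jointly Gaussian (V, U)."""
    n = B.shape[0]
    A = np.linalg.inv(np.eye(n) - B)              # path-weight matrix
    mu_V = A.T @ b                                # E[V]
    Sigma_VV = A.T @ (C.T @ C + np.eye(n)) @ A    # Cov(V); Cov(U) = Cov(eps) = I
    Sigma_VU = A.T @ C.T                          # Cov(V, U)
    return mu_V, Sigma_VV, Sigma_VU
```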
EM Algorithm Steps:
- E-step: For each data sample $v^{(m)}$, compute the posterior moments of the confounders, $\mathbb{E}[\mathbf{U} \mid \mathbf{V} = v^{(m)}]$ and $\mathrm{Cov}(\mathbf{U} \mid \mathbf{V} = v^{(m)})$, via standard Gaussian conditioning.
- M-step: Maximize the expected complete-data log-likelihood
$$Q(\theta) \;=\; \sum_{m=1}^{M} \mathbb{E}_{\mathbf{U} \mid v^{(m)}}\!\left[\log p_\theta\big(v^{(m)}, \mathbf{U}\big)\right],$$
with closed-form update for $\mathbf{b}$:
$$\mathbf{b} \;\leftarrow\; \frac{1}{M} \sum_{m=1}^{M} \left( (I - B^\top)\, v^{(m)} \;-\; C^\top\, \mathbb{E}[\mathbf{U} \mid v^{(m)}] \right).$$
Updates for $B$ and $C$ are performed by masked gradient ascent, preserving the zero-pattern dictated by the graph $G$ (both steps are sketched in code below).
EM guarantees a non-decreasing observed-data likelihood at each iteration. Regularization (e.g., $\ell_2$-penalties) and early stopping are advisable for small sample sizes $M$ to prevent overfitting (Maiti et al., 8 Jan 2026).
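A compact sketch of the E-step and the closed-form intercept update, assuming the moment formulas above (the masked gradient updates for $B$ and $C$ are omitted; all names are illustrative):

```python
import numpy as np

def e_step(V, B, C, b):
    """Posterior moments of U for each observed sample (each row of V)."""
    n, p = B.shape[0], C.shape[0]
    A = np.linalg.inv(np.eye(n) - B)
    mu_V = A.T @ b
    Sigma_VV = A.T @ (C.T @ C + np.eye(n)) @ A
    Sigma_UV = C @ A                            # Cov(U, V)
    K = Sigma_UV @ np.linalg.inv(Sigma_VV)      # Gaussian conditioning gain
    EU = (V - mu_V) @ K.T                       # rows: E[U | v^(m)]
    CovU = np.eye(p) - K @ Sigma_UV.T           # Cov(U | v), same for all samples
    return EU, CovU

def m_step_intercepts(V, B, C, EU):
    """Closed-form update: b <- mean_m [ (I - B^T) v^(m) - C^T E[U | v^(m)] ]."""
    n = B.shape[0]
    return (V @ (np.eye(n) - B) - EU @ C).mean(axis=0)
```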
4. Causal Inference and Effect Estimation
After model fitting, causal queries are evaluated by modifying structural equations and computing the resulting Gaussian distribution, as dictated by do-calculus.
Do-Interventions: For an intervention $do(X = x)$, incoming edges to $X$ are removed (i.e., the corresponding entries of $B$ and $C$ are zeroed), and $X$ is set to the constant $x$. The remaining variables are solved as linear functions of $x$, $\mathbf{U}$, and $\boldsymbol{\varepsilon}$. The post-interventional distribution remains multivariate normal, with parameters derived from the modified $B$, $C$, and $\mathbf{b}$.
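A minimal sketch of this mutilate-and-solve step in the matrix form above (the node index `i` and value `x` are illustrative inputs):

```python
import numpy as np

def do_intervention(B, C, b, i, x):
    """Mean/covariance of V under the hard intervention do(V_i = x)."""
    n = B.shape[0]
    B_do, C_do, b_do = B.copy(), C.copy(), b.copy()
    B_do[:, i] = 0.0          # remove incoming edges V_j -> V_i
    C_do[:, i] = 0.0          # remove incoming edges U_k -> V_i
    b_do[i] = x               # pin V_i at the constant x
    A = np.linalg.inv(np.eye(n) - B_do)
    noise_cov = np.eye(n)
    noise_cov[i, i] = 0.0     # the intervened node carries no noise
    mu = A.T @ b_do           # post-interventional mean (mu[i] == x)
    Sigma = A.T @ (C_do.T @ C_do + noise_cov) @ A
    return mu, Sigma
```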
Example (Linear Chain):

| Chain Structure | Total Effect $B_{X \to Y}$ | $P(Y \mid do(X = x))$ |
|---|---|---|
| $X \to Z \to Y$ | $b_{XZ}\, b_{ZY}$ | $\mathcal{N}(\mu_Y + B_{X\to Y}\, x,\; [\text{noise variance of } Y])$ |
This pipeline applies to any graph-identifiable query, including counterfactuals, due to the closed-form propagation properties of Gaussian-linear models (Maiti et al., 8 Jan 2026).
5. Empirical Evaluation and Application
Synthetic validation was conducted using the "frontdoor" and "napkin" benchmark graphs:
- Frontdoor graph: $X \to Z \to Y$ with $X \leftarrow U_1 \to Y$ (three observed nodes with one unobserved confounder $U_1$)
- Napkin graph: $W \to Z \to X \to Y$ with $W \leftrightarrow X$ and $W \leftrightarrow Y$ (four observed nodes, two latent confounders; a matrix encoding of both graphs is sketched after this list)
- In both cases, samples were drawn from known CGL-SCMs.
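One way these benchmark graphs could be encoded in the matrix parameterization of Section 3 (node ordering and weights here are illustrative assumptions, not the paper's values):

```python
import numpy as np

# Frontdoor graph, node order (X, Z, Y): X -> Z -> Y, confounder U_1 on X and Y.
B_fd = np.array([[0.0, 0.9, 0.0],
                 [0.0, 0.0, 0.6],
                 [0.0, 0.0, 0.0]])
C_fd = np.array([[0.8, 0.0, 0.4]])        # U_1 -> X, U_1 -> Y

# Napkin graph, node order (W, Z, X, Y): W -> Z -> X -> Y,
# U_1 confounds (W, X) and U_2 confounds (W, Y).
B_nk = np.zeros((4, 4))
B_nk[0, 1], B_nk[1, 2], B_nk[2, 3] = 0.7, 0.5, 0.9
C_nk = np.array([[0.6, 0.0, 0.5, 0.0],    # U_1 -> W, U_1 -> X
                 [0.4, 0.0, 0.0, 0.7]])   # U_2 -> W, U_2 -> Y
```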
After learning with the EM algorithm, the estimated causal effects closely matched ground truth:
- Frontdoor graph: the estimated interventional distribution $P(Y \mid do(X = x))$ closely recovered the true one.
- Napkin graph: the estimated $P(Y \mid do(X = x))$ likewise matched the true interventional distribution.
Mean and variance estimates were consistently within a few percent of their true values, demonstrating high-fidelity recovery of causal effects from finite-sample observational data using the CGL-SCM EM algorithm (Maiti et al., 8 Jan 2026).
6. Parameter Reduction and Practical Advantages
CGL-SCMs achieve parameter reduction by standardizing all exogenous variables (confounders and noises) to zero mean and unit variance. This eliminates the latent scaling and location degrees of freedom in general GL-SCMs: the means and variances of $\mathbf{U}$ and $\boldsymbol{\varepsilon}$ are removed from the model specification. As a result, the number of free parameters, particularly those associated with unobserved confounders, is drastically reduced. Despite this, the class retains full expressivity over both observational and graph-identifiable interventional distributions. This simplification is particularly advantageous for finite-sample learning, where overparameterization often leads to infeasible or unstable estimation in the presence of unobserved confounding (Maiti et al., 8 Jan 2026).
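As a rough tally under the parameterization reconstructed in Section 1 (an illustrative assumption of this summary, not an exact count from the paper), writing $\|B\|_0$ and $\|C\|_0$ for the numbers of directed and confounder edges:

$$\underbrace{\|B\|_0 + \|C\|_0 + 2n + 2p}_{\text{GL-SCM: edges, exogenous means and variances}} \;\longrightarrow\; \underbrace{\|B\|_0 + \|C\|_0 + n}_{\text{CGL-SCM: edges, intercepts } \mathbf{b}},$$

a net reduction of $n + 2p$ free parameters, all tied to exogenous terms.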
The EM-based learning algorithm accommodates this streamlined parameterization and enables efficient estimation of edge weights and intercepts, ensuring that causal queries remain representable and computable in closed form after training.