Causal Linearization via Perturbation Responses (CLIPR)

Updated 23 January 2026
  • The paper introduces CLIPR, a framework that uses linear response theory and surrogate observables to extract direct causal links in high-dimensional systems.
  • It employs matrix linearization and ensemble perturbation trials to estimate susceptibilities with identifiability ensured by full rank conditions.
  • The methodology generalizes to nonlinear and chaotic regimes, finding applications in genomics, climatology, neuroscience, and even quantum field theory.

Causal Linearization via Perturbation Responses (CLIPR) is a methodological framework for quantifying and predicting the causal impact of perturbations in dynamical systems, especially where the true forcing is inaccessible and only surrogate observables can be measured. Its formalism encompasses linear response theory, surrogate causality, matrix linearization under drift assumptions, and advanced algorithmic expansions in nonlinear and high-dimensional contexts. CLIPR delivers rigorous tools for extracting direct causal links, ranking the predictive ability of observables, and reconstructing target system responses from partial or indirect data.

1. Linear Response Foundations and Surrogate Causality

The central tenet of CLIPR is rooted in linear response theory, which considers the evolution of a system $\dot{x} = F(x)$ perturbed by a small forcing $\epsilon \cdot G(x, t)$. The linear response of an observable $Y$ at time $t$ is given by the susceptibility function

$$\chi_{XY}(t) = \left. \frac{d}{d\epsilon} \langle Y(t) \rangle_\epsilon \right|_{\epsilon=0}$$

where $X$ indexes the spatiotemporal structure of the applied perturbation, and $\langle \cdot \rangle_\epsilon$ denotes expectation under the perturbed measure. The change in the mean of $Y$ for a time-modulated forcing $f_X(s)$ is a causal convolution:

$$\Delta \langle Y(t) \rangle = \int_0^t \chi_{XY}(t-s)\, f_X(s)\, ds$$

This susceptibility can be estimated experimentally via ensemble averages over perturbation trajectories, enabling practical access in high-dimensional and nonlinear systems (Tomasini et al., 2020).
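The ensemble estimate and the causal convolution can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two-variable linear SDE, the noise level, the ensemble size, and the choice of an impulsive kick along the first coordinate (implemented as a shift of the initial state) with the second coordinate as the observable $Y$. For this toy system the exact impulse response is $e^{-t} - e^{-2t}$, which gives something to compare against.

```python
import numpy as np

def ensemble_mean(A, sigma, x0, n_steps, dt, n_ens, rng):
    """Ensemble mean of dx = A x dt + sigma dW (Euler-Maruyama), all members starting at x0."""
    x = np.tile(x0, (n_ens, 1))
    means = np.empty((n_steps, x0.size))
    for k in range(n_steps):
        means[k] = x.mean(axis=0)
        noise = rng.standard_normal(x.shape)
        x = x + (x @ A.T) * dt + sigma * np.sqrt(dt) * noise
    return means

rng = np.random.default_rng(0)
A = np.array([[-1.0, 0.0], [1.0, -2.0]])   # hypothetical test system, not from the cited papers
sigma, dt, n_steps, n_ens, eps = 0.02, 0.01, 500, 10_000, 0.2

# An impulsive forcing eps * delta(t) along x1 is equivalent to shifting the initial state.
ref = ensemble_mean(A, sigma, np.zeros(2), n_steps, dt, n_ens, rng)
pert = ensemble_mean(A, sigma, np.array([eps, 0.0]), n_steps, dt, n_ens, rng)

t = np.arange(n_steps) * dt
chi = (pert[:, 1] - ref[:, 1]) / eps        # estimated susceptibility of Y = x2
chi_exact = np.exp(-t) - np.exp(-2 * t)     # analytic impulse response of this toy system

# Response to a time-modulated forcing f(s): discretized causal convolution.
f = np.ones(n_steps)                        # unit step forcing
delta_Y = dt * np.convolve(chi, f)[:n_steps]
```

Because the toy system is linear, the size of `eps` does not affect the estimate; in a nonlinear system the kick would have to be small enough for the linear regime to hold.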

Surrogate causality emerges when the true external forcing is unobserved or ill-defined, necessitating substitution of $\Delta \langle X \rangle$ as a proxy input for reconstructing $\Delta \langle Y \rangle$. Here $X$ denotes a measurable surrogate observable rather than a forcing label. The causal kernel $H_{Y,X}(t)$ is the inverse Fourier transform of the ratio of susceptibilities to a common forcing $G$ in frequency space, $J_{Y,X}(\omega) = \chi_{Y,G}(\omega) / \chi_{X,G}(\omega)$, and is used to reconstruct

$$\Delta \langle Y \rangle(t) = \int_{-\infty}^{\infty} H_{Y,X}(t-s)\, \Delta \langle X \rangle(s)\, ds$$

with strict causality imposed for prognostic validity ($H_{Y,X}(t) = 0$ for $t < 0$).
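As a rough numerical sketch of the kernel construction, the susceptibility ratio can be formed with an FFT and transformed back to the time domain. The two susceptibilities below are assumed analytic forms ($\chi_{X,G} = e^{-t}$, $\chi_{Y,G} = e^{-t} - e^{-2t}$) chosen so that the exact kernel is $H_{Y,X}(t) = e^{-2t}$ for $t \ge 0$. The function name and the optional `reg` damping term are illustrative choices, not part of any published implementation.

```python
import numpy as np

def surrogate_kernel(chi_y, chi_x, dt, reg=0.0):
    """Return J(omega) and H(t); H is in FFT ordering (second half of the array holds t < 0)."""
    X = np.fft.fft(chi_x)
    Y = np.fft.fft(chi_y)
    # Equals Y / X when reg == 0; reg > 0 damps frequencies where |X| is tiny (an assumption).
    J = Y * np.conj(X) / (np.abs(X) ** 2 + reg)
    H = np.fft.ifft(J).real / dt            # divide by dt to turn the discrete sum into an integral
    return J, H

dt, n = 0.01, 2000                          # window long enough for both responses to decay
t = np.arange(n) * dt
chi_x = np.exp(-t)                          # assumed susceptibility of the surrogate X
chi_y = np.exp(-t) - np.exp(-2 * t)         # assumed susceptibility of the target Y

J, H = surrogate_kernel(chi_y, chi_x, dt)
causal, noncausal = H[: n // 2], H[n // 2 :]
```

On this grid the causal half tracks $e^{-2t}$ for $t > 0$ up to an error of order `dt`, and the non-causal half is numerically zero, so this particular surrogate satisfies the strict causality condition.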

2. Matrix Linearization and Identifiability

In models such as Latent Causal Diffusion (LCD) for single-cell gene expression, CLIPR is applied to SDEs with perturbative shifts in the drift term:

$$d\mathbf x_t = f_q(\mathbf x_t)\,dt + \sigma\,d\mathbf W_t$$

Assuming a linear drift $f(\mathbf x) = A \mathbf x + b$ and perturbations acting as additive shifts $c_q$, so that $f_q(\mathbf x) = A \mathbf x + b + c_q$, the stationary mean under perturbation $q$ is $\mu_q = -A^{-1}(b + c_q)$. The CLIPR estimator for the direct causal effect matrix $A$ uses measurements of initial drift responses $F^0 = [b + c_{q_i}]$ and limit equilibria $F^\infty = [\mu_{q_i}]$ for multiple perturbations. Stacking the stationarity condition $A \mu_{q_i} = -(b + c_{q_i})$ column by column gives $A F^\infty = -F^0$, which is solved by

$$\widehat{A} = -F^0 (F^\infty)^+$$

where $^+$ denotes the Moore–Penrose pseudoinverse, optionally regularized for stability (Lorch et al., 20 Jan 2026). Identifiability is guaranteed if the perturbation shifts $c_{q_i}$ span the state space, ensuring that $F^0$, and hence $F^\infty = -A^{-1} F^0$ for invertible $A$, has full row rank. This framework yields sparse, interpretable causal matrices robust to measurement noise and capable of generalization to unseen perturbations.
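A small synthetic check makes the identifiability condition concrete. The stationary mean $\mu_q = -A^{-1}(b + c_q)$ implies $A \mu_q = -(b + c_q)$, so stacking columns gives $A F^\infty = -F^0$, and the sketch below solves that system for $A$ through the pseudoinverse of $F^\infty$. The dimensions, the random test matrices, the function name `estimate_A`, and the ridge variant controlled by `lam` are all assumptions made for illustration.

```python
import numpy as np

def estimate_A(F0, F_inf, lam=0.0):
    """Solve A @ F_inf = -F0 for A; lam > 0 adds ridge regularization (illustrative choice)."""
    if lam == 0.0:
        return -F0 @ np.linalg.pinv(F_inf)
    d = F_inf.shape[0]
    gram = F_inf @ F_inf.T + lam * np.eye(d)
    # A = -F0 F_inf^T (F_inf F_inf^T + lam I)^{-1}, computed via a linear solve on the transpose.
    return -np.linalg.solve(gram, F_inf @ F0.T).T

rng = np.random.default_rng(0)
d, m = 5, 8                                    # state dimension, number of perturbations
A_true = -2.0 * np.eye(d) + 0.3 * rng.standard_normal((d, d))   # hypothetical stable drift matrix
b = rng.standard_normal(d)
C = rng.standard_normal((d, m))                # columns are the shifts c_q, spanning the state space

F0 = b[:, None] + C                            # initial drift responses b + c_q
F_inf = -np.linalg.solve(A_true, F0)           # stationary means mu_q = -A^{-1}(b + c_q)

A_hat = estimate_A(F0, F_inf)                  # noise-free data: recovers A_true
A_ridge = estimate_A(F0, F_inf, lam=1e-3)      # slightly shrunk estimate
```

With noise-free data and $m \ge d$ generic shifts, `A_hat` reproduces `A_true` to numerical precision. With fewer than $d$ independent shifts the system is underdetermined and the pseudoinverse returns only the minimum-norm solution.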

3. Algorithmic Protocols and Surrogate Model Construction

Implementing CLIPR in both stochastic nonlinear systems and high-dimensional experimental datasets involves the following procedural steps:

  1. Test Forcings Selection: Design local and/or global forcings $G_j(x)$, typically as spatially localized or integrative patterns.
  2. Susceptibility Estimation: Generate ensemble trials (using impulses or pseudo-perturbations) to empirically estimate $\chi_{X_l, G_j}(\omega)$ for candidate predictors and targets.
  3. Predictor Selection: Define a set of surrogate observables $X_l$ (local variables, aggregated fields, latent state features).
  4. Kernel Construction: Compute $J_{Y,l}(\omega)$, apply the inverse Fourier transform to obtain $H_{Y,l}(t)$, and extract the causal kernels $H^c_{Y,l}(t)$.
  5. Causality Ranking: Evaluate the Predictability Index (PI) for each kernel or subset:

$$R = \frac{\lVert K^{nc} \rVert_1}{\lVert K^{c} \rVert_1 + |s|}$$

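PI quantifies non-causal leakage by comparing the $\ell_1$ norm of the non-causal kernel part $K^{nc}$ with that of the causal part $K^{c}$; smaller $R$ indicates higher surrogate prognostic utility.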

  6. Response Reconstruction: For any new forcing, predict the target response as

$$\Delta \langle Y \rangle(t) = \sum_l H^c_{Y,l} * \Delta \langle X_l \rangle$$

Optionally, sparsity-promoting regularization (e.g., $\ell_1$) can be used for optimal predictor subset selection (Tomasini et al., 2020).
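The ranking and reconstruction steps of the protocol can be sketched as follows, assuming kernels are stored in FFT ordering (first half of the array for $t \ge 0$, second half for $t < 0$). The text above does not define $s$, so it is treated here as an optional scalar defaulting to zero; that treatment, the function names, and the two test kernels are assumptions made purely for illustration.

```python
import numpy as np

def split_kernel(K):
    """Split an FFT-ordered kernel into its causal (t >= 0) and non-causal (t < 0) parts."""
    half = K.size // 2
    return K[:half], K[half:]

def predictability_index(K, dt, s=0.0):
    """R = ||K_nc||_1 / (||K_c||_1 + |s|); s is an assumed optional scalar."""
    K_c, K_nc = split_kernel(K)
    norm_c = np.sum(np.abs(K_c)) * dt
    norm_nc = np.sum(np.abs(K_nc)) * dt
    return norm_nc / (norm_c + abs(s))

def reconstruct(kernels, inputs, dt):
    """Sum over predictors l of the causal convolution of H^c_{Y,l} with Delta<X_l>."""
    n = inputs[0].size
    out = np.zeros(n)
    for K, dX in zip(kernels, inputs):
        K_c, _ = split_kernel(K)
        out += dt * np.convolve(K_c, dX)[:n]
    return out

dt, n = 0.01, 2000
idx = np.arange(n)
t = np.where(idx < n // 2, idx, idx - n) * dt            # FFT-ordered time axis
K_clean = np.where(t >= 0, np.exp(-2 * np.abs(t)), 0.0)  # purely causal test kernel
K_leaky = K_clean + np.where(t < 0, 0.1 * np.exp(-2 * np.abs(t)), 0.0)

R_clean = predictability_index(K_clean, dt)
R_leaky = predictability_index(K_leaky, dt)

t_pos = t[: n // 2]
dX = np.exp(-t_pos)                                      # assumed surrogate response
dY = reconstruct([K_clean], [dX], dt)                    # approximates exp(-t) - exp(-2t)
```

The purely causal kernel scores $R = 0$, while the kernel with a 10% non-causal tail scores $R \approx 0.1$, so ranking candidate predictors by $R$ places the causal one first.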

For nonlinear or chaotic regimes, CLIPR employs surrogate models—sparse regression (SINDy), nested NNs, or reservoir computing—and simulates virtual perturbations through fitted dynamics to estimate susceptibilities (Chibbaro et al., 9 Sep 2025). Direct perturbation experiments provide higher precision, but virtual approaches remain valid under stationarity and smallness of intervention.
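A virtual perturbation experiment can be sketched with a hand-written sparse regression. The code below fits a drift model with sequentially thresholded least squares, the regression step used in SINDy, over a quadratic polynomial library, then kicks the fitted model and reads off the response. The data-generating system, the library, the threshold of 0.3, and all function names are assumptions for illustration. The response is obtained by integrating the fitted drift deterministically, which matches the ensemble-mean response only when the fitted drift is linear, as it is here.

```python
import numpy as np

def library(x):
    """Polynomial features [1, x1, x2, x1^2, x1*x2, x2^2] for 2-D states."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

def stlsq(theta, dxdt, threshold, n_iter=5):
    """Sequentially thresholded least squares: zero small coefficients, refit the rest."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for j in range(dxdt.shape[1]):
            active = ~small[:, j]
            if active.any():
                xi[active, j] = np.linalg.lstsq(theta[:, active], dxdt[:, j], rcond=None)[0]
    return xi

def virtual_response(xi, kick, n_steps, dt):
    """Integrate the fitted drift from a kicked and an unkicked state; return their difference."""
    x_ref = np.zeros((1, 2))
    x_pert = x_ref + kick
    out = np.empty((n_steps, 2))
    for k in range(n_steps):
        out[k] = (x_pert - x_ref)[0]
        x_ref = x_ref + (library(x_ref) @ xi) * dt
        x_pert = x_pert + (library(x_pert) @ xi) * dt
    return out

# Synthetic unperturbed data from a hypothetical stochastic system (not from the cited papers).
rng = np.random.default_rng(1)
A_true = np.array([[-1.0, 0.0], [1.0, -2.0]])
dt, n_steps, n_traj, sigma = 0.01, 500, 1000, 1.0
x = rng.standard_normal((n_traj, 2))
states, derivs = [], []
for _ in range(n_steps):
    x_next = x + (x @ A_true.T) * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    states.append(x)
    derivs.append((x_next - x) / dt)        # finite-difference estimate of the drift plus noise
    x = x_next

xi = stlsq(library(np.concatenate(states)), np.concatenate(derivs), threshold=0.3)
A_fit = xi[1:3].T                           # linear block of the fitted drift

eps = 0.1
chi = virtual_response(xi, np.array([eps, 0.0]), n_steps, dt) / eps
t = np.arange(n_steps) * dt
chi_exact = np.exp(-t) - np.exp(-2 * t)     # true response of x2 to a kick in x1
```

The thresholding removes the constant and quadratic library terms, leaving a linear drift close to `A_true`, and the virtual response of the second variable follows the exact curve without any physical perturbation having been applied.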

4. Practical Performance and Empirical Findings

Empirical evaluation of CLIPR demonstrates:

  • Accurate causal edge recovery (AUROC ≈ 0.9) in simulated linear systems with sufficient perturbation diversity (Lorch et al., 20 Jan 2026).
  • Module clustering and pathway enrichment in Perturb-seq data, revealing gene–gene regulatory structure beyond differential expression analysis.
  • Superior disambiguation of direct versus indirect effects, with a substantial increase (lift ≈ 4–5× for top predicted links) in observed downstream differential expression upon perturbation.
  • Robustness to holding out unperturbed genes: CLIPR generalizes causal predictions even to targets not seen in the training set.
  • In spatially extended chaotic systems (Lorenz ’96), PI reveals the gradient of causal influence propagation; nearest-neighbor variables serve as the most predictive surrogates with negligible non-causality (R ≈ 0), while distant variables show rapid PI decay. Adding ensemble predictors (local plus global observables) recovers deeply non-local causal forecasting (Tomasini et al., 2020).

Computational cost and sample complexity scale favorably in linear regimes: $N \gtrsim 10d$ suffices for ridge estimation, and sparse regression/NN models are well-controlled in moderate dimensions (Chibbaro et al., 9 Sep 2025).

5. Advanced Generalizations and Theoretical Extensions

CLIPR extends to causal variational principles and fragmentation schemes encountered in quantum field theoretic and continuum limits. In causal fermion systems, perturbation theory proceeds by expanding universal measures under weight/diffeomorphism actions, diagonalizing fluctuations across degenerate subspaces, and reconstructing linearized jets via Green’s operators. Convex combinations ("fragmentation") of measures allow simultaneous tracking of subsystem means and fluctuations, enabling perturbative analysis of microscopic mixing and synchronization between fragments (Finster, 2017).

The algorithmic expansion is as follows:

  1. Choose a critical measure and jet space for the response operator $\Delta$.
  2. Allow fragmentation into $L$ subsystems: $\tilde{\rho} = \tfrac{1}{L} \sum_a (F_a)_*(f_a \rho)$.
  3. Expand local jets, split into mean and fluctuation components.
  4. Use Green’s operators to solve inhomogeneities iteratively.
  5. Reconstruct perturbed measures and explore consequences for gauge and gravity theories, as well as particle–field interactions.

6. Limitations, Validity Conditions, and Recommendations

CLIPR’s validity rests on several critical assumptions:

  • Stationarity and mixing of the underlying system; detrending and normalization are essential preprocessing steps.
  • Smallness of perturbation magnitude ($\epsilon \ll 1$) to ensure the linear response approximation.
  • Full rank of perturbation-induced responses for identifiability in matrix estimation.
  • Applicability of surrogate models is restricted to interpolation within the observed data domain; extrapolation risks bias unless the dynamics are globally stable.
  • In chaotic systems, long-time susceptibility estimation incurs exponential variance growth, limiting forecast horizons.
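One simple way to probe the smallness condition is to compare amplitude-normalized responses at $\epsilon$ and $2\epsilon$: in the linear regime they coincide, and their gap grows once nonlinear terms matter. The sketch below is only an illustrative diagnostic under assumed settings: a deterministic scalar test drift $-x - x^3$ with a fixed point at the origin, explicit Euler integration, and hypothetical function names.

```python
import numpy as np

def normalized_response(drift, eps, n_steps, dt):
    """Integrate dx/dt = drift(x) from x(0) = eps (explicit Euler) and return x(t) / eps."""
    x = eps
    out = np.empty(n_steps)
    for k in range(n_steps):
        out[k] = x / eps
        x = x + drift(x) * dt
    return out

def linearity_gap(drift, eps, n_steps=500, dt=0.01):
    """Largest difference between the normalized responses at eps and 2 * eps."""
    r1 = normalized_response(drift, eps, n_steps, dt)
    r2 = normalized_response(drift, 2 * eps, n_steps, dt)
    return np.max(np.abs(r1 - r2))

def drift(x):
    return -x - x**3          # hypothetical nonlinear test drift

gap_small = linearity_gap(drift, 0.01)   # well inside the linear regime
gap_large = linearity_gap(drift, 0.5)    # nonlinear saturation is visible
```

For this test drift the gap is of order $10^{-4}$ at $\epsilon = 0.01$ and close to $0.1$ at $\epsilon = 0.5$, which suggests choosing the perturbation size from where the gap starts to exceed the estimation noise.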

Regularization (ridge, $\ell_1$ sparsity) is recommended in high-dimensional parameter spaces to mitigate overfitting. Fragmentation analysis requires careful diagonalization on degenerate subspaces and error control beyond leading order.

7. Domain-Specific Applications

CLIPR finds broad utility in fields including:

  • Single-cell perturbation genomics: Extraction of gene–gene causal effect matrices, causal module identification, prediction of transcriptome-wide perturbation responses (Lorch et al., 20 Jan 2026).
  • Physical and climatological systems: Surrogate-based causal forecasting in spatially extended chaotic models; quantification of propagative signal causality with localized observables (Tomasini et al., 2020).
  • Neuroscience and temporal networks: Machine-learning surrogates for causal graph estimation in stochastic nonlinear integration, outperforming Granger causality in regimes with hidden common drivers or strong nonlinearity (Chibbaro et al., 9 Sep 2025).
  • Quantum field theory and variational principles: Formal expansions for linearized field equations, fragmentation-based synchronization, and continuum-limit correspondence with classical Dirac–Maxwell perturbation theory (Finster, 2017).

CLIPR thus constitutes a systematic protocol for leveraging partial observations and learned surrogates in the inference of causal dynamics, ranking predictors by their causal informativeness, and achieving reliable prediction and mechanistic insight in complex, high-dimensional systems.
