Extracting Governing Equations from Latent Dynamics via Multi-View Contrastive Learning

Published 11 Jun 2026 in cs.LG and q-bio.NC | (2606.13260v1)

Abstract: Identifying latent dynamical systems from noisy, high-dimensional measurements is a central problem at the intersection of representation learning, system identification, and scientific discovery. We present DYSCO, a multi-view temporal contrastive learning algorithm that jointly recovers latent trajectories and the governing dynamics from such observations, by leveraging multiple independent noisy views of the same underlying process to disentangle signal from noise. By parameterizing the dynamics in a structured functional basis, our framework further enables symbolic recovery of the governing equations within an affine gauge. We offer theoretical guarantees for strong identification up to an affine indeterminacy, extending prior identifiability results to the realistic setting of noisy nonlinear observations. Empirically, we demonstrate accurate recovery of both latent trajectories and flow fields across a diverse set of dynamical regimes (e.g., chaotic, oscillatory, and metastable) under both Gaussian and Poisson observation noise, the latter being particularly relevant for neural recordings.


Authors (2)

Summary

The paper introduces the DYSCO algorithm, which recovers latent dynamics and governing equations from noisy measurements using multi-view temporal contrastive learning.
It employs a structured functional basis and robust multi-view denoising to achieve theoretical identifiability up to an affine transformation, yielding high R² scores in diverse systems.
The approach extends system identification to realistic noisy scenarios and enables interpretable symbolic regression for scientific discovery in complex dynamical regimes.


Problem Formulation and Motivation

Identifying latent dynamical systems from noisy, high-dimensional measurements is a fundamental inverse problem in representation learning, system identification, and scientific modeling. Many observable systems, such as neural population activity or complex physical environments, exhibit a low-dimensional latent structure obscured by nonlinear and noisy observation channels. Traditional approaches to system identification either rely on direct access to the underlying states or assume noise-free observations, which is incompatible with most real-world data. The paper frames this challenge within Marr's tri-level framework, emphasizing the progression from implementation (raw activity) to algorithmic descriptions (governing dynamics).

The paper introduces DYSCO, a multi-view temporal contrastive learning algorithm that leverages multiple independent noisy views of the same process to jointly recover latent trajectories and governing equations. Parameterizing the dynamics in a structured functional basis enables symbolic recovery of the governing equations up to an affine indeterminacy. Crucially, the algorithm comes with theoretical identifiability guarantees under realistic noisy, nonlinear observations, extending prior results to more practical scenarios. Empirical evaluation spans diverse dynamical regimes under both Gaussian and Poisson noise, the latter being pertinent to neuroscience applications.

Figure 1: Graphical overview of the latent dynamical system, showing the unknown dynamics $f$, nonlinear observation channel $g$, and the joint learning of encoder $h$ and symbolic dynamics $\hat f$ via multi-view contrastive loss.

Model Architecture and Training Objective

DYSCO formulates the system identification task as follows: latent states $\bm{x}_t$ evolve via unknown nonlinear dynamics $f$ and produce observed high-dimensional measurements $\bm{y}_t^a$ through a nonlinear injective mixing function $g$ with additive noise. Multiple independent views $a$ allow learning an encoder $h$ and a dynamics model $\hat f$ represented in a chosen functional basis, typically all monomials up to a specified degree. This enables subsequent symbolic regression for explicit recovery of governing equations.
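As an illustration of what "represented in a functional basis of monomials" can look like, here is a minimal NumPy sketch. It is not the paper's implementation: the function names (`monomial_exponents`, `monomial_features`, `basis_dynamics`) and the coefficient layout are assumptions made for this example. It writes the Lorenz vector field, with its standard parameters, as a coefficient matrix over all monomials of degree at most 2.

```python
import itertools
import numpy as np

def monomial_exponents(dim, max_degree):
    """All exponent tuples (e_1, ..., e_dim) with total degree <= max_degree."""
    return [e for e in itertools.product(range(max_degree + 1), repeat=dim)
            if sum(e) <= max_degree]

def monomial_features(x, exponents):
    """Evaluate every monomial at each row of x; shape (n_samples, n_monomials)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return np.stack([np.prod(x ** np.array(e), axis=1) for e in exponents], axis=1)

def basis_dynamics(x, coeffs, exponents):
    """Hypothetical dynamics model: a linear combination of monomial features."""
    return monomial_features(x, exponents) @ coeffs

# Lorenz system (sigma=10, rho=28, beta=8/3) expressed in the degree-2 basis.
exponents = monomial_exponents(dim=3, max_degree=2)
index = {e: i for i, e in enumerate(exponents)}
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
coeffs = np.zeros((len(exponents), 3))
coeffs[index[(1, 0, 0)], 0] = -sigma   # dx/dt = sigma * (y - x)
coeffs[index[(0, 1, 0)], 0] = sigma
coeffs[index[(1, 0, 0)], 1] = rho      # dy/dt = x * (rho - z) - y
coeffs[index[(1, 0, 1)], 1] = -1.0
coeffs[index[(0, 1, 0)], 1] = -1.0
coeffs[index[(1, 1, 0)], 2] = 1.0      # dz/dt = x * y - beta * z
coeffs[index[(0, 0, 1)], 2] = -beta

velocity = basis_dynamics([1.0, 2.0, 3.0], coeffs, exponents)
```

Only 7 of the 30 coefficients are nonzero here, which is the kind of sparsity that makes a learned coefficient matrix readable as symbolic equations.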

The temporal contrastive objective exploits both time evolution and cross-view consistency. The multi-view setting forces the encoder to discard view-specific noise and rely only on components shared across views. Rollouts of the learned dynamics are compared using a similarity function, typically the negative Euclidean distance, across different time horizons. The theoretical analysis shows that, in the limit of infinitely many views and infinite trajectory length, the system is identifiable up to a common affine transformation, extending previous results [laiz2025].
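The exact form of the DYSCO loss is not given in this summary, so the following is only a generic InfoNCE-style sketch of a cross-view temporal contrastive loss. It assumes a batch in which row i of `z_pred_a` (a latent from one view, advanced by the dynamics model) should match row i of `z_next_b` (the latent from another view at the later time), with all other rows acting as negatives. The function names and the batch layout are assumptions for illustration. Plain NumPy is used, so there is no gradient computation here.

```python
import numpy as np

def neg_euclidean_similarity(u, v):
    """Pairwise similarity s(u_i, v_j) = -||u_i - v_j||; shape (n, n)."""
    diff = u[:, None, :] - v[None, :, :]
    return -np.linalg.norm(diff, axis=-1)

def multiview_infonce(z_pred_a, z_next_b):
    """InfoNCE-style loss where matching rows are positives, other rows negatives."""
    logits = neg_euclidean_similarity(z_pred_a, z_next_b)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))

# Toy check: consistent cross-view pairs give a lower loss than mismatched pairs.
rng = np.random.default_rng(0)
z = 5.0 * rng.normal(size=(64, 3))
loss_aligned = multiview_infonce(z, z + 0.01 * rng.normal(size=z.shape))
loss_shuffled = multiview_infonce(z, z[rng.permutation(64)])
```

Because the positive pair comes from a different view, noise that is independent across views cannot help match the pair, which is the intuition behind the denoising effect described above.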

Theoretical Guarantees and Identifiability

The primary theoretical contribution is a proof that multi-view contrastive learning with noisy nonlinear observations identifies latent states and deterministic dynamics up to an affine gauge freedom. Specifically, the paper states and proves that there exists an invertible affine transformation such that the learned representation equals the true latent state under that transformation, and the learned dynamics equal the true dynamics $f$ expressed in the transformed coordinates. This affine indeterminacy is compatible with downstream symbolic regression when the functional basis (e.g., polynomials) is closed under affine transformations.
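The closure property can be made concrete with a short derivation. This is a sketch under two assumptions that are not taken from the paper: the dynamics are written in continuous time as $\dot{\bm{x}} = f(\bm{x})$, and the gauge is written as $\bm{z} = A\bm{x} + \bm{b}$ with $A$ invertible (the symbols $A$ and $\bm{b}$ are notation chosen here). Differentiating the change of coordinates gives

$$
\dot{\bm{z}} = A\,\dot{\bm{x}} = A\, f(\bm{x}) = A\, f\big(A^{-1}(\bm{z} - \bm{b})\big) =: \tilde f(\bm{z}).
$$

If every component of $f$ is a polynomial of total degree at most $d$, then substituting the affine expression $A^{-1}(\bm{z} - \bm{b})$ for $\bm{x}$ yields polynomials of degree at most $d$ in $\bm{z}$, and left-multiplication by $A$ only takes linear combinations of them. Hence $\tilde f$ lies in the span of the same monomial basis. Sparsity, however, is generally not preserved: for example, a single term $x_1 x_2$ with $\bm{x} = \bm{z} - \bm{b}$ expands to $z_1 z_2 - b_2 z_1 - b_1 z_2 + b_1 b_2$.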

Empirical Results

The framework is empirically validated on diverse nonlinear dynamical systems: Duffing oscillator, Lorenz attractor, FitzHugh-Nagumo, Winner-Take-All, Double-Well, Stuart-Landau oscillator, and Heteroclinic systems. Across configurations with Gaussian and Poisson noise, DYSCO achieves high $R^2$ values for both latent trajectories and flow fields, confirming accurate recovery of both. Notably, flow field identification remains robust under substantial observational noise and outperforms previous contrastive learning methods (e.g., DYNCL [laiz2025]), which degrade sharply in noisy settings.
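Because identification holds only up to an affine transformation, an $R^2$ score has to be computed after aligning the estimate to the ground truth. The summary does not describe the paper's exact evaluation protocol, so the sketch below shows one common convention as an assumption: fit an affine map from estimated to true latents by least squares and report the $R^2$ of that fit. The name `affine_aligned_r2` is hypothetical.

```python
import numpy as np

def affine_aligned_r2(z_est, x_true):
    """Fit x_true ~ z_est @ W + c by least squares and return the R^2 of the fit."""
    design = np.hstack([z_est, np.ones((z_est.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, x_true, rcond=None)
    residual = x_true - design @ weights
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((x_true - x_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
x_true = rng.normal(size=(200, 3))

# An affinely transformed copy of the truth should score (numerically) 1.
A = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)
b = rng.normal(size=3)
r2_affine = affine_aligned_r2(x_true @ A.T + b, x_true)

# Latents unrelated to the truth should score close to 0.
r2_unrelated = affine_aligned_r2(rng.normal(size=(200, 3)), x_true)
```

The same function can be applied to flow-field samples by passing estimated and true velocity vectors evaluated at matching points.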

Figure 2: Phase space portraits comparing ground-truth dynamical systems (top row) and DYSCO-recovered systems (bottom row) under Poisson observation noise; color indicates external forcing magnitude.

Ablation studies demonstrate monotonic improvement as the number of available views increases; even with limited views, trajectory recovery remains strong, especially in chaotic regimes due to dense sampling of phase space. The model is resilient to increasing noise intensity up to a threshold noise level (reported in dB), and longer time integration horizons further improve performance.
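To make the view and noise settings concrete, here is a hedged sketch of how independent noisy views could be simulated. It assumes that noise level is expressed as a signal-to-noise ratio, SNR (dB) = 10 log10(signal power / noise power), and that Poisson observations use an exponential link from the signal to firing rates. Both are assumptions of this example rather than details confirmed by the summary, and all function names are hypothetical.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + rng.normal(scale=np.sqrt(noise_power), size=signal.shape)

def poisson_counts(log_rates, rng):
    """Count observations drawn as Poisson(exp(log_rates)) (assumed link)."""
    return rng.poisson(np.exp(log_rates))

def make_views(signal, n_views, snr_db, seed=0):
    """Stack n_views noisy copies of the same signal with independent noise."""
    rng = np.random.default_rng(seed)
    return np.stack([add_gaussian_noise(signal, snr_db, rng) for _ in range(n_views)])

t = np.linspace(0.0, 20.0, 5000)
signal = np.stack([np.sin(t), np.cos(t)], axis=1)

views = make_views(signal, n_views=4, snr_db=5.0)
noise = views[0] - signal
empirical_snr_db = 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

counts = poisson_counts(signal, np.random.default_rng(1))
```

Sweeping `snr_db` and `n_views` in a setup like this is one way to reproduce the shape of such an ablation on synthetic data.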

Figure 3: Ablation showing (a) effect of number of views and noise condition on Lorenz trajectory and flow-field $R^2$, and (b) robustness to growing observation noise intensity.

Symbolic recovery is explored by identifying the sparsest representative within the affine orbit of coefficient representations, facilitated by the closure properties of polynomial bases under affine transformations. While exact symbolic term recovery is challenging under noise due to gauge mixing, the method provides a principled pathway for interpretable model discovery, with promising proof-of-concept results.

Practical and Theoretical Implications

The primary practical implication is the extension of system identification and symbolic recovery to settings with indirect, noisy, high-dimensional observations—crucial in neuroscience (trial-structured data, neural recordings), engineering, and physics. Theoretically, the work advances the understanding of how structured contrastive objectives perform inversion of generative processes and enforce denoising via cross-view consistency. The compatibility of affine gauge freedom with symbolic regression paves the way for interpretable modeling in self-supervised frameworks.

These results suggest that explicitly structured contrastive learning constitutes a promising route for solving Marr’s inverse problem in scientific modeling, transitioning from noisy observations to interpretable, algorithm-level descriptions. Additionally, ablation studies reveal trade-offs between computational cost and integration horizon, informing practical deployments.

Figure 4: Ablation on the time integration horizon for the Lorenz system; longer horizons yield improved trajectory and flow-field $R^2$, with diminishing returns and increased computational cost.

Limitations and Outlook

The principal limitation is that benchmarking is restricted to simulated data with known ground truth and preselected bases and dimensionality. Real-world applicability, particularly in neuroscience, awaits further demonstration. The method also assumes that the latent dimensionality is known and that the chosen symbolic basis is sufficiently expressive; incorrect specification of either can hinder accuracy. Future work might address robust gauge-aware sparse regression, improved symbolic recovery amidst noise, and estimation of latent structure from data.

More broadly, the approach motivates development of interpretable system identification protocols using multi-view temporal contrastive learning, with downstream symbolic regression for both scientific discovery and robust modeling in high-noise, high-dimensional environments. Extensions to adaptive basis selection, estimation of latent dimensionality, and principled noise handling will further strengthen real-world performance.

Conclusion

DYSCO introduces a principled contrastive learning framework for latent dynamical system identification from noisy, high-dimensional observations, featuring theoretical guarantees of identifiability up to affine gauge, empirical robustness across dynamical regimes and noise conditions, and compatibility with symbolic regression for interpretable model extraction. The multi-view denoising constraint and structured parameterization collectively enable practical and theoretically sound extraction of governing equations, offering a significant advance for scientific modeling in realistic observation settings (2606.13260).
