DA-SHRED: Shallow Recurrent Decoder Assimilation

Updated 3 December 2025
  • The paper presents a latent assimilation framework that compresses high-dimensional states into a low-dimensional space for real-time reconstruction.
  • It combines a shallow encoder-decoder with a recurrent model and Kalman-style updates to integrate sparse sensor data and simulation proxies.
  • Sparse regression (SINDy) identifies missing dynamical terms, achieving a significant reduction in RMSE and bridging the SIM2REAL gap.

Data Assimilation with a SHallow REcurrent Decoder (DA-SHRED) is a machine learning framework designed to integrate sparse sensor data with computational simulation models for high-dimensional, spatiotemporal physical systems. It operates by embedding the full system state into a low-dimensional latent space, enabling real-time reconstruction and discrepancy modeling between model predictions and experimental measurements. The methodology addresses the simulation-to-real (SIM2REAL) gap introduced by unmodeled physics and parameter misspecification, providing both assimilation and identification of missing dynamics through sparse regression in the latent space (Bao et al., 1 Dec 2025).

1. Problem Formulation and Mathematical Framework

DA-SHRED considers a high-dimensional system state $x_t \in \mathbb{R}^n$ evolving under unknown real physics. Available resources are sparse point-sensor measurements $y_t \in \mathbb{R}^p$ and a reduced simulation proxy $N$ that approximates the true system dynamics, $\dot{x} = N(x, t)$. Observations are modeled as $y_t = H x_t + \eta_t$, with $H \in \mathbb{R}^{p \times n}$ a known linear observation operator and $\eta_t$ measurement noise.
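
For point sensors on a discretized field, $H$ reduces to a row-selection matrix. A minimal numpy sketch (grid size, sensor count, and noise level are illustrative assumptions):

```python
import numpy as np

# Hypothetical sizes: n grid points in the flattened state, p point sensors.
n, p = 64 * 64, 3
rng = np.random.default_rng(0)

# H selects p grid points: each row is a one-hot indicator of a sensor location.
sensor_idx = rng.choice(n, size=p, replace=False)
H = np.zeros((p, n))
H[np.arange(p), sensor_idx] = 1.0

x_t = rng.standard_normal(n)           # full-state snapshot
eta_t = 0.01 * rng.standard_normal(p)  # measurement noise
y_t = H @ x_t + eta_t                  # y_t = H x_t + eta_t
```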

The dual objectives are:

  • Assimilate incoming measurements $y_t$ into a reduced latent representation $z_t \in \mathbb{R}^r$ ($r \ll n$) to reconstruct the full state $\hat{x}_t \approx x_t$ in real time.
  • Discover missing or unmodeled dynamics $L'(x, t)$ such that the true dynamics are $\dot{x} = N(x, t) + L'(x, t)$.

The framework employs:

  • A shallow encoder $E: \mathbb{R}^n \to \mathbb{R}^r$, $\;z_t = E(x_t; \theta_E)$
  • A recurrent latent model $f_{\text{rec}}: \mathbb{R}^r \to \mathbb{R}^r$, $\;z_{t+1}^f = f_{\text{rec}}(z_t^a; \theta_{\text{rec}})$
  • A shallow decoder $D: \mathbb{R}^r \to \mathbb{R}^n$, $\;\hat{x}_t = D(z_t; \theta_D)$

Superscripts $f$ and $a$ denote forecast and analysis, respectively.

2. SHRED Architecture and Implementation

SHRED employs an encoder-decoder sequence without a traditional autoencoder inverse. The encoder $E$ is either a single linear layer or a small MLP mapping full-state snapshots into a low-dimensional latent space. The decoder $D$ is shallow, typically a single linear layer (possibly with a nonlinearity), that reconstructs the full grid from latent codes.

Temporal dynamics in latent space are captured via $f_{\text{rec}}$, usually instantiated as an LSTM or small RNN:

$$z_{t+1}^f = f_{\text{rec}}(z_t^a;\, \theta_{\text{rec}})$$

For simulation-only training, reconstruction is enforced via:

  • $z_t = E(x_t; \theta_E)$
  • $z_{t+1}^f = f_{\text{rec}}(z_t; \theta_{\text{rec}})$
  • $\hat{x}_{t+1} = D(z_{t+1}^f; \theta_D)$

with mean-squared-error minimization over the simulated trajectory $\{x_t\}_t$.
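
A minimal PyTorch sketch of the full stack and this simulation-only objective (layer sizes, the LSTM choice, and the one-step-shift target are illustrative assumptions, not the authors' exact configuration):

```python
import torch
import torch.nn as nn

class SHRED(nn.Module):
    """Shallow encoder -> recurrent latent model -> shallow decoder."""
    def __init__(self, n: int, r: int):
        super().__init__()
        self.encoder = nn.Linear(n, r)              # shallow encoder E
        self.rnn = nn.LSTM(r, r, batch_first=True)  # latent recurrent model f_rec
        self.decoder = nn.Linear(r, n)              # shallow decoder D

    def forward(self, x_seq: torch.Tensor) -> torch.Tensor:
        # x_seq: (batch, T, n) snapshots; predict x_{t+1} from z_t.
        z = self.encoder(x_seq)      # z_t = E(x_t)
        z_next, _ = self.rnn(z)      # z_{t+1}^f = f_rec(z_t)
        return self.decoder(z_next)  # x_hat_{t+1} = D(z_{t+1}^f)

# Simulation-only training step: MSE against one-step-shifted snapshots.
n, r, T = 4096, 16, 50
model = SHRED(n, r)
x = torch.randn(1, T, n)  # simulated trajectory {x_t}
loss = nn.functional.mse_loss(model(x)[:, :-1], x[:, 1:])
loss.backward()
```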

3. Latent Data Assimilation Procedure

At each time step, the procedure executes:

  • Forecast: $z_t^f = f_{\text{rec}}(z_{t-1}^a; \theta_{\text{rec}})$
  • Innovation: $y_t - H\, D(z_t^f; \theta_D)$
  • Analysis update: $z_t^a = z_t^f + K_t\,[\,y_t - H\, D(z_t^f)\,]$, with $K_t \in \mathbb{R}^{r \times p}$ the gain matrix mapping innovations to latent corrections.

Post-update, the full state is decoded as $\hat{x}_t = D(z_t^a; \theta_D)$, supporting comparisons in either sensor space or the full domain.
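
A minimal numpy sketch of one step, assuming a linear decoder $\hat{x} = D_{\text{mat}}\, z$ so that a textbook Kalman gain with fixed covariances $P$ and $R$ can stand in for the learned gain $K_t$ (see Section 5):

```python
import numpy as np

def assimilate_step(z_prev_a, f_rec, D_mat, H, y_t, P, R):
    """One latent update: forecast -> innovation -> analysis -> decode.

    Assumes a linear decoder x = D_mat @ z, so H_z = H @ D_mat is the
    effective latent observation operator (an illustrative simplification).
    """
    z_f = f_rec(z_prev_a)             # forecast z_t^f
    H_z = H @ D_mat                   # (p, r) latent observation map
    innov = y_t - H_z @ z_f           # innovation y_t - H D(z_t^f)
    S = H_z @ P @ H_z.T + R           # innovation covariance
    K = P @ H_z.T @ np.linalg.inv(S)  # Kalman-style gain, (r, p)
    z_a = z_f + K @ innov             # analysis update z_t^a
    return z_a, D_mat @ z_a           # analysis latent and decoded full state

# Toy usage with hypothetical sizes and a linear stand-in for f_rec.
r, p, n = 16, 3, 4096
rng = np.random.default_rng(1)
D_mat = rng.standard_normal((n, r)) / np.sqrt(r)
H = np.zeros((p, n))
H[np.arange(p), rng.choice(n, size=p, replace=False)] = 1.0
z_a, x_hat = assimilate_step(rng.standard_normal(r), lambda z: 0.95 * z,
                             D_mat, H, rng.standard_normal(p),
                             P=0.1 * np.eye(r), R=0.01 * np.eye(p))
```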

4. Discrepancy Modeling via Sparse Identification

DA-SHRED includes a sparse-regression stage to model missing physics in latent space using SINDy (Sparse Identification of Nonlinear Dynamics). For an assimilated latent trajectory $\{z_t^a\}$, finite-difference approximations yield the latent derivatives $\dot{z}_t$.

Missing latent dynamics are hypothesized to be sparse in a dictionary $\Theta(Z)$ of candidate nonlinear functions. SINDy regression solves:

$$\min_{\Xi}\; \|\dot{Z} - \Theta(Z)\,\Xi\|_2^2 + \lambda \|\Xi\|_1$$

where $\Theta(Z) \in \mathbb{R}^{m \times q}$ evaluates $q$ candidate functions over $m$ snapshots, $\Xi \in \mathbb{R}^{q \times r}$ holds the coefficients, and nonzero entries of $\Xi$ identify active nonlinearities. Physical corrections $L'(x, t)$ are projected back to physical space via the decoder basis.
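
A minimal sketch of this stage with a polynomial dictionary and sequentially thresholded least squares, a standard SINDy solver that promotes the same sparsity as the $\ell_1$ penalty above (the dictionary, threshold, and toy trajectory are illustrative assumptions):

```python
import numpy as np

def sindy_stlsq(Z, Z_dot, threshold=0.1, n_iter=10):
    """Sparse regression Z_dot ~ Theta(Z) @ Xi via sequentially thresholded LSQ."""
    m, r = Z.shape
    # Dictionary Theta(Z): constant, linear, and quadratic candidate terms.
    quad = np.stack([Z[:, i] * Z[:, j]
                     for i in range(r) for j in range(i, r)], axis=1)
    Theta = np.hstack([np.ones((m, 1)), Z, quad])      # (m, q)
    Xi = np.linalg.lstsq(Theta, Z_dot, rcond=None)[0]  # dense (q, r) start
    for _ in range(n_iter):
        Xi[np.abs(Xi) < threshold] = 0.0               # enforce sparsity
        for k in range(r):                             # refit surviving terms
            active = np.abs(Xi[:, k]) >= threshold
            if active.any():
                Xi[active, k] = np.linalg.lstsq(
                    Theta[:, active], Z_dot[:, k], rcond=None)[0]
    return Xi

# Finite-difference latent derivatives from an assimilated trajectory Z: (m, r).
m, r, dt = 200, 4, 0.01
Z = np.cumsum(np.random.default_rng(2).standard_normal((m, r)), axis=0) * dt
Z_dot = np.gradient(Z, dt, axis=0)
Xi = sindy_stlsq(Z, Z_dot)  # nonzero rows of Xi mark active dictionary terms
```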

5. Training Objectives and Joint Optimization

The overall learning problem jointly tunes:

  • Encoder-decoder parameters $(\theta_E, \theta_D)$
  • Latent recurrent model parameters $\theta_{\text{rec}}$
  • Assimilation gains $\{K_t\}$
  • SINDy coefficients $\Xi$

The main loss components are:

  1. Simulation-only reconstruction:

$$L_{\text{rec}} = \sum_t \left\| D\big(f_{\text{rec}}(E(x_t; \theta_E); \theta_{\text{rec}}); \theta_D\big) - x_{t+1} \right\|_2^2$$

  2. Data-assimilation loss:

$$L_{\text{DA}} = \sum_t \left\| H\, D(z_t^a; \theta_D) - y_t \right\|_2^2$$

  3. Discrepancy (SINDy) loss:

$$L_{\text{SINDy}} = \left\| \dot{Z} - \Theta(Z)\,\Xi \right\|_2^2 + \lambda \|\Xi\|_1$$

Combined optimization:

$$\min_{\theta_E,\, \theta_D,\, \theta_{\text{rec}},\, \{K_t\},\, \Xi} \; L_{\text{rec}} + \alpha\, L_{\text{DA}} + \beta\, L_{\text{SINDy}}$$

with $\alpha$ and $\beta$ as weighting hyperparameters.
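
Schematically, the joint objective assembles as below (a sketch with assumed weightings; in practice $L_{\text{DA}}$ and $L_{\text{SINDy}}$ are built from the assimilation loop and dictionary regression of Sections 3 and 4):

```python
import torch

def total_loss(L_rec: torch.Tensor, L_da: torch.Tensor,
               sindy_residual: torch.Tensor, xi: torch.Tensor,
               alpha: float = 1.0, beta: float = 0.1, lam: float = 1e-3):
    """Weighted DA-SHRED objective; alpha, beta, lam are illustrative values."""
    L_sindy = sindy_residual.pow(2).sum() + lam * xi.abs().sum()
    return L_rec + alpha * L_da + beta * L_sindy
```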

6. Representative Test Cases and Quantitative Evaluation

Empirical evaluations cover:

  • 2D damped Kuramoto–Sivashinsky (KS) system on $[0, 64]^2$
  • 2D Kolmogorov flow (Navier–Stokes with sinusoidal forcing)
  • 2D Gray–Scott reaction–diffusion system
  • 1D rotating detonation engine (RDE) model

Metrics include the full-field RMSE, $\text{RMSE}_{\text{full}}(t) = \|\hat{x}_t - x_t\|_2 / \sqrt{n}$, and the sensor RMSE, $\text{RMSE}_{\text{sens}}(t) = \|H \hat{x}_t - y_t\|_2 / \sqrt{p}$.
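
Both metrics follow directly from these definitions; a short numpy sketch:

```python
import numpy as np

def rmse_full(x_hat, x):
    """Full-field RMSE: ||x_hat - x||_2 / sqrt(n)."""
    return np.linalg.norm(x_hat - x) / np.sqrt(x.size)

def rmse_sens(x_hat, y, H):
    """Sensor RMSE: ||H x_hat - y||_2 / sqrt(p)."""
    return np.linalg.norm(H @ x_hat - y) / np.sqrt(y.size)
```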

Key outcomes:

  • DA-SHRED achieves a substantial reduction in full-field RMSE within on the order of 10–20 time units, compared to the simulation-only proxy.
  • Robust correction with few sensors: $p = 3$ in simulated settings and $q = 5$–$20$ in real settings.
  • The SINDy module precisely recovers missing dynamical terms, e.g., $(v \cdot \nabla)u - \gamma u$ in the KS system, $\alpha \omega$ in Kolmogorov flow, $U^2 V$ in Gray–Scott, and $u^3$ in the RDE.

7. Synthesis, Practical Implications, and Extensions

DA-SHRED unites three major components:

  1. Efficient compression of high-dimensional PDE states via a shallow encoder–recurrent–decoder structure yielding a compact latent representation amenable to rapid computation.
  2. Latent assimilation loop implementing Kalman-style updates for incorporating sparse, noisy sensor data in real time.
  3. Physics-informed discrepancy inference through sparse regression (SINDy) in latent coordinates, facilitating explicit identification of missing or uncaptured processes.

This synergy supports robust closure of the SIM2REAL gap, yielding substantial empirical RMSE reduction compared with pure simulation, and enables interpretable extraction of dynamical corrections (Bao et al., 1 Dec 2025). The approach generalizes to a variety of physical systems and sensor modalities, providing a scalable, computationally efficient framework for digital-twin deployment, model correction, and high-fidelity state reconstruction.

References (1)

  1. Bao et al. (1 Dec 2025). Data Assimilation with a SHallow REcurrent Decoder (DA-SHRED).
