Learning Interpretable PDE Representations for Generative Reconstructions with Structured Sparsity

Published 26 Apr 2026 in cs.LG | (2604.23867v1)

Abstract: Scientific measurements are often bottlenecked by suboptimal conditions, whether that be noise, incomplete spatial coverage, or limited resolution, rendering accurate field reconstruction a difficult task. We introduce LatentPDE, a latent diffusion framework designed to simultaneously resolve sparse-observation reconstruction and super-resolution. While existing physics-guided diffusion models typically rely on soft loss penalties or uninterpretable representations, our approach enforces physical compliance by constructing an inherently interpretable latent space. Specifically, we parameterize the latent variables directly as the coefficients and source terms of an assumed governing PDE. In doing so, LatentPDE is able to reliably reconstruct dynamics across highly disparate and structured data gaps. Empirical results on diverse configurations demonstrate that our model achieves high-fidelity recovery at any desired resolution while also tracking the underlying predictive uncertainty.

Authors (3)

Summary

The paper introduces a novel framework that embeds PDE coefficients in the latent space of a diffusion model to enforce strict physical fidelity.
It employs a differentiable spectral decoder for super-resolution and reduces RMSE by up to 73% relative to PINNs and by over 87% relative to EnKF in certain regimes.
The method provides calibrated uncertainty quantification and robust inference across varied missingness regimes, ensuring physically plausible reconstructions.

LatentPDE: Interpretable PDE-Latent Diffusion for Generative Reconstruction under Structured Sparsity

Problem Context and Motivation

The reconstruction of spatiotemporal fields governed by partial differential equations (PDEs) from sparse, noisy, or low-resolution measurements is central to many scientific domains, such as climate modeling and structural health monitoring. Existing approaches, including traditional data assimilation (DA), Bayesian inverse solvers, and recent operator learning models, face critical limitations: they scale poorly in high-dimensional settings, lack principled uncertainty quantification, or treat governing physics as soft priors rather than hard constraints, frequently producing non-physical or overly smooth reconstructions. These shortcomings are particularly pronounced in ill-posed inverse problems, where observational coverage is both sparse and highly structured.

LatentPDE introduces a principled framework that enforces physical constraints by parameterizing the latent space of a generative diffusion model directly as the coefficients (and source terms) of the governing PDE, yielding strictly interpretable and physically plausible reconstructions with calibrated uncertainty. Crucially, the architecture is capable of super-resolution and robust inference across a range of structured and out-of-distribution (OOD) missingness regimes.

Methodology

LatentPDE departs from previous approaches by directly embedding PDE coefficients and source terms in the generative model's latent space, coupling this with a differentiable spectral decoder that strictly enforces the governing equations as hard constraints. This parameterization yields a latent generative manifold composed of interpretable physical variables, unlike prior operator-learning approaches in which latent dimensions lack physical meaning.

Latent Variable Parameterization: Each candidate reconstruction is represented by a latent vector containing PDE coefficients (e.g., advection velocities, diffusivity, mass, wave speed) and a truncated Fourier representation of a spatially dependent source term, enabling band-limited interpolation across arbitrary output resolutions. This design allows the decoder, an FFT-based spectral solver, to reconstruct the entire physical field directly from the latent variables, so that every decoded field satisfies the assumed PDE at any target resolution.
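To make the latent-to-field mapping concrete, the following is a minimal sketch of a spectral decoder for one of the benchmarked families, advection-diffusion, written as u_t + c·∇u = ν∇²u + f(x) on the periodic square [0, 2π)². It is an illustration under stated assumptions, not the paper's implementation: the function name `decode_field`, the choice to store the initial condition as truncated Fourier coefficients alongside the source, and the time-independent source are all assumptions made here. Because the equation is linear with constant coefficients, each Fourier mode evolves independently and has a closed-form solution.

```python
import numpy as np

def decode_field(c, nu, u0_hat, f_hat, t, n):
    """Evaluate u(x, y, t) on an n x n periodic grid over [0, 2*pi)^2.

    u0_hat and f_hat hold Fourier coefficients for modes -K..K in each
    direction (shape (2K+1, 2K+1)), so the same latent decodes at any n.
    This layout is an assumption of the sketch, not the paper's format.
    """
    K = (u0_hat.shape[0] - 1) // 2
    k = np.arange(-K, K + 1)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    # Per-mode ODE: d(u_hat)/dt = lam * u_hat + f_hat
    lam = -1j * (c[0] * kx + c[1] * ky) - nu * (kx**2 + ky**2)
    growth = np.exp(lam * t)
    # (exp(lam * t) - 1) / lam, using the limit t where lam == 0
    safe = np.where(lam == 0, 1.0, lam)
    phi = np.where(lam == 0, t, (growth - 1.0) / safe)
    u_hat = growth * u0_hat + phi * f_hat
    # Direct band-limited evaluation of the Fourier series on the target grid
    x = 2 * np.pi * np.arange(n) / n
    basis = np.exp(1j * np.outer(x, k))  # shape (n, 2K+1)
    return np.real(basis @ u_hat @ basis.T)

# Usage: u0 = cos(x), no source, decoded at two resolutions
K = 2
u0_hat = np.zeros((2 * K + 1, 2 * K + 1), dtype=complex)
u0_hat[K + 1, K] = u0_hat[K - 1, K] = 0.5  # modes kx = +1 and kx = -1
f_hat = np.zeros_like(u0_hat)
coarse = decode_field((1.0, 0.0), 0.1, u0_hat, f_hat, t=0.5, n=16)
fine = decode_field((1.0, 0.0), 0.1, u0_hat, f_hat, t=0.5, n=64)
```

In this sketch the fine grid agrees with the coarse grid at shared points, which is the sense in which a band-limited latent is resolution-agnostic: the output resolution is only a choice of where the Fourier series is evaluated.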

Inference Pipeline: The model comprises:

An encoder that maps sparse, noisy, low-resolution observations and initial conditions to an initial latent estimate, either via maximum a posteriori (MAP) optimization (LatentPDE-MAP) or an amortized deep neural network (LatentPDE-ENC).
A denoising diffusion process operating in the normalized latent space, trained to learn a prior over physically consistent latents and to refine the encoder output, explicitly providing posterior uncertainty quantification.
A spectral decoder that, given the latent, reconstructs the high-resolution physical field in a strictly PDE-compliant manner via Fourier transforms.

Observation Conditioning: The input conditioning includes not just the masked observation field and binary sensor mask, but also spatial maps encoding distance-to-observation and observation density, informing the encoder about the spatial structure and severity of missingness.
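The conditioning channels described above can be sketched as follows. This is a hedged illustration: the function name `conditioning_channels`, the use of periodic (wrap-around) distances, and the definition of density as the fraction of grid points within a fixed radius that carry a sensor are assumptions made here, since the summary does not specify how the maps are computed.

```python
import numpy as np

def conditioning_channels(field, mask, radius=2):
    """Stack masked field, sensor mask, distance-to-observation and density.

    Assumes at least one sensor, a periodic grid, and a grid larger than
    the density window. Memory scales with (grid size x number of sensors).
    """
    n, m = mask.shape
    obs = np.argwhere(mask)  # (num_obs, 2) sensor indices
    ii, jj = np.meshgrid(np.arange(n), np.arange(m), indexing="ij")
    # Per-axis index distance with periodic wrap-around
    di = np.abs(ii[..., None] - obs[:, 0])
    dj = np.abs(jj[..., None] - obs[:, 1])
    di = np.minimum(di, n - di)
    dj = np.minimum(dj, m - dj)
    dist_all = np.sqrt(di**2 + dj**2)  # (n, m, num_obs)
    dist = dist_all.min(axis=-1)       # distance to the nearest sensor
    # Number of grid points inside a disk of the given radius
    r = int(np.floor(radius))
    off = np.arange(-r, r + 1)
    disk_size = (off[:, None] ** 2 + off[None, :] ** 2 <= radius**2).sum()
    density = (dist_all <= radius).sum(axis=-1) / disk_size
    return np.stack([field * mask, mask.astype(float), dist, density])

# Usage: a single sensor in the corner of an 8 x 8 grid
mask = np.zeros((8, 8), dtype=bool)
mask[0, 0] = True
field = np.ones((8, 8))
channels = conditioning_channels(field, mask)
```

The distance channel grows away from the sensor and the density channel vanishes outside its neighbourhood, which gives an encoder an explicit signal about where the reconstruction is poorly constrained.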

Physics-Guided Posterior Sampling: During sampling, "physics guidance" (gradient steps on the masked field residual) is injected at each reverse diffusion step to further align the latent variables with observed data, improving data-consistency while maintaining physical plausibility.
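A single guidance step of this kind can be sketched as a gradient step on the masked data misfit, taken with respect to the latent. The names `masked_residual_loss` and `guidance_step`, the squared-error form of the misfit, and the fixed step size are assumptions made for illustration. The gradient is approximated here by central finite differences to keep the example dependency-free; a differentiable decoder would normally be differentiated with automatic differentiation instead.

```python
import numpy as np

def masked_residual_loss(z, decode, y_obs, mask):
    """Half squared misfit between the decoded field and the observations."""
    residual = mask * (decode(z) - y_obs)
    return 0.5 * np.sum(residual**2)

def guidance_step(z, decode, y_obs, mask, step_size=0.1, eps=1e-5):
    """One gradient step on the masked misfit with respect to the latent z."""
    grad = np.zeros_like(z)
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz.flat[i] = eps
        up = masked_residual_loss(z + dz, decode, y_obs, mask)
        down = masked_residual_loss(z - dz, decode, y_obs, mask)
        grad.flat[i] = (up - down) / (2 * eps)
    return z - step_size * grad

# Usage with a toy two-parameter decoder (amplitude and offset of a sine)
x = np.linspace(0.0, 2 * np.pi, 32, endpoint=False)
decode = lambda z: z[0] * np.sin(x) + z[1]
z_true = np.array([1.5, -0.3])
y_obs = decode(z_true)
mask = np.zeros_like(x)
mask[::4] = 1.0  # only every fourth point is observed

z = np.zeros(2)
loss_before = masked_residual_loss(z, decode, y_obs, mask)
for _ in range(50):
    z = guidance_step(z, decode, y_obs, mask)
loss_after = masked_residual_loss(z, decode, y_obs, mask)
```

In a sampler, a step like this would be interleaved with each reverse diffusion update, so the learned prior and the observations both shape the final latent. The loop above isolates only the data-consistency part.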

Experimental Evaluation

LatentPDE is benchmarked against a diverse suite of baselines including 3D-Var, EnKF, PINNs, and FunDPS, across three canonical PDE families: advection-diffusion (parabolic), Klein-Gordon (hyperbolic), and Helmholtz (elliptic), each with multiple parameter regimes and structured missingness (random, clustered, grid, radial, single-patch masks).

Key Numerical Results:

Reconstruction Accuracy: Across all regimes and noise/sparsity levels, LatentPDE achieves the lowest pointwise RMSE, outperforming PINNs by up to 73% and EnKF by over 87% in certain regimes. Under sparse (5%) and noisy ($\sigma=0.15$) observation scenarios, LatentPDE robustly reconstructs subgrid dynamics without spectral over-smoothing or spurious artifacts, unlike baselines.
Spectral Fidelity: LatentPDE consistently yields the lowest power spectral density (PSD) log error. Its reconstructions closely match the ground-truth spectral energy distributions, particularly retaining sharp high-frequency cutoffs corresponding to fine-scale dynamics, where baselines (especially PINNs and DA methods) exhibit spectral bias or loss of detail.
Uncertainty Quantification: The latent diffusion process provides explicit posterior ensembles, with standard deviation maps that localize predictive uncertainty to regions of sparse/ambiguous input, correlating with higher absolute error. This property is absent in deterministic or ensemble-based assimilation methods, which lack calibrated uncertainty estimates in non-Gaussian, ill-posed settings.
Super-Resolution and Robustness: The spectral forcing representation and spectral decoder ensure that reconstructions can be upsampled to any desired resolution, with consistent physical meaning and no need for high-resolution initial conditions—unlike competing methods.
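The accuracy, spectral, and uncertainty quantities in the list above can be computed along the following lines. This is a sketch under assumptions: the summary does not give the exact definition of the PSD log error, so the version here (mean absolute difference of log10 radially binned power) is one plausible choice rather than the paper's metric, and all function names are hypothetical.

```python
import numpy as np

def rmse(pred, truth):
    """Pointwise root-mean-square error."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def relative_improvement(rmse_model, rmse_baseline):
    """Fractional RMSE reduction relative to a baseline (0.73 means 73%)."""
    return 1.0 - rmse_model / rmse_baseline

def radial_psd(field):
    """Radially binned power spectrum of a square 2D periodic field."""
    n = field.shape[0]
    power = np.abs(np.fft.fft2(field)) ** 2 / field.size
    k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    shell = np.rint(np.hypot(kx, ky)).astype(int)
    sums = np.bincount(shell.ravel(), weights=power.ravel())
    counts = np.bincount(shell.ravel())
    return (sums / counts)[: n // 2 + 1]

def psd_log_error(pred, truth, floor=1e-12):
    """Assumed metric: mean absolute gap between log10 radial spectra."""
    p, q = radial_psd(pred), radial_psd(truth)
    return float(np.mean(np.abs(np.log10(p + floor) - np.log10(q + floor))))

def ensemble_summary(samples):
    """Mean and per-pixel standard deviation over posterior samples."""
    return samples.mean(axis=0), samples.std(axis=0)

# Usage on a synthetic field with a single wavenumber-3 mode
n = 16
grid = 2 * np.pi * np.arange(n) / n
truth = np.cos(3 * grid)[:, None] * np.ones((1, n))
rng = np.random.default_rng(0)
samples = truth + 0.05 * rng.standard_normal((8, n, n))
mean_field, std_map = ensemble_summary(samples)
```

Comparing `std_map` against `np.abs(mean_field - truth)` is one simple way to check whether the predicted uncertainty concentrates where the error is large, which is the behaviour the summary attributes to the posterior ensembles.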

The architecture is competitively efficient; amortized inference (LatentPDE-ENC) is substantially faster than diffusion-based function-space methods such as FunDPS, with more than an order of magnitude reduction in model parameters and a 64% reduction in inference time at high resolution.

Discussion and Implications

LatentPDE's core innovation lies in hard-enforcing physical constraints via interpretable latent parameterization, moving beyond the "soft regularizers" of most prior physics-informed or operator-learning architectures. This approach establishes a bounded and physically plausible generative forward model, enabling reliable field inference under severe sparsity, noise, and structural missingness. The explicit separation of physics compliance (decoder) and latent inference (encoder + diffusion) clarifies the contributions of each architectural component.

By encoding governing laws directly into the latent space and leveraging a diffusion model for uncertainty, LatentPDE unifies generative posterior learning with strict physical interpretability, since its latent variables are the PDE coefficients and source terms themselves. This grants both pointwise and global insight into reconstruction reliability, which is particularly valuable for decision-critical applications where physical plausibility and uncertainty calibration are mandatory under data scarcity.

The comparison of MAP versus amortized encoder initializations highlights a valuable tradeoff: MAP aligns more closely with the data, while the encoder can inject training-distribution inductive bias—potentially improving generalization when the inverse problem is poorly constrained. Their relative performance is regime- and coverage-dependent.

Limitations and Future Directions:

Currently, the method is limited to linear, constant-coefficient, periodic PDEs; nonlinearity, non-periodic boundaries, and irregular domains remain to be addressed.
The spectral solver is intrinsically suited for smooth solutions; sharp discontinuities or shocks are not well captured.
Equifinality—multiple solutions yielding similar observations—remains a challenge; future directions include conditioning on temporal sequences and expanding decoder flexibility.

Conclusion

LatentPDE provides a demonstrably effective, physically rigorous, and interpretable generative solution to PDE-governed inverse problems under extreme data sparsity, outperforming state-of-the-art DA and deep learning baselines across multiple governing equations and mask distributions. Its integration of structured latent space, hard physics enforcement, latent diffusion for uncertainty quantification, and resolution-agnostic reconstruction marks a theoretically principled advance for physical field inference. Extension to nonlinear, non-periodic systems and further generalization of the decoder will be critical to broadening practical adoption in scientific and engineering domains.