Latent Diffusion-Based Differentiable Inversion
- The paper introduces LD-DIM, which couples pretrained latent diffusion priors with fully differentiable PDE solvers to robustly reconstruct high-dimensional parameter fields.
- It leverages an encoder-decoder architecture and adjoint-based gradient computation to optimize in a low-dimensional latent space, reducing ill-conditioning.
- Empirical evaluations demonstrate that LD-DIM outperforms PINNs and VAE-based methods, achieving up to 30× error reduction and superior SSIM scores.
The latent diffusion-based differentiable inversion method (LD-DIM) is a framework for solving high-dimensional inverse problems, particularly those governed by partial differential equations (PDEs), by coupling pretrained latent diffusion priors with fully differentiable numerical solvers. LD-DIM achieves stable and accurate reconstruction of spatially heterogeneous parameter fields from sparse observations through gradient-based optimization performed directly in the low-dimensional latent space of a diffusion model. Incorporating adjoint-based gradient computation, end-to-end automatic differentiation, and physics-based solvers, LD-DIM achieves significant improvements in conditioning, accuracy, and robustness, outperforming prevalent alternatives such as physics-informed neural networks (PINNs) and physics-embedded variational autoencoders (VAEs) for diverse inverse modeling tasks (Lin et al., 27 Dec 2025, Rout et al., 2023, Wang et al., 2024, Bai et al., 2024).
1. Latent Diffusion Priors and Model Architecture
LD-DIM relies on a latent diffusion model (LDM) that models the distribution of high-dimensional parameter fields in a low-dimensional latent space. The primary components are:
- Encoder $E_\phi$: A convolutional variational autoencoder (VAE) encoder maps the high-dimensional parameter field $m$ (e.g., a conductivity field $\kappa(x)$) to a latent Gaussian posterior $q_\phi(z \mid m) = \mathcal{N}\big(\mu_\phi(m), \operatorname{diag}\sigma^2_\phi(m)\big)$, with $\dim(z) \ll \dim(m)$.
- Decoder $D_\theta$: A convolutional decoder reconstructs the field, $\hat m = D_\theta(z)$, ensuring that all reconstructed fields lie on the learned manifold $\mathcal{M} = \{D_\theta(z) : z \in \mathbb{R}^{d_z}\}$.
- Score network $\epsilon_\psi$: A U-Net in latent space, trained by denoising score matching to approximate the added noise $\epsilon$ (equivalently, the score $\nabla_{z_t}\log p_t(z_t)$) at each timestep $t$.
The training objective comprises a VAE loss for reconstruction and posterior regularization,
$$\mathcal{L}_{\mathrm{VAE}} = \mathbb{E}_{q_\phi(z\mid m)}\big[\lVert m - D_\theta(z)\rVert_2^2\big] + \beta\,\mathrm{KL}\big(q_\phi(z\mid m)\,\Vert\,\mathcal{N}(0, I)\big),$$
and a diffusion loss,
$$\mathcal{L}_{\mathrm{LDM}} = \mathbb{E}_{z_0,\,\epsilon\sim\mathcal{N}(0,I),\,t}\big[\lVert \epsilon - \epsilon_\psi(z_t, t)\rVert_2^2\big], \qquad z_t = \sqrt{\bar\alpha_t}\,z_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,$$
combined to define a variational bound on the data log-likelihood, schematically $-\log p(m) \lesssim \mathcal{L}_{\mathrm{VAE}} + \mathcal{L}_{\mathrm{LDM}}$.
This latent prior captures complex variability and, crucially, preserves interfaces and sharp features, addressing the sharpness deficit of pixel-based generative approaches (Lin et al., 27 Dec 2025, Rout et al., 2023).
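A minimal JAX sketch of these two training losses follows, written under assumed interfaces rather than the paper's implementation: `encode`, `decode`, and `eps_net` are hypothetical stand-ins for the VAE encoder, decoder, and latent U-Net, and `alphas_bar` is a standard DDPM-style cumulative noise schedule.

```python
import jax
import jax.numpy as jnp

def vae_loss(params, m, key, encode, decode, beta=1e-3):
    """L_VAE: reconstruction error plus KL regularization of q_phi(z | m)."""
    mu, logvar = encode(params, m)                       # Gaussian posterior parameters
    z = mu + jnp.exp(0.5 * logvar) * jax.random.normal(key, mu.shape)
    m_hat = decode(params, z)                            # hat m = D_theta(z)
    recon = jnp.mean((m - m_hat) ** 2)
    kl = -0.5 * jnp.mean(1.0 + logvar - mu ** 2 - jnp.exp(logvar))
    return recon + beta * kl

def ldm_loss(psi, z0, t, key, eps_net, alphas_bar):
    """L_LDM: denoising score matching on latent codes z0 = E_phi(m)."""
    eps = jax.random.normal(key, z0.shape)
    a_bar = alphas_bar[t]                                # cumulative schedule at step t
    z_t = jnp.sqrt(a_bar) * z0 + jnp.sqrt(1.0 - a_bar) * eps
    return jnp.mean((eps - eps_net(psi, z_t, t)) ** 2)
```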
2. Differentiable PDE Solvers and Adjoint Techniques
Inverse modeling in LD-DIM is performed by coupling the LDM with a fully differentiable forward solver for the governing PDE. For example, in subsurface flow scenarios:
- Governing equation: $-\nabla\cdot\big(\kappa(x)\,\nabla u(x)\big) = f(x)$ (Darcy flow).
- Numerical discretization: A finite-volume (FVM) scheme with two-point flux approximation yields a sparse linear system $A(\kappa)\,u = b$, where $A(\kappa)$ is assembled and solved using efficient sparse linear solvers (e.g., in JAX).
- Adjoint state computation: Sensitivities are backpropagated via the discrete adjoint system
$$A(\kappa)^{\top}\lambda = \Big(\frac{\partial J}{\partial u}\Big)^{\top}, \qquad \frac{\mathrm{d}J}{\mathrm{d}\kappa} = -\lambda^{\top}\,\frac{\partial A(\kappa)}{\partial \kappa}\,u,$$
allowing efficient construction of gradients for objectives involving both PDE misfit and diffusion priors (Lin et al., 27 Dec 2025, Wang et al., 2024).
Automatic differentiation frameworks (JAX) are employed for propagating gradients through the decoder and the PDE solver, with custom vector-Jacobian product (VJP) implementations to avoid storage of large dense Jacobians and reuse the linear system in the reverse pass.
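A minimal sketch of this pattern is shown below. It is not the paper's code: `assemble_A` is a hypothetical placeholder for the FVM/TPFA assembly, and a small dense solve stands in for the sparse one. The solver is wrapped in `jax.custom_vjp` so that the reverse pass solves the transposed (adjoint) system and reuses the assembled matrix instead of materializing dense Jacobians.

```python
import jax
import jax.numpy as jnp

def assemble_A(kappa):
    """Hypothetical stand-in for the FVM/TPFA assembly of A(kappa).

    A real implementation builds the sparse two-point-flux stencil; a simple
    dense SPD matrix keeps this sketch self-contained and runnable.
    """
    return jnp.diag(kappa) + 0.1 * jnp.eye(kappa.shape[0])

@jax.custom_vjp
def darcy_solve(kappa, b):
    """Forward PDE solve u = A(kappa)^{-1} b (dense stand-in for the sparse solve)."""
    return jnp.linalg.solve(assemble_A(kappa), b)

def darcy_solve_fwd(kappa, b):
    A = assemble_A(kappa)
    u = jnp.linalg.solve(A, b)
    return u, (kappa, A, u)                    # keep A and u for reuse in the reverse pass

def darcy_solve_bwd(res, u_bar):
    kappa, A, u = res
    lam = jnp.linalg.solve(A.T, u_bar)         # adjoint system: A^T lam = dJ/du
    # dJ/dkappa = -lam^T (dA/dkappa) u, pulled back through the assembly by a VJP
    _, vjp_assemble = jax.vjp(assemble_A, kappa)
    (kappa_bar,) = vjp_assemble(-jnp.outer(lam, u))
    return kappa_bar, lam                      # cotangents for (kappa, b)

darcy_solve.defvjp(darcy_solve_fwd, darcy_solve_bwd)
```

The backward rule performs exactly one transposed solve per incoming cotangent, mirroring the discrete adjoint relation above.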
3. Inversion and Optimization in the Latent Space
The entire inversion procedure operates in latent space, drastically reducing optimization dimensionality and conditioning pathologies common in pixel or coefficient space:
- Objective function: For observed data $d_i$ at a sparse index set $\mathcal{O}$,
$$J(z) = \sum_{i\in\mathcal{O}} \big(u_i(z) - d_i\big)^2 + \gamma\,\mathcal{R}(z),$$
with $u(z)$ denoting the predicted PDE solution decoded from latent $z$ (i.e., the solution obtained with parameter field $D_\theta(z)$) and $\mathcal{R}(z)$ a latent-space prior/regularization term.
- Gradient computation: Gradients with respect to the latent $z$ are computed efficiently using the chain rule through the decoder and solver; adjoint methods further improve computational efficiency.
- Optimization: Latent codes are iteratively updated via Adam or SGD; typical convergence is obtained within a few hundred steps, starting from a random latent initialization (Lin et al., 27 Dec 2025).
By restricting search to the non-linear manifold of plausible fields learned by the LDM, the method implicitly regularizes the problem, suppresses ill-conditioned modes, and enables the accurate resolution of dominant and sharp structures.
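Under the same assumptions as the solver sketch above (a `decode` callable standing in for $D_\theta$, the `darcy_solve` wrapper, and `optax` supplying Adam), a minimal latent-space inversion loop might look like this, with a simple $\lVert z\rVert^2$ term used as a stand-in for $\mathcal{R}(z)$:

```python
import jax
import jax.numpy as jnp
import optax

def invert(decode, b, d_obs, obs_idx, latent_dim,
           steps=300, lr=1e-2, gamma=1e-3, seed=0):
    """Gradient-based inversion in the latent space of the pretrained VAE/LDM."""

    def objective(z):
        kappa = decode(z)                        # hat m = D_theta(z), field on the grid
        u = darcy_solve(kappa, b)                # differentiable forward PDE solve
        misfit = jnp.sum((u[obs_idx] - d_obs) ** 2)
        return misfit + gamma * jnp.sum(z ** 2)  # ||z||^2 as a simple latent regularizer

    z = 0.1 * jax.random.normal(jax.random.PRNGKey(seed), (latent_dim,))
    opt = optax.adam(lr)
    opt_state = opt.init(z)
    loss_and_grad = jax.value_and_grad(objective)

    for _ in range(steps):
        loss, g = loss_and_grad(z)               # chain rule through decoder + solver
        updates, opt_state = opt.update(g, opt_state)
        z = optax.apply_updates(z, updates)
    return decode(z), z
```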
4. Comparative Performance and Empirical Evaluation
LD-DIM has been empirically evaluated on representative PDE-constrained inverse problems, including:
- Random field and bimaterial subsurface conductivity reconstruction: Outperforms PINNs and VAE-based methods in both reconstruction error and the structural similarity index (SSIM), e.g., reducing error by up to 30× and raising SSIM from near zero to over 0.9 compared to PINNs.
- Sensitivity and seed robustness: Multiple optimization runs from different seeds yield consistent reconstructions with only minor shifts at sharp interfaces, demonstrating robustness to initialization.
- Observation density: As the number of observed grid points increases, reconstruction accuracy improves and variance decreases; even with very sparse observations, LD-DIM captures the large-scale structure (Lin et al., 27 Dec 2025).
Similarly, for canonical linear inverse tasks (inpainting, super-resolution), latent-space LD-DIM posterior sampling outperforms pixel-space diffusion approaches (DPS, DDRM) across PSNR, SSIM, and LPIPS (Rout et al., 2023).
5. Extensions and Applications in Physics-Constrained and Blind Inverse Problems
LD-DIM generalizes to various modalities and physical constraints:
- Joint generative latent spaces for multimodal PDE problems: In full waveform inversion (FWI), a shared latent codebook is used for both seismic and velocity fields, with the diffusion process enforcing consistency with the governing PDE. The learned prior acts as an implicit physics regularizer, and denoising steps refine solutions toward physical feasibility (Wang et al., 2024).
- Blind inverse problems (unknown forward operator): LD-DIM extends to an alternating expectation-maximization (EM) framework: the E-step samples from the latent posterior using prior and likelihood gradients, while the M-step updates the forward-operator parameters. Annealing and skip-gradient acceleration techniques improve robustness and efficiency of the EM loop (Bai et al., 2024); a simplified sketch of this alternating structure follows below.
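The sketch below is a rough illustration only, not the algorithm of Bai et al. (2024): `decode` and `forward` are hypothetical stand-ins for the decoder and the parameterized forward operator, `theta` is assumed to be a flat array of operator parameters, posterior sampling in the E-step is abbreviated to deterministic gradient steps, and annealing and skip-gradient acceleration are omitted.

```python
import jax
import jax.numpy as jnp

def em_blind_inversion(y, decode, forward, theta0, latent_dim,
                       outer_iters=20, e_steps=50, m_steps=10,
                       lr_z=1e-2, lr_theta=1e-2, sigma=0.1, seed=0):
    """Alternate between refining the latent z (E-step) and the unknown
    forward-operator parameters theta (M-step)."""

    def neg_log_post(z, theta):
        m_hat = decode(z)                              # candidate field from the prior manifold
        resid = forward(theta, m_hat) - y              # data misfit under operator theta
        return jnp.sum(resid ** 2) / (2.0 * sigma ** 2) + 0.5 * jnp.sum(z ** 2)

    z = 0.1 * jax.random.normal(jax.random.PRNGKey(seed), (latent_dim,))
    theta = theta0
    grad_z = jax.grad(neg_log_post, argnums=0)
    grad_theta = jax.grad(neg_log_post, argnums=1)

    for _ in range(outer_iters):
        for _ in range(e_steps):                       # E-step (sampling abbreviated to
            z = z - lr_z * grad_z(z, theta)            # deterministic gradient descent)
        for _ in range(m_steps):                       # M-step: refit operator parameters
            theta = theta - lr_theta * grad_theta(z, theta)
    return decode(z), theta
```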
LD-DIM thus unifies Bayesian inference with deep generative priors and physics-based simulation, enabling direct integration of uncertainty quantification, Bayesian posterior sampling, and full probabilistic inversion, as well as scalability to 3D and time-dependent problems.
6. Algorithmic Workflow and Computational Considerations
The LD-DIM workflow can be summarized as follows (Lin et al., 27 Dec 2025, Rout et al., 2023):
| Step | Operation | Purpose |
|---|---|---|
| 1. Train VAE | Pretrain encoder/decoder on the field dataset via $\mathcal{L}_{\mathrm{VAE}}$ | Learn low-dimensional manifold |
| 2. Train LDM | Fix the VAE; train the latent-space U-Net with $\mathcal{L}_{\mathrm{LDM}}$ | Model prior over plausible fields |
| 3. Initialize $z$ | Random initialization of the latent code | Starting point for optimization |
| 4. Optimization loop | Decode $\hat m = D_\theta(z)$; solve $A(\hat m)\,u = b$; compute $J(z)$ and $\partial J/\partial u$; solve the adjoint system for $\lambda$; update $z$ (Adam) | Minimize loss in latent space |
| 5. Final reconstruction | Output $\hat m^{\star} = D_\theta(z^{\star})$ | Recovered field |
Computationally, VAE and diffusion-model pretraining requires moderate GPU time (hours for typical datasets); each inversion involves a few hundred gradient steps, with per-step cost dominated by the sparse PDE solve and the decoder pass. The method is inherently scalable and compatible with modern autodiff frameworks (Lin et al., 27 Dec 2025).
7. Limitations, Practical Issues, and Future Directions
While LD-DIM exhibits robust numerical conditioning and empirical performance, certain limitations persist:
- Data sparsity: Large-scale structure is reconstructable from minimal measurements, but recovery of fine-scale features necessitates a higher observation density (Lin et al., 27 Dec 2025).
- Computational overhead: Cost arises mainly from the PDE solver and diffusion-prior evaluations, but remains manageable for moderate problem sizes.
- Extensions: Natural generalizations include application to time-dependent and multiphase PDEs, fully 3D domains, and integration with explicit noise models or Bayesian sampling in the latent posterior (Lin et al., 27 Dec 2025, Wang et al., 2024, Bai et al., 2024).
Recent studies suggest that latent diffusion priors not only regularize ill-posed inverse problems but, when coupled with proper physical constraints, recover solutions that preserve fine structure and physical realism, with credible uncertainty quantification (Rout et al., 2023, Wang et al., 2024). The framework offers a principled and scalable toolset for high-dimensional scientific inversion under data scarcity and modeling uncertainty.