Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation (2507.02608v3)

Published 3 Jul 2025 in cs.LG and physics.flu-dyn

Abstract: The steep computational cost of diffusion models at inference hinders their use as fast physics emulators. In the context of image and video generation, this computational drawback has been addressed by generating in the latent space of an autoencoder instead of the pixel space. In this work, we investigate whether a similar strategy can be effectively applied to the emulation of dynamical systems and at what cost. We find that the accuracy of latent-space emulation is surprisingly robust to a wide range of compression rates (up to 1000x). We also show that diffusion-based emulators are consistently more accurate than non-generative counterparts and compensate for uncertainty in their predictions with greater diversity. Finally, we cover practical design choices, spanning from architectures to optimizers, that we found critical to train latent-space emulators.

Summary

The paper shows that latent diffusion models with autoencoders effectively emulate dynamical systems even under high compression levels.
The methodology compresses high-dimensional physical states into a latent space, which reduces the cost of running the diffusion model at inference.
Experimental results indicate that the latent diffusion approach is more accurate than deterministic (non-generative) emulators and produces more diverse predictions.

The paper "Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation" (2507.02608) explores the potential of latent diffusion models (LDMs) for emulating dynamical systems. It shows that such models, which operate in the latent space of an autoencoder, remain accurate across a wide range of compression rates and are consistently more accurate than non-generative counterparts, while being cheaper at inference than diffusion in the original state space.

Core Methodology

The investigation centers on using LDMs to emulate physical systems governed by partial differential equations. The key idea is to compress high-dimensional physical states with an autoencoder into a lower-dimensional latent space, where the diffusion model then operates.
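The encode, emulate, decode pattern described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the shapes, the random orthonormal projection standing in for the autoencoder, and the linear `latent_step` standing in for the diffusion model are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
STATE_SHAPE = (4, 64, 64)              # fields x height x width
STATE_DIM = int(np.prod(STATE_SHAPE))
LATENT_DIM = 16

# Stand-in autoencoder: projection onto a random orthonormal basis.
# A real autoencoder would be a trained neural network.
basis, _ = np.linalg.qr(rng.standard_normal((STATE_DIM, LATENT_DIM)))

def encode(state):
    """Map a physical state to its latent representation."""
    return basis.T @ state.ravel()

def decode(latent):
    """Map a latent representation back to a physical state."""
    return (basis @ latent).reshape(STATE_SHAPE)

# Stand-in latent emulator: a fixed contraction. In the setting described
# above, a diffusion model would sample the next latent state instead.
transition = 0.9 * np.eye(LATENT_DIM)

def latent_step(latent):
    return transition @ latent

def rollout(initial_state, n_steps):
    """Encode once, step forward in latent space, decode every frame."""
    latent = encode(initial_state)
    frames = []
    for _ in range(n_steps):
        latent = latent_step(latent)
        frames.append(decode(latent))
    return np.stack(frames)

trajectory = rollout(rng.standard_normal(STATE_SHAPE), n_steps=5)
print(trajectory.shape)  # (5, 4, 64, 64)
```

The point of the design is that the expensive model only ever sees `LATENT_DIM` values per step; the full-resolution state appears only at encoding and decoding time.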

Diffusion Models in Latent Spaces

Diffusion models are generative: they produce samples from complex distributions by reversing a noising process defined over continuous time, typically formulated as an SDE. This work adopts the rectified flow noise schedule and trains a denoiser with an $L_2$ objective through denoising score matching. Latent diffusion is already established in image and video generation; the question studied here is whether running the diffusion model in the latent space of an autoencoder works as well for physics emulation. The approach is computationally attractive because it reduces the dimensionality of the data the diffusion model must handle.
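As a rough illustration of a rectified flow schedule with an $L_2$ denoising objective, the sketch below assumes the common convention $x_t = (1 - t)\,x + t\,z$ with Gaussian noise $z$, and samples with a plain Euler discretization of the corresponding deterministic flow. The function names, the sampler, and the oracle denoiser are assumptions made for this example; the paper's exact parameterization and sampler may differ.

```python
import numpy as np

def noisy_sample(x, z, t):
    """Rectified-flow interpolation: data x at t=0, pure noise z at t=1."""
    return (1.0 - t) * x + t * z

def denoising_loss(denoiser, x, z, t):
    """Mean squared L2 error between the denoiser output and the clean x."""
    x_t = noisy_sample(x, z, t)
    return float(np.mean((denoiser(x_t, t) - x) ** 2))

def euler_sample(denoiser, z, n_steps=16):
    """Integrate from t=1 (noise) down to t=0 (data) with Euler steps."""
    times = np.linspace(1.0, 0.0, n_steps + 1)
    x_t = z
    for t, s in zip(times[:-1], times[1:]):
        x_hat = denoiser(x_t, t)         # current estimate of the clean sample
        velocity = (x_t - x_hat) / t     # d x_t / d t implied by x_hat (t > 0)
        x_t = x_t + (s - t) * velocity
    return x_t

# Toy check: if the data distribution is a single point, the ideal denoiser
# always returns that point, and sampling should recover it.
target = np.array([1.0, -2.0, 0.5])

def oracle(x_t, t):
    return np.broadcast_to(target, x_t.shape).copy()

rng = np.random.default_rng(0)
noise = rng.standard_normal(target.shape)
sample = euler_sample(oracle, noise)
```

In a real emulator the denoiser would be a neural network conditioned on previous latent states, and the loss would be averaged over random draws of `x`, `z`, and `t`.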

Figure 1: An illustration of the latent-space emulation process showing diffusion models predicting subsequent latent states.

Experimental Setup

Three fluid dynamics datasets are used for training and evaluation: Euler multi-quadrants, Rayleigh-Bénard convection, and Turbulence Gravity Cooling (TGC). The experiments vary the compression rate of the autoencoder and measure its effect on emulation. Latent-space emulators perform well even at high compression rates (up to $\div 1024$, i.e. a roughly 1000x reduction).
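To make the compression rate concrete, the sketch below computes it as the ratio between the number of values in a physical state and in its latent representation. Both the definition and the shapes are assumptions chosen so that the ratio comes out to 1024; they are illustrative and not taken from the paper's configuration tables.

```python
import math

def compression_rate(state_shape, latent_shape):
    """Number of values in the state divided by the number in the latent."""
    return math.prod(state_shape) / math.prod(latent_shape)

# Hypothetical shapes: 4 fields on a 512x128 grid, compressed to
# 4 latent channels on a 16x4 grid (32x downsampling per spatial axis).
state_shape = (4, 512, 128)
latent_shape = (4, 16, 4)

rate = compression_rate(state_shape, latent_shape)
print(rate)  # 1024.0
```

At this rate the diffusion model handles 256 values per state instead of 262,144.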

Results and Insights

As expected, the autoencoder's reconstruction quality deteriorates significantly at high compression rates. Surprisingly, this degradation does not carry over proportionally to the quality of the dynamics emulated by the LDMs.

Robustness to Compression

The experiments show that the emulation accuracy of latent-space diffusion models is largely insensitive to increasing compression rates. One interpretation is that the autoencoder discards detail that matters little for the dynamics while retaining the information the emulator needs.

Figure 2: VRMSE comparisons showing how emulation accuracy stays robust across compression rates.
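For readers unfamiliar with the metric, the sketch below assumes that VRMSE denotes the variance-normalized root mean squared error, i.e. the RMSE divided by the standard deviation of the target field. The small `eps` regularizer and the function signature are assumptions made for this example.

```python
import numpy as np

def vrmse(target, prediction, eps=1e-6):
    """Variance-normalized RMSE: sqrt(MSE / (Var[target] + eps))."""
    mse = np.mean((target - prediction) ** 2)
    variance = np.var(target)
    return float(np.sqrt(mse / (variance + eps)))

rng = np.random.default_rng(0)
field = rng.standard_normal((64, 64))

perfect = vrmse(field, field)                              # exact prediction
mean_only = vrmse(field, np.full_like(field, field.mean()))  # constant guess
```

Under this definition a perfect prediction scores 0 and predicting the spatial mean of the target scores approximately 1, which makes values comparable across fields with different scales.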

Comparative Performance

Latent diffusion models produce more diverse and statistically plausible outcomes than deterministic (non-generative) emulators. This difference highlights the advantage of generative models in settings with inherent uncertainty, where several outcomes are possible and a single deterministic prediction cannot represent them.
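One common way to quantify this kind of diversity is to draw several samples from the emulator and compare the ensemble spread with the error of the ensemble mean. The sketch below is a generic illustration with synthetic data; the function name, the noise level, and the choice of statistics are assumptions and not necessarily the diagnostics used in the paper.

```python
import numpy as np

def skill_and_spread(members, truth):
    """members: (n_members, ...) ensemble of predictions; truth: (...)."""
    ensemble_mean = members.mean(axis=0)
    # Skill: RMSE of the ensemble mean against the reference.
    skill = float(np.sqrt(np.mean((ensemble_mean - truth) ** 2)))
    # Spread: root of the average per-point ensemble variance.
    spread = float(np.sqrt(np.mean(members.var(axis=0, ddof=1))))
    return skill, spread

rng = np.random.default_rng(0)
truth = rng.standard_normal((32, 32))

# Hypothetical stochastic emulator: members scatter around the truth.
stochastic = truth + 0.3 * rng.standard_normal((8, 32, 32))
# Deterministic emulator: every "member" is the same biased prediction.
deterministic = np.repeat((truth + 0.3)[None], 8, axis=0)

stochastic_skill, stochastic_spread = skill_and_spread(stochastic, truth)
deterministic_skill, deterministic_spread = skill_and_spread(deterministic, truth)
```

A deterministic emulator has essentially zero spread regardless of how uncertain the prediction is, whereas a well-calibrated generative emulator has a spread that grows with its error.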

Figure 3: Examples of latent-space emulation maintaining accuracy despite large compression rates.

Implications for Physics Emulation

These results support the feasibility of emulating physics in latent space, especially in combination with generative approaches such as diffusion models. The advantage lies not only in computational efficiency but also in improved accuracy and long-term stability relative to deterministic emulators.

Future Directions

Further research could expand the set of datasets to test whether these findings generalize to a broader range of physical phenomena. Additionally, refining LDM architectures and exploring temporal and spatial attention mechanisms may yield more robust emulation frameworks.

Conclusion

The paper offers a compelling case for latent diffusion models in physics emulation, effectively demonstrating that these models can maintain high performance even under substantial compression. As such, the integration of LDMs presents a promising avenue for efficient and accurate simulations in scientific and engineering contexts.