DiffMD: Denoising Diffusion Models
- DiffMD are generative models that iteratively remove known noise from input data, yielding robust representation learning and high-quality sample generation.
- They couple a fixed forward Gaussian noising process with a parameterized reverse Markov process using neural networks to stably learn uncertainty and data distributions.
- DiffMD extend to diverse domains such as fluid dynamics, imaging, and molecular simulations, offering benefits like improved uncertainty quantification and flexible conditioning.
Denoising Diffusion Models (DiffMD) are a family of generative and predictive models based on iterative, probabilistically grounded denoising. These models stochastically perturb an input (e.g., a clean image, fluid field, or molecular structure) using a known (often Gaussian) noise process, and train neural networks to reverse the degradation, thereby learning a rich representation of the underlying data manifold. Such models have established new standards for sample quality, uncertainty quantification, and application flexibility across a growing array of scientific and engineering domains.
1. Core Mathematical Formulation
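Denoising diffusion models construct generative models by coupling a parameterized reverse Markov process with a fixed forward noising process. Given observed data $x_0 \sim q(x_0)$, the forward chain is typically defined as:
- Forward (noising) process:
$$q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big), \qquad t = 1, \dots, T,$$
with a cumulative schedule $\alpha_t = 1 - \beta_t$, $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$, resulting in the closed-form marginal
$$q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\big),$$
as utilized by FluidDiff for spatiotemporal prediction (Yang et al., 2023), and in most canonical image models.
- Reverse (denoising) process:
$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\big),$$
where $\mu_\theta$ is predicted via a neural network from the noisy state, diffusion step, and optional context (e.g., initial condition, time, or conditioning variables). The canonical mean is related to noise-prediction:
$$\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)\right),$$
as in (Yang et al., 2023).
DiffMD are trained via simplified evidence lower bound (ELBO) objectives that, for Gaussian cases, reduce to
$$L_{\text{simple}} = \mathbb{E}_{x_0,\,\epsilon,\,t}\left[\big\|\epsilon - \epsilon_\theta\big(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\ t\big)\big\|^2\right], \qquad \epsilon \sim \mathcal{N}(0, I).$$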
A variety of architectures (e.g., U-Nets, transformer-augmented U-Nets) and noise schedules are used depending on application (Yang et al., 2023, Zhang et al., 2023, Permenter et al., 2023).
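The forward marginal and the noise-prediction objective can be sketched in a few lines of NumPy. This is a minimal illustration rather than any paper's implementation: the linear beta schedule, the function names (`make_schedule`, `q_sample`, `simple_loss`), and the placeholder `zero_model` are all assumptions introduced here, and the step index is 0-based, so index `t` corresponds to diffusion step `t + 1`.

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed, commonly used default)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product: alpha_bar_t
    return betas, alphas, alpha_bars

def q_sample(x0, t, alpha_bars, eps):
    """Draw x_t from the closed-form marginal q(x_t | x_0) for given noise eps."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def simple_loss(eps_model, x0, t, alpha_bars, rng):
    """One-sample Monte Carlo estimate of the noise-prediction MSE at step t."""
    eps = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bars, eps)
    return np.mean((eps - eps_model(x_t, t)) ** 2)

rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule()
x0 = rng.standard_normal((4, 8))  # stand-in for a batch of clean data

# Hypothetical stand-in for a trained network: always predicts zero noise.
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = simple_loss(zero_model, x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

In practice `eps_model` would be a U-Net or similar network, and `t` would be drawn uniformly at random for every training example.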
2. Sampling and Inference Algorithms
DiffMD sampling inverts the forward process using either stochastic (Langevin-style) or deterministic (DDIM-style) updates:
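- Stochastic (ancestral) sampling: Iteratively samples $x_{t-1}$ from the learned posterior given $x_t$ and random noise, as in
$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t z$$
for $z \sim \mathcal{N}(0, I)$, $\sigma_t$ specified by the scheduler (Yang et al., 2023).
- Deterministic (DDIM-style) sampling: Drops the injected noise, giving the update
$$x_{t-1} = \sqrt{\bar\alpha_{t-1}}\,\hat{x}_0 + \sqrt{1-\bar\alpha_{t-1}}\,\epsilon_\theta(x_t, t), \qquad \hat{x}_0 = \frac{x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar\alpha_t}},$$
and can be recast as a discretized ODE, as in (Zhang et al., 2023). This is further enhanced by the quarter-circular reparameterization, which improves numerical stability and enables high-order solvers.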
- Adaptive and hybrid strategies: Techniques such as dual-output heads (simultaneously estimating both signal and noise), dynamic gating, and realignment over reverse steps have been shown to increase sampling efficiency and fidelity (Zhang et al., 2023, Benny et al., 2022, Manujith et al., 28 Jan 2026).
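The two basic update rules can be compared on a toy problem. The sketch below assumes the standard DDPM/DDIM forms of the updates, the choice $\sigma_t^2 = \beta_t$ for the ancestral step, and a hypothetical oracle noise predictor (`oracle_eps`) for a data distribution concentrated on a single point `x_star`; none of these names come from the cited papers. With an exact noise predictor, both samplers should land on `x_star`.

```python
import numpy as np

def predict_x0(x_t, eps_hat, alpha_bar_t):
    """Invert the forward marginal to estimate the clean sample."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)

def ancestral_step(x_t, eps_hat, t, betas, alpha_bars, rng):
    """Stochastic step: posterior mean plus sigma_t * z, with sigma_t^2 = beta_t (assumed)."""
    alpha_t = 1.0 - betas[t]
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alpha_t)
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def ddim_step(x_t, eps_hat, t, alpha_bars):
    """Deterministic step: re-noise the x0 estimate to the previous noise level."""
    x0_hat = predict_x0(x_t, eps_hat, alpha_bars[t])
    if t == 0:
        return x0_hat
    ab_prev = alpha_bars[t - 1]
    return np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat

T = 50
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)
x_star = np.array([1.0, -2.0, 0.5])  # toy "dataset": a single point

def oracle_eps(x_t, t):
    """Exact noise for point-mass data; stands in for a trained network."""
    return (x_t - np.sqrt(alpha_bars[t]) * x_star) / np.sqrt(1.0 - alpha_bars[t])

rng = np.random.default_rng(0)
x_init = rng.standard_normal(3)
x_ddim, x_anc = x_init.copy(), x_init.copy()
for t in reversed(range(T)):  # 0-based step index
    x_ddim = ddim_step(x_ddim, oracle_eps(x_ddim, t), t, alpha_bars)
    x_anc = ancestral_step(x_anc, oracle_eps(x_anc, t), t, betas, alpha_bars, rng)
```

The deterministic variant is the one that admits larger step sizes and higher-order solvers, since each step is a function of the current state only.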
3. Extensions to Complex Domains and Advanced Variants
Denoising diffusion frameworks have been extended well beyond the conventional Euclidean, image-based generative setting:
- Physical fields and PDEs: FluidDiff predicts nonlinear fluid fields from high-dimensional simulation data, learning the conditional dynamics without explicit physics priors. Its neural architecture uses U-Net blocks enhanced with time embeddings, self-attention, and explicit conditioning to forecast flow states. The model outperformed non-physics-informed neural baselines in short-term velocity-field prediction and generalized well to unseen initial conditions (Yang et al., 2023).
- Post-training quantization: AccuQuant addresses quantization in diffusion models by simulating error accumulation over multiple denoising steps and introducing grouped-step calibration, reducing memory complexity from $\mathcal{O}(n)$ to $\mathcal{O}(1)$, where $n$ is the number of denoising steps. It achieves state-of-the-art FID-to-full-precision scores under low-bit quantization by explicitly aligning denoiser output distributions over multi-step groups (Lee et al., 23 Oct 2025).
- Optimization and projection perspectives: Diffusion model denoisers can be interpreted as projection operators under the manifold hypothesis, with deterministic sampling resembling inexact gradient descent on the squared distance to the data manifold. Two-point gradient estimation samplers exploit this, yielding significant FID gains at low step counts relative to DDIM and related fast samplers (Permenter et al., 2023).
- Domain-specific modifications: Linear interpolation between clean and real noisy images replaces classical Gaussian forward noising for robust real-world denoising (Yang et al., 2023). Patch masking (Masked Diffusion) replaces additive noise for self-supervised representation learning, enhancing downstream performance in segmentation tasks (Pan et al., 2023).
- Parameterization and stability improvements: Quarter-circular reparameterization (using $\sqrt{\bar\alpha} = \cos(\eta)$, so that $\sqrt{1-\bar\alpha} = \sin(\eta)$) eliminates endpoint singularities and facilitates the deployment of high-order ODE solvers for faster and more stable sampling (Zhang et al., 2023).
- Inference-time realignment: DeRaDiff provides a mechanism for continuous control of preference/KL-regularization strength during sampling via geometric mixtures of per-step posteriors, removing the need for multiple retrainings for hyperparameter sweeps (Manujith et al., 28 Jan 2026).
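The quarter-circular parameterization mentioned above can be illustrated with a short sketch. It assumes the angle is defined by $\sqrt{\bar\alpha} = \cos(\eta)$, so the noisy state is $x_\eta = \cos(\eta)\,x_0 + \sin(\eta)\,\epsilon$ for $\eta \in [0, \pi/2]$. The function names are hypothetical, and `rotate_step` is a plain first-order update written for illustration, not the solver used by the authors.

```python
import numpy as np

def angle_from_alpha_bar(alpha_bar):
    """Map a cumulative schedule value to the angle eta with cos(eta) = sqrt(alpha_bar)."""
    return np.arccos(np.sqrt(alpha_bar))

def mix(x0, eps, eta):
    """Noisy state on the quarter-circular arc between image (eta=0) and noise (eta=pi/2)."""
    return np.cos(eta) * x0 + np.sin(eta) * eps

def rotate_step(x0_hat, eps_hat, eta_next):
    """Move to angle eta_next using both estimates; no division by cos or sin is needed."""
    return np.cos(eta_next) * x0_hat + np.sin(eta_next) * eps_hat

rng = np.random.default_rng(0)
x0 = rng.standard_normal(5)
eps = rng.standard_normal(5)

etas = np.linspace(np.pi / 2, 0.0, 11)  # from pure noise down to clean data
x = mix(x0, eps, etas[0])
for eta_next in etas[1:]:
    # With exact (oracle) estimates of both signal and noise, each step stays on the arc.
    x = rotate_step(x0, eps, eta_next)

# Recovering x0 from a noise estimate alone divides by cos(eta), which vanishes at eta = pi/2.
blowup = 1.0 / np.cos(np.pi / 2 - 1e-6)
```

The last line shows why a noise-only head is ill-conditioned near the pure-noise endpoint, which is the motivation for predicting signal and noise together.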
4. Architectural and Algorithmic Considerations
Architectures for DiffMD are highly domain-adapted:
- U-Net-based backbones: FluidDiff employs a four-scale U-Net whose blocks combine residual connections, group normalization, SiLU activations, and Transformer-style self-attention. Time is encoded via sinusoidal positional embeddings projected through MLPs (Yang et al., 2023).
- Expanded conditioning: Inputs may include not only the noised signal but also physically meaningful conditioning maps (e.g., initial field, target time, auxiliary feature buffers).
- Advanced embedding and ensembling: To permit real data with arbitrary noise models as input, methods like DMID introduce adaptive embeddings (e.g., VAE to match real noise to AWGN) and adaptive ensembling to balance perceptual quality and distortion (Li et al., 2023).
- Self-supervised and masked objectives: Training losses extend beyond MSE to robust alternatives such as the Charbonnier loss or structural similarity index (SSIM), especially in regimes where fine structural recovery is necessary (Yang et al., 2023, Pan et al., 2023).
- Hybrid output heads: Dual- (or multi-) output heads are deployed to predict both signal and noise for improved stability within the reverse chain, especially when using ODE-style inference (Zhang et al., 2023).
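Two of the ingredients listed above are simple enough to write out: sinusoidal step embeddings and the Charbonnier loss. The sketch below uses the standard Transformer-style embedding formula and the usual Charbonnier definition $\sqrt{d^2 + \varepsilon^2}$. The embedding width, `max_period`, and the value of `eps` are assumed defaults, not settings reported by the cited papers.

```python
import numpy as np

def sinusoidal_embedding(t, dim, max_period=10000.0):
    """Embed diffusion steps t (scalar or 1-D array) into `dim` features (dim must be even)."""
    t = np.atleast_1d(np.asarray(t, dtype=np.float64))
    half = dim // 2
    # Geometrically spaced frequencies from 1 down to roughly 1/max_period.
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    args = t[:, None] * freqs[None, :]
    return np.concatenate([np.sin(args), np.cos(args)], axis=1)

def charbonnier_loss(pred, target, eps=1e-3):
    """Smooth L1-like loss: mean of sqrt(diff^2 + eps^2)."""
    return np.mean(np.sqrt((pred - target) ** 2 + eps ** 2))

emb = sinusoidal_embedding(np.array([0, 10, 500]), dim=16)

target = np.zeros((4, 4))
loss_small = charbonnier_loss(target, target)         # equals eps when pred == target
loss_large = charbonnier_loss(target + 5.0, target)   # close to |diff| for large errors
```

In a full model the embedding would be passed through an MLP and injected into each U-Net block; the Charbonnier loss behaves like MSE near zero and like L1 for large residuals, which makes it less sensitive to outliers.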
5. Evaluation, Empirical Results, and Applications
DiffMD have achieved strong empirical results across multiple benchmarks and tasks:
- FluidDiff for CFD: On fluid velocity-field prediction, FluidDiff achieved MAE = 0.1975 and RMSE = 0.3137, outperforming cGAN and pure U-Net models in short-term prediction and generalization (Yang et al., 2023).
- Quantized models: AccuQuant reduced FID2FP32 from 35.2 to 3.3 on CIFAR-10 at 6/6-bit and from 14.4 to 11.0 on text-to-image generation settings (Lee et al., 23 Oct 2025).
- Fast high-fidelity samplers: Gradient-estimation sampler achieved FID of 3.9 on CIFAR-10 with 10 steps (vs. DDIM 16.9), and 4.3 on CelebA (vs. DDIM 18.1) (Permenter et al., 2023).
- Robust real-world denoising: Linear-interpolation-based diffusion and SSIM/Charbonnier-trained models, even on simple CNN U-Nets, rivaled or exceeded Transformer architectures across SIDD and DND benchmarks, with PSNR/SSIM competitive with strong SOTA (Yang et al., 2023).
- Medical and scientific domains: Self-supervised DiffMD (e.g., DDM²) for diffusion MRI restored high-frequency anatomical detail and achieved +3.2 SNR gain over Patch2Self, operating with as few as n=1–2 prior volumes (Xiang et al., 2023).
- Accelerated computation: DiffMD sampling typically requires $T$ network evaluations per frame, one per reverse step; while slower than one-shot models, inference is still orders of magnitude cheaper than full PDE/CFD solvers.
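For reference, the error metrics quoted in this section (MAE, RMSE, PSNR) can be computed as follows. This is a generic sketch using the usual definitions; the `data_range` default and the synthetic arrays are assumptions, and the numbers it produces are unrelated to the reported results.

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error."""
    return np.mean(np.abs(pred - target))

def rmse(pred, target):
    """Root mean squared error."""
    return np.sqrt(np.mean((pred - target) ** 2))

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB for signals with the given dynamic range."""
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
target = rng.random((8, 8))
pred = target + 0.1  # constant offset, so MAE and RMSE coincide

metrics = {"mae": mae(pred, target), "rmse": rmse(pred, target), "psnr": psnr(pred, target)}
```

RMSE is never smaller than MAE, and the gap between them grows when errors are concentrated in a few locations, which is why both are commonly reported for field prediction.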
6. Benefits, Limitations, and Prospective Developments
Benefits:
- Training is highly stable without adversarial saddle-points or mode collapse (Yang et al., 2023).
- Quantitative uncertainty is intrinsic to the posterior modeling (Yang et al., 2023, Permenter et al., 2023, Pan et al., 2023).
- Model architectures and conditioning mechanisms are modular and flexibly adapted to diverse domains, including fluids, imaging, molecular dynamics, and beyond.
Limitations:
- Inference speed is limited by the need for hundreds of network passes unless replaced by distilled, ODE, or hybrid samplers (Yang et al., 2023, Zhang et al., 2023).
- Long-horizon predictions degrade in accuracy due to the lack of physical constraints and compounding errors, motivating the integration of physics-informed projections or operators (Yang et al., 2023).
- Direct applicability to real-world or arbitrary noise, or different data manifolds, sometimes requires adaptation of the forward process or advanced embedding, e.g., noise modeling, masking, or VAE-projected inputs (Yang et al., 2023, Li et al., 2023).
Prospective directions:
- Physics-informed guidance and projections for constrained domains, e.g., divergence-free projection in fluid simulation (Yang et al., 2023).
- Faster or more efficient sampling via DDIM, high-order ODE solvers, and advanced quantization (Zhang et al., 2023, Lee et al., 23 Oct 2025).
- Application to multi-scale and spatiotemporal PDE systems, spectral operators, and larger-scale molecular or physical ensembles (Yang et al., 2023, Wu et al., 2022).
- Unification and principled generalization to arbitrary state spaces (Markov, continuous, discrete, manifold) within the denoising diffusion/score-matching paradigm (Benton et al., 2022).
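As an illustration of the physics-informed projection idea listed above, the following sketch removes the divergent part of a 2D velocity field with a spectral Helmholtz (Leray) projection. It assumes a periodic domain with unit grid spacing, and the function names are hypothetical. It shows what such a projection could look like when applied to a sampled field; it is not a method from the cited work.

```python
import numpy as np

def _wavenumbers(shape):
    """Angular wavenumbers for a periodic grid with unit spacing."""
    ny, nx = shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx)[None, :]
    ky = 2.0 * np.pi * np.fft.fftfreq(ny)[:, None]
    return kx, ky

def project_divergence_free(u, v):
    """Subtract the gradient (curl-free) component of (u, v) in Fourier space."""
    kx, ky = _wavenumbers(u.shape)
    k2 = kx ** 2 + ky ** 2
    k2[0, 0] = 1.0  # avoid 0/0; the mean flow has no divergence and is left untouched
    u_hat, v_hat = np.fft.fft2(u), np.fft.fft2(v)
    k_dot_u = kx * u_hat + ky * v_hat
    u_hat = u_hat - kx * k_dot_u / k2
    v_hat = v_hat - ky * k_dot_u / k2
    return np.fft.ifft2(u_hat).real, np.fft.ifft2(v_hat).real

def spectral_divergence(u, v):
    """Divergence du/dx + dv/dy computed with spectral derivatives."""
    kx, ky = _wavenumbers(u.shape)
    div_hat = 1j * (kx * np.fft.fft2(u) + ky * np.fft.fft2(v))
    return np.fft.ifft2(div_hat).real

rng = np.random.default_rng(0)
n = 33  # odd grid size avoids the ambiguous Nyquist mode
u = rng.standard_normal((n, n))  # stand-in for a sampled velocity field
v = rng.standard_normal((n, n))

u_p, v_p = project_divergence_free(u, v)
div_before = np.max(np.abs(spectral_divergence(u, v)))
div_after = np.max(np.abs(spectral_divergence(u_p, v_p)))
```

Such a projection could be applied once to the final sample or after every reverse step; which placement works better for long-horizon prediction is an open design question.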
References:
- "A Denoising Diffusion Model for Fluid Field Prediction" (Yang et al., 2023)
- "AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models" (Lee et al., 23 Oct 2025)
- "Interpreting and Improving Diffusion Models from an Optimization Perspective" (Permenter et al., 2023)
- "Real-World Denoising via Diffusion Model" (Yang et al., 2023)
- "Masked Diffusion as Self-supervised Representation Learner" (Pan et al., 2023)
- "Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise" (Zhang et al., 2023)
- "DeRaDiff: Denoising Time Realignment of Diffusion Models" (Manujith et al., 28 Jan 2026)
- "Denoising Monte Carlo Renders with Diffusion Models" (Vavilala et al., 2024)
- "DiffMD: A Geometric Diffusion Model for Molecular Dynamics Simulations" (Wu et al., 2022)
- "DDM²: Self-Supervised Diffusion MRI Denoising with Generative Diffusion Models" (Xiang et al., 2023)
- "To smooth a cloud or to pin it down: Guarantees and Insights on Score Matching in Denoising Diffusion Models" (Vargas et al., 2023)
- "Denoising Diffusion Samplers" (Vargas et al., 2023)
- "Stimulating Diffusion Model for Image Denoising via Adaptive Embedding and Ensembling" (Li et al., 2023)
- "From Denoising Diffusions to Denoising Markov Models" (Benton et al., 2022)
- "Diffusion Model for Generative Image Denoising" (Xie et al., 2023)
- "Dynamic Dual-Output Diffusion Models" (Benny et al., 2022)