
gDDIM: Generalized Denoising Diffusion Implicit Models

Updated 12 February 2026
  • gDDIM is a generalized framework for denoising diffusion models that adapts integration schemes to arbitrary linear processes for accelerated sampling.
  • It employs a reparameterized score function and exponential integrators to produce deterministic or controlled stochastic sample paths.
  • gDDIM preserves forward process marginals while offering a tunable trade-off between sample diversity and deterministic quality in generative modeling.

Generalized Denoising Diffusion Implicit Models (gDDIM) are a flexible class of accelerated generative samplers extending the Denoising Diffusion Implicit Model (DDIM) framework to cover arbitrary linear, continuous-time diffusion processes. While DDIMs yield deterministic or low-stochasticity sample trajectories for isotropic (homogeneous) diffusions, gDDIM enables exact or approximate fast sampling for general, including non-isotropic, diffusions by adapting the parameterization and numerical integration schemes. This approach achieves accelerated high-fidelity generative modeling in settings where traditional DDIM methods are inapplicable, such as blurring diffusion and critically damped Langevin systems, while also providing a principled trade-off between diversity and deterministic sample quality (Zhang et al., 2022, Han, 2024, Sheng et al., 12 Oct 2025).

1. Core Principles and Problem Setting

A denoising diffusion model consists of a forward noising process, typically a stochastic differential equation (SDE) dx_t = f_t(x_t) dt + g_t dW_t, that transforms data x_0 into highly noisy samples x_T (e.g., standard normal). The generative procedure requires simulating a reverse (typically intractable) SDE that starts from noise and recovers data. DDIM accelerated inference for isotropic models by replacing the stochastic reverse process with a deterministic ODE (the "probability-flow ODE") and providing exact one-step integration under certain conditions. However, many practical diffusions, including blurring, coupled, or non-diagonal noise models, do not satisfy the isotropy requirement. gDDIM generalizes the DDIM method to arbitrary linear diffusion models by means of a diffusion-aware score parameterization and integration scheme, enabling implicit, non-stochastic, or controlled-stochastic sample paths compatible with the physical marginals of the forward process (Zhang et al., 2022, Han, 2024).

2. Theoretical Framework and Generalization

The forward SDE for arbitrary linear diffusion models can be summarized as:

dx_t = f_t(x_t) dt + g_t dW_t

where f_t and g_t are, in general, time-dependent and possibly non-diagonal. The corresponding probability-flow ODE for sample generation is:

dx_t = \left[ f_t(x_t) - \tfrac{1}{2} g_t^2 s_\theta(x_t, t) \right] dt

with s_\theta(x_t, t) denoting a neural approximation of the score function \nabla_x \log p_t(x). For general (non-isotropic) g_t, gDDIM introduces a matrix R_t satisfying R_t R_t^T = \Sigma_t (the marginal covariance at time t), reparameterizes the score network as s_\theta(x, t) = -R_t^{-T} \epsilon_\theta(x, t), and implements implicit integration via an exponential integrator. For deterministic sampling, the update

x_{t-\Delta} = \Psi(t-\Delta, t)\, x_t + \left[ \int_{t}^{t-\Delta} \tfrac{1}{2} \Psi(t-\Delta, \tau)\, g_\tau^2 R_\tau^{-T}\, d\tau \right] \epsilon_\theta(x_t, t)

replaces the isotropic DDIM update, where \Psi is the ODE transition matrix and all terms follow from the linear SDE structure. For stochastic sampling, an additional noise term parameterized by \lambda enables continuous interpolation between deterministic (ODE) and stochastic (SDE/ancestral) regimes. This construction supports preservation of the forward process marginals for arbitrary noise levels and model structures (Zhang et al., 2022, Han, 2024, Sheng et al., 12 Oct 2025).
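As a concrete illustration (not taken from the released gDDIM code), the deterministic update above reduces to a single matrix-vector step once the transition matrix \Psi(t-\Delta, t) and the bracketed weight integral have been precomputed. In the minimal sketch below, the names gddim_step, Psi, W, and eps_theta are hypothetical placeholders assumed to be supplied by the caller:

import numpy as np

def gddim_step(x_t, t, Psi, W, eps_theta):
    """One deterministic gDDIM update from time t to t - Delta.

    Psi       : (d, d) transition matrix Psi(t - Delta, t) of the linear part.
    W         : (d, d) precomputed integral of 0.5 * Psi(t - Delta, tau) g_tau^2 R_tau^{-T}.
    eps_theta : callable (x, t) -> (d,) output of the reparameterized score network.
    """
    eps = eps_theta(x_t, t)       # single network evaluation per step
    return Psi @ x_t + W @ eps    # exponential-integrator (DDIM-style) update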

3. Algorithmic Implementation

gDDIM sampling is formulated as an explicit multi-step predictor–corrector exponential integrator. The steps are:

  1. Precompute transition matrices \Psi(t_{i-1}, t_i) and noise-scaling matrices R_{t_i} for all time steps.
  2. Execute, for each reverse timestep t_i \to t_{i-1}:
    • Predictor: Produce a preliminary x_{t_{i-1}} using a polynomial fit over past \epsilon_\theta evaluations weighted by precomputed integrals.
    • Corrector: Refine x_{t_{i-1}} using time-interpolated \epsilon_\theta values.
    • For stochastic variants, inject noise with scale matched to the desired stochasticity parameter (\lambda or \eta).
  3. Repeat until reaching x_0.
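As a rough sketch of step 1 only: for a linear drift f_t(x) = F_t x, the transition matrix satisfies d\Psi(s, t)/ds = F_s \Psi(s, t) with \Psi(t, t) = I, so if F is treated as piecewise constant over each step it reduces to a matrix exponential. The helper below assumes hypothetical user-supplied per-step drift matrices F; it is illustrative and not the quadrature used in the reference implementations.

import numpy as np
from scipy.linalg import expm

def precompute_transitions(F, times):
    """Approximate Psi(t_{i-1}, t_i) under a piecewise-constant linear drift.

    F     : list of N drift matrices; F[i-1] approximates F_t on [t_{i-1}, t_i].
    times : increasing array of N+1 timestamps t_0 < t_1 < ... < t_N.
    """
    Psis = []
    for i in range(1, len(times)):
        step = times[i - 1] - times[i]        # negative: we integrate backwards in time
        Psis.append(expm(F[i - 1] * step))    # Psi(t_{i-1}, t_i) = exp(F * (t_{i-1} - t_i))
    return Psis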

For standard score-based models:

x_T ~ N(0, Σ_T)
for i = N, ..., 1:
    x_hat       = Ψ(t_{i-1}, t_i) x_{t_i} + predictor_terms    # extrapolate from past ε_θ evaluations
    x_{t_{i-1}} = Ψ(t_{i-1}, t_i) x_{t_i} + corrector_terms    # refine using time-interpolated ε_θ values
return x_0
The stochasticity parameter (e.g., \eta or \lambda) can be scheduled or fixed, with \eta = 0 yielding DDIM, \eta = 1 recovering DDPM, and \eta > 1 providing “super-stochastic” paths (Sheng et al., 12 Oct 2025).
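For orientation, the familiar isotropic (VP / DDPM-style) special case of this \eta-interpolation can be sketched as follows; alpha_bar denotes the cumulative noise schedule \bar{\alpha}_t, eps is the network output \epsilon_\theta(x_t, t), and all helper names are illustrative rather than taken from any of the cited codebases:

import numpy as np

def eta_ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev, eta, rng=None):
    """One reverse step of the eta-interpolated DDIM family (isotropic VP case).

    eta = 0 recovers deterministic DDIM; eta = 1 recovers DDPM ancestral sampling.
    """
    if rng is None:
        rng = np.random.default_rng()
    # Clean-sample prediction implied by the noise estimate eps = eps_theta(x_t, t).
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    # Per-step noise scale; scaling by eta interpolates between ODE and SDE sampling.
    sigma = eta * np.sqrt((1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t)) \
                * np.sqrt(1.0 - alpha_bar_t / alpha_bar_prev)
    # Recombine so the marginal of x_{t-1} given x_0 matches the forward process.
    mean = np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev - sigma ** 2) * eps
    return mean + sigma * rng.standard_normal(x_t.shape)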

4. Marginal Preservation, Variance Control, and Diversity-Speed Trade-off

A principal property of gDDIM, formalized in (Sheng et al., 12 Oct 2025, Han, 2024), is the preservation of marginals for any value of the stochasticity parameter. For each reverse step, the transition kernel is constructed to ensure that the distribution of x_{t-\Delta} given x_0 matches the corresponding forward marginal, allowing for controlled stochasticity without introducing bias. The stochasticity parameter provides an explicit, tunable trade-off: increasing it enhances exploration and sample diversity at the cost of speed and determinism; decreasing it yields faster, high-fidelity, less-diverse samples.
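A minimal worked check of this property in the standard isotropic DDIM notation, assuming the forward marginal x_t \mid x_0 \sim \mathcal{N}(\sqrt{\bar{\alpha}_t}\, x_0, (1-\bar{\alpha}_t) I) with cumulative schedule \bar{\alpha}_t: the reverse step combines independent \epsilon, z \sim \mathcal{N}(0, I) as

x_{t-\Delta} = \sqrt{\bar{\alpha}_{t-\Delta}}\, x_0 + \sqrt{1 - \bar{\alpha}_{t-\Delta} - \sigma_t^2}\, \epsilon + \sigma_t z

\Rightarrow \; x_{t-\Delta} \mid x_0 \sim \mathcal{N}\!\left( \sqrt{\bar{\alpha}_{t-\Delta}}\, x_0,\; (1 - \bar{\alpha}_{t-\Delta} - \sigma_t^2 + \sigma_t^2)\, I \right) = \mathcal{N}\!\left( \sqrt{\bar{\alpha}_{t-\Delta}}\, x_0,\; (1 - \bar{\alpha}_{t-\Delta})\, I \right)

so the forward marginal is reproduced for every admissible noise scale \sigma_t (equivalently, every \eta); this is the isotropic instance of the marginal-preservation statement above.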

In RLHF-driven fine-tuning applications, the “reward gap” between samples generated by stochastic (SDE) and deterministic (ODE/DDIM) samplers is theoretically bounded and empirically converges to zero as the number of denoising steps increases. For Gaussian Variance Exploding (VE) and Variance Preserving (VP) models, analytic expressions show the gap vanishes as T \to \infty, supporting the common practice of ODE inference after stochastic fine-tuning (Sheng et al., 12 Oct 2025).

5. Extensions: Mixture Kernels and Principal-Axis Schemes

Recent work further generalizes gDDIM by introducing mixture-of-Gaussian reverse kernels (GMM-gDDIM) (Gabbur, 2023) and principal-axis DDIM (paDDIM) (Han, 2024):

  • In GMM-gDDIM, the reverse transition is modeled as a mixture \sum_k w_k \mathcal{N}(x; \mu_k, \Sigma_k), constrained to exactly match first and second moments of the DDPM marginals for enhanced performance in fast (few-step) settings (a minimal moment-matching sketch follows this list).
  • paDDIM decomposes the diffusion operator along individual principal axes of the data covariance, allowing adaptive step sizes and noise allocation along each direction.
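To illustrate only the moment-matching constraint referenced above (the full GMM-gDDIM parameterization in (Gabbur, 2023) is richer), one simple construction places two equally weighted components symmetrically about the target mean and shrinks their shared covariance so the mixture covariance is unchanged. All names below are hypothetical:

import numpy as np

def symmetric_two_component_mixture(mean, cov, direction, delta):
    """Build a K = 2 Gaussian mixture whose mean and covariance match the targets.

    Components sit at mean +/- delta * direction with equal weights 1/2; their shared
    covariance is reduced by delta^2 * u u^T so the mixture covariance stays equal to
    `cov`. delta must be small enough that the reduced covariance remains PSD.
    """
    u = direction / np.linalg.norm(direction)
    comp_cov = cov - (delta ** 2) * np.outer(u, u)   # remove the spread contributed by the means
    means = [mean + delta * u, mean - delta * u]
    weights = [0.5, 0.5]
    return weights, means, comp_cov

By construction the mixture mean equals mean and the mixture covariance equals comp_cov + delta^2 u u^T = cov, so both moments match.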

Empirical studies demonstrate that mixture-based gDDIM achieves lower FID and higher IS at minimal computational cost increase for small K (e.g., K = 2, 4), while principal-axis scheduling can further accelerate convergence and fine-tune fidelity-diversity balance when the data distribution is low-rank (Gabbur, 2023, Han, 2024).

6. Empirical Results and Practical Considerations

Validated on non-isotropic models such as Blurring Diffusion Models (BDM) and Critically Damped Langevin Diffusion (CLD) on CIFAR-10, deterministic gDDIM achieves an FID of 2.49 with only 50 steps (ca. 20× speedup) for BDM, and FID of 2.26/2.86 with 50/27 steps (40–80× reduction in score function evaluations) for CLD, matching or surpassing high-step-count stochastic and ODE integrators. Network architectures generally follow adaptive UNet backbones with standard normalization and ResBlocks; all key ODE coefficients are precomputed for efficient GPU implementation (Zhang et al., 2022). The overall wall-clock time increases moderately (linearly in mixture size for GMM-gDDIM), but the sample quality improvements are substantial for small step regimes (Gabbur, 2023).

7. Applications, Limitations, and Future Directions

gDDIM provides a uniform framework for fast, high-quality sampling in general diffusion models, admitting user-controlled tuning of sample diversity and determinism, and preserving statistical consistency both in RLHF-based fine-tuning and in classic generative modeling (Zhang et al., 2022, Sheng et al., 12 Oct 2025). Limitations arise in high-dimensional scenarios requiring expensive matrix factorization or regression in the mixture or principal-axis extensions, and in integrating auxiliary guidance signals (e.g., classifier-free guidance) without incurring perceptible computational overhead (Shah et al., 2024). Promising future directions include hybrid stochastic-deterministic schedulers, distilled implicit guidance networks, and adaptive variance allocation along principal data modes (paDDIM) (Han, 2024). The framework remains directly extensible to non-equilibrium settings and admits principled application to new physically motivated or structured diffusion models.


References

  • "gDDIM: Generalized denoising diffusion implicit models" (Zhang et al., 2022)
  • "DDIM Redux: Mathematical Foundation and Some Extension" (Han, 2024)
  • "Understanding Sampler Stochasticity in Training Diffusion Models for RLHF" (Sheng et al., 12 Oct 2025)
  • "Improved DDIM Sampling with Moment Matching Gaussian Mixtures" (Gabbur, 2023)
  • "Enhancing Diffusion Models for High-Quality Image Generation" (Shah et al., 2024)
