Preconditioned Diffusion Sampling (PDS)
- Preconditioned Diffusion Sampling (PDS) is a technique that embeds explicit preconditioners into diffusion-based generative models to overcome inefficiencies in high-dimensional or anisotropic settings.
- It modifies standard Langevin or reverse diffusion updates by incorporating data-derived positive definite matrices into drift and noise terms, thereby adapting step sizes to local geometries.
- Empirical results demonstrate that PDS achieves significant acceleration, robust convergence, and preserved sample quality in applications such as image generation, medical imaging, and image restoration.
Preconditioned Diffusion Sampling (PDS) denotes a family of techniques for accelerating and stabilizing diffusion-based generative models and posterior samplers by introducing explicit preconditioners into the drift and diffusion terms of the underlying stochastic processes. The primary aim of PDS is to address the ill-conditioning and inefficiency of conventional isotropic diffusion samplers, particularly in high-dimensional, anisotropic, or inverse problems, while preserving the theoretical invariance and sample quality of the original models. Mechanistically, PDS modifies Langevin or reverse diffusion updates by embedding a positive definite matrix—typically derived from data statistics, a known operator, or local curvature approximations—into both gradient and noise pathways, thus rescaling step sizes along different coordinates according to their effective statistical or geometric scales (Bhattacharya et al., 2023, Ma et al., 2023, Ma et al., 2022, Garber et al., 2023, Blumenthal et al., 5 Dec 2025).
1. Theoretical Foundations and Motivation
Score-based generative models (SGMs) and denoising diffusion probabilistic models (DDPMs) simulate high-dimensional data distributions via a stochastic sequence of noise injection and denoising operations, which are commonly mapped to discretized stochastic differential equations (SDEs) or Langevin dynamics. In the standard (isotropic) case, each sampling step applies a global step size and isotropic Gaussian noise, but the complex correlation structure of natural data or inverse problems (e.g., MRI, deblurring) renders this inefficient: gradient directions can have widely disparate curvatures, making a fixed step size simultaneously too small for "flat" directions and too large (unstable) for "steep" directions (Ma et al., 2023, Ma et al., 2022). This phenomenon—ill-conditioning—limits acceleration and can degrade sample quality when naively reducing the number of iterations.
PDS resolves this by introducing a preconditioning matrix that "whitens" the target density or adapts each update to the local geometry, allowing the sampling process to take large, coordinated steps in flat directions and cautious steps where the curvature is high, thus drastically improving convergence rates. The continuous-time preconditioned diffusion is defined as

$$dX_t = -M\,\nabla U(X_t)\,dt + \sqrt{2M}\,dW_t,$$

where $U$ is minus the log-density of the target distribution and $M$ is a symmetric positive definite matrix, either constant or state-dependent (Bhattacharya et al., 2023).
2. Algorithmic and Mathematical Formulations
Discrete-time PDS is typically realized via a preconditioned Euler–Maruyama scheme:

$$x_{k+1} = x_k - \epsilon\,M\,\nabla U(x_k) + \sqrt{2\epsilon}\,M^{1/2}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I).$$

In SGMs, this translates to the preconditioned annealed Langevin update

$$x_{k+1} = x_k + \frac{\epsilon_t}{2}\,M\,\nabla_x \log p_t(x_k) + \sqrt{\epsilon_t}\,M^{1/2}\,\xi_k,$$

where $M$ (symmetric positive definite) acts as the preconditioner and $p_t$ is the intermediate target density at timestep $t$ (Ma et al., 2023, Ma et al., 2022). The preconditioner may be constructed to match the global or local covariance of the data, the Hessian of the negative log-posterior, or via operators exploiting the domain's structure (e.g., FFT-based diagonalizations, block-diagonal approximations) (Blumenthal et al., 5 Dec 2025).
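The following is a minimal NumPy sketch of the preconditioned update above, assuming a diagonal preconditioner `M_diag` and a hypothetical score function `score_fn(x, t)` standing in for the trained network; it is an illustration of the update rule, not the published implementation.

```python
import numpy as np

def pds_langevin_step(x, t, score_fn, M_diag, eps_t, rng):
    """One preconditioned annealed Langevin step.

    x        : current sample, flat array of shape (d,)
    t        : current noise level / timestep index
    score_fn : callable returning an estimate of grad_x log p_t(x)
    M_diag   : diagonal of the SPD preconditioner M, shape (d,)
    eps_t    : step size at this noise level
    rng      : numpy random Generator
    """
    score = score_fn(x, t)
    noise = rng.standard_normal(x.shape)
    drift = 0.5 * eps_t * M_diag * score            # M applied to the score
    diffusion = np.sqrt(eps_t) * np.sqrt(M_diag) * noise  # M^{1/2} applied to the noise
    return x + drift + diffusion

def pds_sample(x0, score_fn, M_diag, eps_schedule, steps_per_level, seed=0):
    """Run annealed Langevin dynamics with a fixed diagonal preconditioner."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for t, eps_t in enumerate(eps_schedule):
        for _ in range(steps_per_level):
            x = pds_langevin_step(x, t, score_fn, M_diag, eps_t, rng)
    return x
```

Setting `M_diag` to all ones recovers the standard isotropic sampler, which makes the preconditioned variant a drop-in replacement for existing sampling loops.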
In constrained inverse problems (image restoration, MRI), the preconditioner can interpolate between guidance regimes (e.g., back-projection and least-squares), controlled by a timestep-dependent weight schedule $w_t \in [0,1]$, with $w_t$ modulating between fast consistency enforcement and noise robustness (Garber et al., 2023).
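As an illustration of this interpolation (a simplified sketch, not the exact scheme of Garber et al., 2023), the step below blends a back-projection correction with a least-squares gradient step for a linear observation model $y = Ax + \text{noise}$; the schedule `w_t` and the names `A_pinv`, `step` are assumptions for the example.

```python
import numpy as np

def interpolated_guidance_step(x, y, A, A_pinv, w_t, step):
    """Illustrative data-consistency step blending BP and LS guidance.

    x      : current estimate, shape (d,)
    y      : observations, shape (m,)
    A      : forward operator matrix, shape (m, d)
    A_pinv : pseudo-inverse of A (back-projection operator), shape (d, m)
    w_t    : weight in [0, 1]; 1 -> pure back-projection, 0 -> pure least squares
    step   : step size for the least-squares gradient
    """
    residual = y - A @ x
    bp_update = A_pinv @ residual        # back-projection: fast consistency enforcement
    ls_update = step * (A.T @ residual)  # least-squares gradient: more robust to noise
    return x + w_t * bp_update + (1.0 - w_t) * ls_update
```

Scheduling `w_t` from large to small over the reverse diffusion mirrors the behavior described above: strong constraint enforcement early, noise-robust refinement late.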
3. Design of Preconditioners
Preconditioner selection is critical for sampler efficiency:
- Constant Preconditioning: $M$ is held fixed, often set to an approximation of the inverse local Hessian $[\nabla^2 U(x^\ast)]^{-1}$ near a mode $x^\ast$, yielding rapid mixing when $U$ is nearly quadratic (Bhattacharya et al., 2023).
- Data-Driven Preconditioning: For image generation, $M$ is constructed by decomposing it into frequency and pixel-space components estimated from the training set's second-order statistics (see the sketch after this list). For example, in (Ma et al., 2023, Ma et al., 2022):
- a frequency factor $M_f$: diagonal in the Fourier domain, with entries set by the inverse (log-)average power spectrum of the training data;
- a pixel-space factor $M_p$: diagonal in the pixel domain, scaling by per-pixel variance.
- Operator-Based Preconditioning: In linear inverse problems, $M_t = (A^{H}A + \gamma_t I)^{-1}$, the inverse of the data term plus a diffusion-scale term $\gamma_t$, is optimal for flattening the spectrum and is implemented efficiently via conjugate gradient or FFT (Blumenthal et al., 5 Dec 2025, Garber et al., 2023).
- Time-Varying/Adaptive: Schedules such as $w_t$ or $\gamma_t$ allow the preconditioner to adapt along the diffusion process, e.g., favoring rapid constraint enforcement early and LS robustness later (Garber et al., 2023).
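A minimal sketch of the data-driven, Fourier-diagonal construction follows; the specific normalization, clipping floor, and the omission of the pixel-space factor are assumptions for illustration, not the published recipe.

```python
import numpy as np

def build_frequency_preconditioner(train_images, floor=1e-3):
    """Estimate a Fourier-diagonal preconditioner M_f from training images.

    train_images : array of shape (n, H, W); the per-frequency average power
                   spectrum is used to rescale frequencies so that directions
                   with little signal energy receive larger effective steps.
    """
    spectra = np.abs(np.fft.fft2(train_images, axes=(-2, -1))) ** 2
    mean_power = spectra.mean(axis=0)
    mean_power /= mean_power.max()              # normalize the spectrum to [0, 1]
    return 1.0 / np.maximum(mean_power, floor)  # inverse (clipped) power spectrum

def apply_frequency_preconditioner(v, M_f_diag):
    """Apply M_f to an image-shaped vector v via FFT (M_f is diagonal in frequency)."""
    return np.real(np.fft.ifft2(M_f_diag * np.fft.fft2(v)))
```

Because the preconditioner is diagonal in the frequency domain, each application costs only a forward and an inverse FFT, which is negligible next to the score-network evaluation.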
A key practical insight is that aggressive (strong) preconditioning must be annealed appropriately as the total number of sampling steps $T$ is reduced; the cited analysis couples the preconditioning strength to $T$ through an explicit scaling law so that the resulting discretization error remains controlled (Ma et al., 2023).
4. Theoretical Properties and Convergence Guarantees
Theoretical analysis establishes that PDS inherits geometric ergodicity and admits non-asymptotic Wasserstein distance bounds analogous to standard Langevin samplers, but with improved constants as the preconditioner better matches the local geometry of the target distribution (Bhattacharya et al., 2023). For spatially invariant $M$, the stationary law of the discrete Markov chain converges in total variation to a unique invariant measure, with the rate dictated by spectral bounds on $M$ and the problem curvature.
In the context of SGMs and preconditioned reverse diffusion, the steady state is preserved (i.e., sampling is unbiased): the modified Fokker–Planck equation shows that inserting $M$ into both the drift and diffusion terms leaves the final marginal invariant under mild regularity conditions (Ma et al., 2023, Ma et al., 2022). Optional Metropolis correction steps (MALA) further ensure convergence to the correct distribution in aggressive settings.
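As a short consistency check (a standard calculation for constant symmetric positive definite $M$, not reproduced verbatim from the cited works), the Fokker–Planck equation of the preconditioned SDE reads

$$\partial_t p_t = \nabla \cdot \bigl( M \nabla U \, p_t \bigr) + \nabla \cdot \bigl( M \nabla p_t \bigr) = \nabla \cdot \bigl( M \,( p_t \nabla U + \nabla p_t ) \bigr),$$

and for the target density $p^\ast \propto e^{-U}$ one has $\nabla p^\ast = -p^\ast \nabla U$, so the divergence vanishes and $p^\ast$ remains invariant for any symmetric positive definite $M$.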
For guided sampling in linear inverse problems, PDS allows smooth interpolation between minimum-norm (BP), least squares (LS), and intermediary guidance, with provable bias–variance trade-offs. The variance and bias bounds are controlled by the schedule of the preconditioner, providing flexibility in handling noise robustness and fidelity to observed data (Garber et al., 2023).
5. Application Domains and Empirical Performance
PDS has been successfully applied across diverse domains:
- Image Generation: PDS delivers up to 28–29x acceleration in score-based generative models on high-resolution datasets such as FFHQ, maintaining or improving sample quality under aggressive reduction of the number of sampling iterations relative to the standard schedule (e.g., FID 1.99 on CIFAR-10 for NCSN++, best among the score-based and GAN baselines considered) (Ma et al., 2023, Ma et al., 2022).
- Medical Image Segmentation: In PD-DDPM, pre-segmentation via a plug-in network accelerates inference (roughly a 3.3x reduction in the number of denoising steps) and improves segmentation metrics (Dice, Jaccard, HD95, F1) compared to vanilla DDPM, without retraining or loss of uncertainty quantification (Guo et al., 2022).
- Image Restoration: Iteratively preconditioned guidance interpolates BP and LS to yield faster, more robust denoising and deblurring, with empirical improvements of 1–2 dB in PSNR and >100x speedup over DPS (Garber et al., 2023).
- MR Image Reconstruction: Preconditioned ULA-based posterior diffusion sampling halves computation time and improves PSNR/SSIM, maintaining stability across hardware and undersampling rates (Blumenthal et al., 5 Dec 2025).
Empirical studies highlight the insensitivity of PDS to hyperparameters within reasonable ranges, robustness across step counts when the preconditioning strength is annealed as prescribed, and resilience to both over- and under-conditioning.
6. Algorithmic Descriptions and Practical Considerations
PDS is implemented as a modification to the standard reverse diffusion or Langevin iteration:
- PDS Core Sampler:
- Apply preconditioner to both score drift and Gaussian noise injection.
- Optionally include Metropolis adjustment for large steps (Ma et al., 2023, Ma et al., 2022).
- Inverse and square-root operations are efficiently handled by FFT if the preconditioner is diagonal in frequency space; otherwise, conjugate gradient solvers are used in high-dimensional inverse settings (see the conjugate-gradient sketch after this list) (Blumenthal et al., 5 Dec 2025, Garber et al., 2023).
- Guided/Conditional PDS:
- In segmentation and restoration, the forward process may be initialized from a pre-segmentation or guided by data constraints via preconditioned fidelity steps.
- Schedules are constructed so that early diffusion (high noise) favors rapid correction or exploration, while late diffusion (low noise) focuses on fine-scale consistency.
- Empirical Recommendations:
- Preconditioner construction: fit from dataset statistics or derive from the application operator.
- Step size and schedule: follow baseline models; PDS enables scaling up by 2–4x.
- FFT/IFFT costs are negligible relative to the network call (main computational bottleneck).
- Metropolis correction is optional outside extreme acceleration regimes (acceptance rates were empirically near 100% in the reported high-resolution experiments) (Ma et al., 2022).
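A minimal sketch of the conjugate-gradient route for applying the operator-based preconditioner $(A^{H}A + \gamma I)^{-1}$, assuming the operator is available only through matrix-vector products `A_mv` / `AT_mv` (these names, the iteration budget, and the tolerance are illustrative assumptions, not the published implementation):

```python
import numpy as np

def apply_operator_preconditioner(v, A_mv, AT_mv, gamma, n_iter=20, tol=1e-6):
    """Approximately compute (A^H A + gamma * I)^{-1} v with conjugate gradient.

    v     : right-hand side vector, shape (d,)
    A_mv  : callable computing A @ x
    AT_mv : callable computing A^H @ y
    gamma : diffusion-scale regularization (keeps the system well conditioned)
    """
    def normal_op(x):
        return AT_mv(A_mv(x)) + gamma * x

    x = np.zeros_like(v)
    r = v - normal_op(x)
    p = r.copy()
    rs_old = np.vdot(r, r).real
    for _ in range(n_iter):
        Ap = normal_op(p)
        alpha = rs_old / np.vdot(p, Ap).real
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r).real
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```

A handful of CG iterations per sampling step typically suffices, since the iterate only needs to be accurate relative to the noise level at the current timestep.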
Typical pitfalls include miscalibrated preconditioning (over-suppression of high-frequencies or insufficient adaptation), but ablation studies demonstrate qualitative and quantitative robustness.
7. Domain-Specific Variations and Extensions
Specialized adaptations of PDS include:
- Pre-segmentation Diffusion Sampling (PD-DDPM): Employs forward diffusion on a coarse mask and accelerates denoising in segmentation, maintaining the Gaussian structure required for DDPM consistency (see the sketch after this list) (Guo et al., 2022).
- Iteratively Preconditioned Guidance: Modulates data fidelity in restoration, obviates the need for SVDs, and operates efficiently via CG/FFT (Garber et al., 2023).
- Preconditioned ULA for Posterior Sampling in MRI: Leverages the SENSE forward model for state-adaptive preconditioning, eliminating hand-tuning of step sizes and data annealing (Blumenthal et al., 5 Dec 2025).
Preconditioned sampling generalizes to both Markovian and non-Markovian, as well as state-dependent preconditioning frameworks, provided the technical conditions (e.g., bounded Lipschitz constants, positive definiteness) are satisfied (Bhattacharya et al., 2023).
In sum, Preconditioned Diffusion Sampling comprises a set of principled, theoretically justified modifications to stochastic generative and inverse-sampling algorithms. These techniques yield substantial acceleration, improved convergence, and robustness across a range of domains, without modifying core model architecture or requiring retraining, supported by strong empirical and mathematical guarantees (Ma et al., 2023, Ma et al., 2022, Bhattacharya et al., 2023, Garber et al., 2023, Guo et al., 2022, Blumenthal et al., 5 Dec 2025).