
Cold Diffusion: Deterministic Restoration Models

Updated 4 February 2026
  • Cold diffusion is a deterministic diffusion model that replaces random noise with degradation operators such as blur and masking to enable inversion and restoration.
  • It employs deep neural networks, often U-Net variants with time-step conditioning, to learn reverse operators that reconstruct original data from progressively degraded inputs.
  • The method delivers state-of-the-art restoration in domains like MRI, geophysics, and medical segmentation while addressing challenges in manifold preservation and high-frequency recovery.

Cold diffusion refers to a broad class of deterministic diffusion models, primarily found in image processing and scientific inverse problems, in which the conventional stochastic (noise-based) degradation of typical diffusion models is replaced with deterministic corruptions such as blurring, masking, or subsampling. In contrast to the classical "hot" diffusion paradigm, which relies on random Gaussian noise either for generative modeling or restoration tasks, cold diffusion frameworks invert arbitrary sequences of degradations—often derived from domain-specific priors or physical processes—without relying on injected stochasticity. This approach enables robust restoration, deconvolution, and generative sampling, and has yielded state-of-the-art results in applications where traditional noise-based methods are suboptimal or ill-posed.

1. Theoretical Foundations and Mathematical Formalism

The essential insight of cold diffusion is to generalize the diffusion model framework by replacing the forward, noise-based process with a family of deterministic degradation operators $D(\cdot, t)$. Given a data sample $x_0$ (e.g., an image or signal), a discrete or continuous sequence of increasingly severe transforms produces a path:

$$x_t = D(x_0, t), \qquad t = 0, 1, \dots, T$$

where $D(x_0, 0) = x_0$ and $D$ monotonically degrades information as $t$ increases. The nature of $D$ is dictated by the application: typical choices include convolutional blur, inpainting/masking, k-space subsampling (in MRI), or physically motivated propagators.

Unlike "hot" diffusion (e.g., DDPM, SDE, DDIM), where $D(x_0, t)$ is stochastic (e.g., $D(x_0, t) = \sqrt{\alpha_t}\, x_0 + \sqrt{1 - \alpha_t}\, \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$), cold diffusion uses strictly deterministic mappings, so the forward chain is a sequence of Dirac deltas:

$$q(x_t \mid x_0) = \delta(x_t - D(x_0, t))$$

The inverse problem is then to reconstruct $x_0$ from $x_T$ by learning an operator $R_\theta(x_t, t)$, optimized to minimize the expected loss over random $t$ and samples:

$$\min_\theta \; \mathbb{E}_{x_0,\, t}\left[ \| R_\theta(D(x_0, t), t) - x_0 \|_1 \right]$$

The "improved" reverse pass for sampling or restoration typically applies a bias-corrected update:

$$x_{t-1} = x_t - D(R_\theta(x_t, t), t) + D(R_\theta(x_t, t), t-1)$$

which compensates for model bias in $R_\theta$ and, for linear degradations, guarantees error correction at each step (Bansal et al., 2022).
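For a linear degradation and a perfect restorer, the bias-corrected update walks the chain back exactly. A minimal numpy sketch, assuming a hypothetical 1-D box blur as $D$ and an oracle in place of the learned $R_\theta$:

```python
import numpy as np

def box_blur(x):
    # simple 1-D box blur with edge padding (a linear degradation)
    padded = np.pad(x, 1, mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def D(x0, t):
    # deterministic forward degradation: t applications of the blur
    out = x0.copy()
    for _ in range(t):
        out = box_blur(out)
    return out

def reverse_step(x_t, x0_hat, t):
    # bias-corrected cold-diffusion update:
    # x_{t-1} = x_t - D(x0_hat, t) + D(x0_hat, t-1)
    return x_t - D(x0_hat, t) + D(x0_hat, t - 1)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(64)
T = 10
x_t = D(x0, T)

# with an oracle restorer (x0_hat == x0), each step lands exactly on D(x0, t-1)
for t in range(T, 0, -1):
    x_t = reverse_step(x_t, x0, t)

print(np.allclose(x_t, x0))  # True: the chain returns exactly to x0
```

With an imperfect learned restorer the same update remains stable, because each step only relies on the *difference* of two degradations of the current estimate, not on the estimate itself.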

2. Model Architectures, Training, and Inference Procedures

In cold diffusion, the restoration operator $R_\theta$ is usually a deep neural network (most frequently a U-Net variant) with explicit time-step conditioning. Training data is generated by synthetically degrading clean samples $x_0$ through $D(\cdot, t)$ for randomly chosen $t$. The choice of reconstruction loss ($\ell_1$ or $\ell_2$) is motivated by the task: $\ell_1$ is often preferred for deblurring or inpainting due to better peak signal-to-noise ratio (PSNR) properties.
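The training-set construction can be sketched in a few lines of numpy; the masking operator `D_mask` and the placeholder predictor below are illustrative assumptions, not any specific paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)

def D_mask(x0, t, T=8):
    # illustrative deterministic masking degradation:
    # zero out the trailing t/T fraction of entries
    x = x0.copy()
    n = x0.shape[-1]
    k = (n * t) // T
    if k > 0:
        x[..., n - k:] = 0.0
    return x

# one synthetic training batch: degrade clean samples at random severities
batch = rng.standard_normal((4, 32))      # stand-in for clean samples x0
ts = rng.integers(1, 9, size=4)           # random degradation levels t in [1, 8]
degraded = np.stack([D_mask(x, t) for x, t in zip(batch, ts)])

# l1 reconstruction loss against the clean targets
# (identity used as a dummy placeholder for R_theta(x_t, t))
x0_pred = degraded
l1_loss = np.abs(x0_pred - batch).mean()
print(degraded.shape, float(l1_loss))
```

In practice `x0_pred` would come from the time-conditioned U-Net, and the loss would be backpropagated through it; only the data pipeline is shown here.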

Inference consists of either direct single-step restoration or iterative application of $R_\theta$ and $D(\cdot, t)$ in decreasing $t$, via either naive chaining or the stabilized bias-corrected update. Example pseudocode for the backward process in k-space cold diffusion for MRI:

```
for t = T, ..., 1:
    x0_hat = R_theta(x_t, t)
    x_{t-1} = x_t - D(x0_hat, t) + D(x0_hat, t-1)
output: x0_hat
```

The same principle holds in geophysical inverse problems (gravity downward continuation), robotic trajectory planning (projection onto the replay buffer), and segmentation tasks where space is reparameterized (e.g., surface-aware graph parameterizations for medical image masks).
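Assuming a toy Cartesian undersampling schedule for the masks $M_t$ (an illustrative choice, not the schedule of Shen et al., 2023), the k-space backward process runs as follows in numpy, with an oracle standing in for the trained $R_\theta$:

```python
import numpy as np

def D_kspace(x0, t, T=8):
    # D_t(x0) = F^{-1}(M_t * F(x0)) with a mask keeping the (T - t)/T
    # lowest-frequency columns (a toy Cartesian undersampling schedule)
    n = x0.shape[-1]
    keep = n - (n * t) // T
    mask = np.zeros(n)
    order = np.argsort(np.abs(np.fft.fftfreq(n)))  # low frequencies first
    mask[order[:keep]] = 1.0
    return np.fft.ifft(mask * np.fft.fft(x0)).real

T = 8
rng = np.random.default_rng(2)
x0 = rng.standard_normal(64)
x_t = D_kspace(x0, T)  # fully degraded: all k-space discarded

for t in range(T, 0, -1):
    x0_hat = x0                      # oracle stand-in for R_theta(x_t, t)
    x_t = x_t - D_kspace(x0_hat, t) + D_kspace(x0_hat, t - 1)

print(np.allclose(x_t, x0))  # True: mask at t=0 keeps all of k-space
```

Because the degradation is linear in $x_0$, each step restores exactly one shell of k-space when the restorer is accurate, mirroring the pseudocode above.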

3. Application Domains and Design of Deterministic Degradation Operators

Cold diffusion excels in problems where random Gaussian noise is a poor model of data loss or degradation. Notable application-specific choices of $D(\cdot, t)$ include:

  • Upward/Downward Continuation in Geophysics: The forward operator $D(V, h)$ is a height-parameterized blur in the Fourier domain with exponential kernel $P_k^{(\Delta z)} = \exp(-k\,\Delta z)$, modeling physical field propagation (Jain et al., 24 Oct 2025).
  • MRI Reconstruction: In k-space cold diffusion, $D_t(x_0) = F^{-1}(M_t \odot F(x_0))$, where $M_t$ is a sampled Fourier mask at step $t$ (Shen et al., 2023).
  • Image Restoration and Generation: $D$ may encode Gaussian blurs, spatial masks for inpainting, deterministic pattern corruptions ("snowification"), or spectrum-specific filters (Bansal et al., 2022, Hsueh et al., 21 Nov 2025).
  • Trajectory Planning: Replay-buffer projection, where each state $s_i$ in a sequence is replaced by a randomly sampled buffer state within a prescribed radius, ensuring all interpolated plans are feasible (Wang et al., 2023).
  • Medical Image Segmentation: Surface cold-diffusion applies cyclic shifts and vertical perturbations to surface-parameterized segmentations, rather than direct pixel noise (Zaman et al., 2023).
  • Unsupervised Anomaly Detection: Synthetic anomaly generators compose binary masks, foreign patches, and intensity shifts to mimic plausible abnormalities without noise (Marimont et al., 2024).

This design flexibility permits adaptation to domain-specific priors, yielding interpretable and often physically meaningful degradation–restoration chains.

4. Theoretical Properties, Advantages, and Limitations

Theoretical properties of cold diffusion diverge sharply from noise-based ("hot") diffusion. The deterministic nature of $D(\cdot, t)$ ensures full path control, removing randomness and simplifying the learning problem—especially for invertible or pseudoinvertible forward maps. Empirical findings consistently show:

  • Robustness to structured and correlated noise, observed especially in field geophysics (gravity data with field-like noise, where cold diffusion outperforms U-Net baselines and matches oracle regularization) (Jain et al., 24 Oct 2025).
  • Universality across blur levels: A single $R_\theta$ inverts all degradations in $D(\cdot, t)$, obviating per-example hyperparameter tuning.
  • Faster and more stable convergence in non-Gaussian scenarios (medical segmentation, anomaly detection, field inversion).
  • Preservation of feasible-state trajectories in control and planning tasks when used with replay buffers (Wang et al., 2023).

However, cold diffusion is subject to fundamental limitations:

  • Manifold degeneration and out-of-manifold artifacts: Pure blur transformations can collapse the data manifold, with heavy blurring removing high-frequency variability. The reverse process then becomes unstable—small restoration errors drive outputs off-manifold, yielding poor diversity and sample realism (Hsueh et al., 21 Nov 2025).
  • Performance drop in high-frequency recovery: In generative image synthesis, cold (pure-blur) diffusion yields FID ≈ 80.1 on CIFAR-10 vs. FID ≈ 1.97 for standard noise-based methods (Hsueh et al., 21 Nov 2025).
  • No explicit support expansion: Absent noise, deterministic chains cannot sample off the training data manifold, which may be undesirable for diverse generation.

Hybrid "warm diffusion" models have been proposed to blend blur (deterministic) and noise (random) degradations, enabling trade-off control via a blur-to-noise ratio (BNR), with empirical and spectral analysis identifying BNR ≈ 0.5 as optimal for simultaneous fidelity and diversity (Hsueh et al., 21 Nov 2025).
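The blur-to-noise trade-off can be illustrated with a toy hybrid degradation; the schedule below (repeated box blur plus scaled Gaussian noise, with `bnr` interpolating between the two regimes) is an assumption for illustration, not the parameterization of Hsueh et al.:

```python
import numpy as np

def warm_degrade(x0, t, T=10, bnr=0.5, sigma_max=1.0, rng=None):
    # hybrid "warm" degradation: a deterministic blur component plus Gaussian
    # noise, with bnr (blur-to-noise ratio) trading one off against the other
    rng = rng or np.random.default_rng()
    blurred = x0.copy()
    n_blur = int(round(t * bnr))               # blur strength grows with t * bnr
    for _ in range(n_blur):
        padded = np.pad(blurred, 1, mode="edge")
        blurred = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
    sigma = sigma_max * (t / T) * (1.0 - bnr)  # noise share of the schedule
    return blurred + sigma * rng.standard_normal(x0.shape)

rng = np.random.default_rng(3)
x0 = rng.standard_normal(128)
x_cold = warm_degrade(x0, t=10, bnr=1.0, rng=rng)  # pure blur (cold limit)
x_hot = warm_degrade(x0, t=10, bnr=0.0, rng=rng)   # pure noise (hot limit)
x_warm = warm_degrade(x0, t=10, bnr=0.5, rng=rng)  # the hybrid regime
print(x_cold.std() < x0.std())  # True: blur shrinks variance; noise does not
```

The cold limit collapses high-frequency content (variance shrinks toward the manifold's smooth part), while the hot limit expands support with noise; intermediate `bnr` values blend the two effects.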

5. Quantitative Performance and Empirical Evaluation

Cold diffusion delivers strong quantitative results across tasks:

| Task/Class | Cold Diffusion Metric | Baseline/Oracle | Hot Diffusion Metric |
|---|---|---|---|
| Gravity Downward Continuation (100 m, synthetic) | PSNR ≈ 61.2 dB (stabilized DC) (Jain et al., 24 Oct 2025) | Oracle Tikhonov ≈ 36.6 dB | U-Net ≈ 52.3 dB |
| MRI (fastMRI, 4×, Cartesian) | PSNR = 30.58 / SSIM = 0.7150 (Shen et al., 2023) | E2E-VarNet 30.29 / 0.6850 | U-Net 28.21 / 0.6001 |
| Medical Segmentation (Echo, Dice) | 0.940 ± 0.019 (Zaman et al., 2023) | DeepLabV3+ 0.932 ± 0.028 | U-Net 0.863 ± 0.090 |
| CIFAR-10 Generation (FID, NFE=35) | ≈ 80.1 (cold, pure blur) (Hsueh et al., 21 Nov 2025) | EDM (noise): 1.97 | BNMD (hybrid): 1.85 |

These results suggest that cold diffusion often substantially outperforms noise-based or classical CNN methods in restoration/inversion tasks when the forward degradation is physically interpretable and invertible, but underperforms for high-dimensional generative modeling if the random component is omitted.

6. Broader Implications and Adaptation to New Domains

Cold diffusion's generality allows its adoption in any problem where the forward operator is a smoothly parameterized family: operator semigroups $D(\cdot, t) = e^{-tL}$ (for suitable $L$), subsampling schemes, adjacency-constrained state transitions, or domain-specific perturbations:

  • Theoretical flexibility: Cold diffusion demonstrates that stochasticity in diffusion models is not fundamental for successfully inverting degradations or even for generative sampling (Bansal et al., 2022).
  • Implementation guidelines: Identify a suitable forward operator $D$, discretize its total effect, train a single $R_\theta$ across all degradation levels, and use stabilized inversion at inference (Jain et al., 24 Oct 2025).
  • Applicability: The recipe applies directly to deconvolution, super-resolution, magnetic field inversion, anomaly detection, and planning, provided the degradation chain is invertible or nearly so (Jain et al., 24 Oct 2025, Wang et al., 2023, Marimont et al., 2024).
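As a concrete instance of the semigroup family $D(\cdot, t) = e^{-tL}$, one can take $L$ to be the 1-D periodic Laplacian and apply $e^{-tL}$ spectrally (an illustrative choice; the Laplacian diagonalizes under the DFT):

```python
import numpy as np

def heat_degrade(x0, t):
    # D(x, t) = e^{-tL} x for L the 1-D periodic Laplacian, applied spectrally:
    # L has DFT eigenvalues 4 sin^2(pi k / n), so e^{-tL} is a diagonal filter
    n = x0.shape[-1]
    k = np.arange(n)
    eig = 4.0 * np.sin(np.pi * k / n) ** 2
    return np.fft.ifft(np.exp(-t * eig) * np.fft.fft(x0)).real

rng = np.random.default_rng(4)
x0 = rng.standard_normal(64)

# the family is a semigroup: D(x, s + t) == D(D(x, s), t)
lhs = heat_degrade(x0, 3.0)
rhs = heat_degrade(heat_degrade(x0, 1.0), 2.0)
print(np.allclose(lhs, rhs))  # True: e^{-(s+t)L} = e^{-tL} e^{-sL}
```

The semigroup property is exactly what makes the degradation "smoothly parameterized": any discretization of $t$ yields a consistent chain of increasingly severe, mutually composable degradations.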

A plausible implication is that further research into invertibility, spectral analysis, and controlled blending with stochasticity (see "warm diffusion" (Hsueh et al., 21 Nov 2025)) will continue to expand the scope and effectiveness of such models.

7. Representative Variants and Extensions

Several extensions leverage the cold diffusion paradigm:

  • Hybrid models (warm diffusion) combining blur and noise for better manifold connectivity and fidelity (Hsueh et al., 21 Nov 2025).
  • Surface cold diffusion for structure-aware segmentation (1D surface parameterization of masks for rapid mixing and uncertainty quantification) (Zaman et al., 2023).
  • K-space cold diffusion for MRI acceleration, domain-guided by sampling masks directly in the Fourier domain (Shen et al., 2023).
  • Replay-buffer cold diffusion to ensure feasibility in robotic planning (Wang et al., 2023).
  • Ensembling of cold-diffusion restorations and disentangled anomaly generators in anomaly detection for both interpretability and sensitivity (Marimont et al., 2024).

These variants highlight the versatility of the cold diffusion concept when adapted to the data and task structure, supporting highly competitive or state-of-the-art results across restoration, segmentation, anomaly detection, and planning.


