
Zero-Shot Denoising Advances

Updated 13 January 2026
  • Zero-shot denoising is a self-supervised approach that recovers clean signals from noisy inputs by leveraging intrinsic statistical and structural properties.
  • Techniques such as blind-spot strategies, directional filtering, and diffusion models enable adaptation to both i.i.d. and structured noise.
  • Recent methods incorporate physics-informed strategies and adaptive loss designs to enhance performance in CT, FLIM, and ultrasound imaging.

Zero-shot denoising refers to the recovery of clean signals from degraded observations without access to external clean samples, large datasets, or explicit noise models. It encompasses a class of self-supervised or unsupervised learning strategies where a denoiser is trained or adapted directly on a single noisy instance, leveraging only intrinsic statistical, structural, or physical properties of the input and the noise. This paradigm is particularly powerful in scientific, medical, and industrial imaging domains where paired clean-noisy datasets are unavailable, and noise can be complex, structured, or domain-specific. Zero-shot denoising frameworks have undergone rapid evolution, transitioning from methods effective only for i.i.d. noise to adaptive architectures capable of handling structured, spatially correlated, or highly nonstationary corruptions, and now extend to high-resolution, multimodal, and video settings.

1. Foundational Assumptions and Classical Zero-Shot Paradigms

Traditional zero-shot denoising methods rely upon precise probabilistic assumptions on the noise $\eta$ contaminating an unknown clean signal $x$, observed as $y = x + \eta$. The foundational assumption is that $\eta$ is independent and identically distributed (i.i.d.), typically modeled as zero-mean Gaussian: $\eta \sim \mathcal{N}(0, \sigma^2 I)$. Under this independence property, one can construct pseudo-independent noisy pairs from a single image using spatial downsampling, masking, or blind-spot strategies. When two images have the same expected signal and independent noise realizations, the expected squared error between noisy-noisy pairs equates (up to a constant) to the error between clean and noisy, thus facilitating self-supervised Noise2Noise (N2N) regression objectives without clean targets (Mansour et al., 2023).
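The noisy-noisy/clean-noisy equivalence is easy to verify numerically. The sketch below (a toy pure-Python check with synthetic i.i.d. Gaussian noise; the signal, `sigma`, and sample count are arbitrary choices) treats one noisy copy as a stand-in estimate and shows that supervising against a second independent noisy copy differs from supervising against the clean signal only by the constant $\sigma^2$:

```python
import random

random.seed(0)
n, sigma = 200_000, 0.3
clean = [random.random() for _ in range(n)]         # unknown clean signal x
y1 = [x + random.gauss(0.0, sigma) for x in clean]  # noisy observation 1
y2 = [x + random.gauss(0.0, sigma) for x in clean]  # independent noisy observation 2

# Treat y1 as a candidate estimate; compare the two supervision signals.
mse_clean = sum((a - x) ** 2 for a, x in zip(y1, clean)) / n
mse_noisy = sum((a - b) ** 2 for a, b in zip(y1, y2)) / n

# N2N identity: E||f - y2||^2 = E||f - x||^2 + sigma^2 (constant in f)
diff = mse_noisy - mse_clean
print(diff)   # ≈ sigma^2 = 0.09
```

Because the gap is a constant independent of the estimate, minimizing against a noisy target yields the same minimizer as minimizing against the clean one.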

The canonical ZS-N2N model exemplifies this paradigm. From a single noisy image, deterministic downsampling or kernel-based operations generate two sub-images whose clean components coincide and whose noise is decorrelated, thus permitting training of a lightweight denoising network via a symmetric loss:

$$\mathcal{L}_{\rm res} = \frac{1}{2}\,\|D_1(y) - f_\theta(D_1(y)) - D_2(y)\|_2^2 + \frac{1}{2}\,\|D_2(y) - f_\theta(D_2(y)) - D_1(y)\|_2^2.$$

A consistency loss is included to enforce agreement between full- and reduced-resolution denoising. This design achieves state-of-the-art empirical performance for i.i.d. noise at a fraction of the computation cost of residual self-supervised methods and demonstrates robustness across synthetic and real-world noise scenarios (Mansour et al., 2023).
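The pair-generation step can be sketched as follows: each 2×2 block of the noisy image contributes one pixel to each sub-image, $D_1$ averaging the main diagonal and $D_2$ the anti-diagonal, so the two views draw their noise from disjoint pixels. This is a simplified pure-Python illustration (not the authors' implementation), using a constant toy image so the two views' clean components coincide exactly:

```python
import random

def down_pair(img):
    """Split img (2D list, even dims) into two half-resolution views:
    D1 averages each 2x2 block's main diagonal, D2 its anti-diagonal,
    so the two views' noise comes from disjoint pixels."""
    h, w = len(img), len(img[0])
    d1 = [[(img[i][j] + img[i + 1][j + 1]) / 2 for j in range(0, w, 2)]
          for i in range(0, h, 2)]
    d2 = [[(img[i][j + 1] + img[i + 1][j]) / 2 for j in range(0, w, 2)]
          for i in range(0, h, 2)]
    return d1, d2

random.seed(1)
size, sigma = 128, 0.1
noisy = [[0.5 + random.gauss(0.0, sigma) for _ in range(size)]
         for _ in range(size)]          # constant clean image + i.i.d. noise
d1, d2 = down_pair(noisy)

# Clean components coincide (both average to 0.5); noise is decorrelated.
f1 = [v - 0.5 for row in d1 for v in row]
f2 = [v - 0.5 for row in d2 for v in row]
n = len(f1)
corr = sum(a * b for a, b in zip(f1, f2)) / n / (sigma ** 2 / 2)
print(round(corr, 3))   # near 0: the two views' noise is independent
```

With such a pair, either view can serve as a noisy target for the other in the symmetric residual loss above.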

2. Methods for Structured, Correlated, and Domain-Specific Noise

Real-world imaging frequently violates the i.i.d. assumption, introducing structured (anisotropic or correlated) noise, e.g., banding in CT or striping in fluorescence microscopy. Classical self-supervised approaches fail in these regimes due to residual correlation between pseudo-independent pairs, resulting in artifact transfer or incomplete noise suppression.

The Median2Median (M2M) framework addresses this limitation by exploiting directional interpolation and generalized median filtering to isolate outlier values along dominant noise directions, thus constructing pseudo-independent pairs even under strong spatial correlation (Wang et al., 2 Oct 2025). The process involves:

  1. Directional interpolation at each pixel to assemble candidate estimates using both nearest neighbors and axis-aligned first-order interpolants.
  2. Generalized-median filtering to reject extremal estimates associated with structured artifacts.
  3. Random assignment of three central estimates to two views per pixel to eliminate systematic bias.
  4. Training with symmetric and consistency losses over all $3\times3$ block positions, yielding nine pseudo-independent pairings.

This approach remains competitive with BM3D and other state-of-the-art denoisers under i.i.d. noise and substantially outperforms all blind or zero-shot baselines under directional (e.g., stripe) noise, with PSNR gains of 2–3 dB and significant SSIM improvement (Wang et al., 2 Oct 2025).
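The generalized-median step above can be illustrated for a single pixel. This is a toy pure-Python sketch of the idea, not the paper's full scheme; the neighborhood estimates, the number of retained values, and the view-assignment rule are simplified assumptions:

```python
import random

def m2m_pixel(estimates, rng):
    """Generalized-median step: sort directional interpolation estimates,
    reject the extremes (which carry structured artifacts), then randomly
    assign the central estimates to two pseudo-independent views."""
    s = sorted(estimates)
    central = s[1:-1]                    # drop min and max
    rng.shuffle(central)                 # random assignment avoids bias
    view_a = central[0]
    view_b = sum(central[1:]) / len(central[1:])
    return view_a, view_b

rng = random.Random(0)
# Directional estimates at one pixel: four plausible values plus one
# outlier contributed by a stripe artifact crossing the neighborhood.
estimates = [0.48, 0.52, 0.50, 0.49, 0.95]
a, b = m2m_pixel(estimates, rng)
print(a, b)   # both close to 0.5; the 0.95 stripe outlier is rejected
```

The key point is that ordinary averaging would let the 0.95 outlier leak into both views, while the median-style rejection keeps the structured artifact out of the pseudo-pair.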

3. Self-Supervision, Architectural Constraints, and Loss Design

A broad range of architectures supports zero-shot denoising, exploiting distinct self-supervision mechanisms:

  • Implicit neural representations (INRs): Directly parameterize an MLP mapping from spatial coordinates to pixel values. The "spectral bias" of INRs allows low frequencies (clean image) to be fitted before high-frequency noise. Denoising leverages early stopping or penalizes deeper-layer weights to suppress fitting of high-frequency components (Kim et al., 2022).
  • Masked autoencoding and blind-spot strategies: Employ random pixel masking during prediction (e.g., Domino Denoise, Self2Self, Noise2Void) or input masking coupled with validation via pixel domino tilings for accuracy and early stopping (Lequyer et al., 2022).
  • Super-resolution and subsampling: Leverage random subsampling and inverse pixel-shuffle to synthesize paired noisy images with perfect alignment (Noise2SR), ensuring large-scale supervision and statistical invariance for ultra-low SNR tasks (Tian et al., 2024).
  • Frequency-domain approaches: Exploit the cross-frequency consistency of image textures versus noise incoherence, enabling training of ultralight networks guided by specially designed frequency-decomposition losses (ZSCFC) and circumventing pixel independence or zero-mean assumptions (Jiang et al., 14 Oct 2025).

Loss functions are correspondingly adapted, ranging from symmetric N2N/BCE residuals and self-supervised MSE to cross-frequency or correlation-preserving objectives, as in the intensity-guided lifetime preservation network for FLIM (Chen et al., 17 Mar 2025).
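A blind-spot objective of the kind listed above can be sketched as: mask a random subset of pixels, predict each masked pixel from its unmasked surroundings, and score the loss only at masked locations. In this pure-Python sketch a 4-neighbor mean stands in for the learned blind-spot network (an illustrative assumption, not any specific method's architecture):

```python
import random

def blind_spot_loss(img, mask_frac, rng):
    """Self-supervised blind-spot loss: predict each masked pixel from its
    4-neighborhood (a stand-in for a learned blind-spot network) and score
    the prediction against the held-out noisy value at that pixel."""
    h, w = len(img), len(img[0])
    masked = [(i, j) for i in range(1, h - 1) for j in range(1, w - 1)
              if rng.random() < mask_frac]
    loss = 0.0
    for i, j in masked:
        pred = (img[i - 1][j] + img[i + 1][j]
                + img[i][j - 1] + img[i][j + 1]) / 4
        loss += (pred - img[i][j]) ** 2   # target is the noisy pixel itself
    return loss / len(masked)

rng = random.Random(2)
sigma = 0.2
noisy = [[0.5 + rng.gauss(0.0, sigma) for _ in range(64)] for _ in range(64)]
val = blind_spot_loss(noisy, 0.1, rng)
print(val)   # ≈ sigma^2 * (1 + 1/4): target noise plus averaged neighbor noise
```

Because the network never sees the pixel it predicts, the noise at that pixel is unpredictable and the minimizer of this loss is the clean signal rather than the identity map.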

4. Physics-Informed and Domain-Specific Strategies

Zero-shot denoising has been successfully extended into domain-specific and physics-informed regimes, especially where the data-generating process is well characterized:

  • Computed tomography (CT): Sinogram flicking exploits conjugate ray redundancy. By randomly swapping pairs of conjugate detector values, the expected signal is preserved while the noise realization decorrelates, enabling efficient pseudo-pair generation and training in the sinogram (Radon) domain with full resolution retained (Shi et al., 10 Apr 2025). The table below summarizes a quantitative comparison on simulated CT data.
| Method | PSNR (dB) | SSIM (%) |
|--------|-----------|----------|
| SART | 26.83 | 59.69 |
| BM3D | 29.40 | 66.56 |
| ZS-N2N | 34.64 | 70.42 |
| Sinogram flicking (2-stage) | 34.77 | 73.62 |
  • Ultrasound CPWC: Angles of plane wave transmission are partitioned into disjoint sets, yielding images with similar tissue but uncorrelated noise. Residual learning on these paired images, with a lightweight CNN and gradient consistency, delivers contrast enhancements approaching high-angle compounding without external data (Asgariandehkordi et al., 26 Jun 2025).
  • Fluorescence-lifetime imaging microscopy (FLIM): Multichannel intensity-lifetime data are denoised with guidance from a pre-trained intensity prior, leveraging correlation-preserving and fidelity terms to ensure preservation of biological contrast in the lifetime domain, outperforming both classical and deep learning baselines on real data (Chen et al., 17 Mar 2025).
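The sinogram-flicking construction above can be sketched in a few lines. This is a simplified pure-Python version assuming an ideal parallel-beam geometry, where ray $(\theta, s)$ and its conjugate $(\theta + \pi, -s)$ measure the same line integral; the paper's actual CT geometry and sampling are more involved:

```python
import math
import random

def flick(sino, rng):
    """Randomly swap conjugate detector pairs: rays (theta, s) and
    (theta + pi, -s) share one line integral, so swapping preserves the
    expected signal while resampling the noise into two views."""
    n_ang, n_det = len(sino), len(sino[0])
    a = [row[:] for row in sino]
    b = [row[:] for row in sino]
    for i in range(n_ang // 2):
        ic = i + n_ang // 2              # conjugate angle index
        for j in range(n_det):
            jc = n_det - 1 - j           # conjugate detector index
            if rng.random() < 0.5:       # swap in view a only ...
                a[i][j], a[ic][jc] = a[ic][jc], a[i][j]
            else:                        # ... or in view b only
                b[i][j], b[ic][jc] = b[ic][jc], b[i][j]
    return a, b

rng = random.Random(3)
n_ang, n_det = 8, 6
half = [[rng.random() for _ in range(n_det)] for _ in range(n_ang // 2)]
clean = half + [row[::-1] for row in half]   # enforce conjugate redundancy
noisy = [[v + rng.gauss(0.0, 0.05) for v in row] for row in clean]

a, b = flick(noisy, rng)
tot = lambda s: sum(map(sum, s))
ok = math.isclose(tot(a), tot(noisy)) and math.isclose(tot(b), tot(noisy))
print(ok)   # True: swaps only permute values within conjugate pairs
```

The two flicked sinograms then play the role of the pseudo-independent pair in an N2N-style objective, without ever leaving the measurement domain.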

5. Generative and Diffusion Model Priors in Zero-Shot Denoising

Recent advances leverage pre-trained generative priors, especially diffusion models, for linear and nonlinear inverse problems, including denoising:

  • Diffusion Probabilistic Priors: A cascade of unconditional diffusion models is trained on high-quality images. Given a noisy observation, denoising is performed by maximum a posteriori (MAP) estimation, where the prior is given by the diffusion model and the likelihood is constructed from the degraded observation. Adaptive coefficient estimation (AdaLam) enables per-image balancing of data fidelity and prior strength, and performance routinely matches or surpasses fully supervised baselines (Liu et al., 2023).
  • Denoising Diffusion Null-Space Model (DDNM): For inverse problems $y = Ax + \epsilon$, DDNM imposes strict data-consistency by decomposing solutions into range and null-space components and propagating the measurement constraint at every diffusion step. Enhanced variants (DDNM$^+$) introduce soft range correction and time-travel heuristics for noisy settings. In the pure denoising ($A = I$) case, DDNM$^+$ can achieve PSNR comparable to classical zero-shot denoisers while guaranteeing analytical data consistency (Wang et al., 2022).
  • Training-Free Zero-Shot Anomaly/Defect Generation: "DeltaDeno" uses synchronized diffusion sampling with minimal prompt alteration (e.g., "object" vs. "object with defect") and accumulates per-step denoising discrepancies to localize and synthesize anomalies, with mask-guided latent inpainting and spatial attention biasing (Xu et al., 21 Nov 2025).

6. Theoretical Guarantees, Regularization, and Error Control

Zero-shot neural compression denoising (ZS-NCD) formalizes the denoiser as a patchwise maximum-likelihood projection onto a neural compressor's codebook, regularized by an entropy bottleneck to prevent overfitting (Zafari et al., 15 Jun 2025). Concrete theoretical guarantees are derived, bounding the mean-squared error of the reconstruction in terms of codebook rate, distortion, and noise characteristics. The entropy constraint enables plug-and-play denoising for both Gaussian and Poisson noise without early stopping, and empirical results validate superior performance on diverse natural and scientific images.

Similarly, explicit layer-wise regularization in implicit representation architectures, and scheduling of data-fidelity vs. prior strength in diffusion models, further provide algorithmic mechanisms for balancing noise suppression and signal fidelity, especially when the noise model is unknown or mis-specified (Kim et al., 2022, Liu et al., 2023, Zafari et al., 15 Jun 2025).

7. Limitations, Performance and Future Directions

While zero-shot denoising methods eliminate external data dependence and can generalize to complex, structured, or unseen noise, several challenges persist:

  • Computational cost: Many methods require per-image optimization, with runtimes ranging from a few seconds (ultralight networks, cross-frequency methods (Jiang et al., 14 Oct 2025)) to minutes or hours (neural compression (Zafari et al., 15 Jun 2025), diffusion priors (Wang et al., 2022, Liu et al., 2023)).
  • Noise model limitations: Fully model-agnostic approaches may trade off maximum attainable fidelity against universality, as seen with over-smoothing in random-subsampling schemes (Tian et al., 2024).
  • Generalizability to high-dimensional, multimodal, nondestructive or nonlinear settings: While video and FLIM extensions exist, theoretical and practical challenges remain for domains with extremely high SNR disparity or unknown structural artifacts (Wang et al., 2 Oct 2025, Wang et al., 18 Nov 2025).
  • Data-geometry and physical constraints: Future directions include exploiting more advanced physical acquisition models, adaptive partition or filtering strategies, or joint regularization in highly heterogeneous data (Asgariandehkordi et al., 26 Jun 2025).

Zero-shot denoising has evolved from a narrow solution for i.i.d. noise to a versatile toolbox applicable to a wide range of imaging and signal restoration tasks. Ongoing research efforts continue to expand theoretical foundations, computational efficiency, and adaptation to new science and engineering domains (Wang et al., 2 Oct 2025, Jiang et al., 14 Oct 2025, Kim et al., 2022, Zafari et al., 15 Jun 2025, Chen et al., 17 Mar 2025, Liu et al., 2023, Wang et al., 2022, Wang et al., 18 Nov 2025, Tian et al., 2024, Shi et al., 10 Apr 2025, Yan et al., 3 Jul 2025, Lequyer et al., 2022).
