
ShaRP: Stochastic Deep Restoration Priors

Updated 10 June 2026
  • The paper introduces an ensemble-based variational prior using MMSE restoration operators to improve recovery of images with structured artifacts.
  • It proposes a stochastic gradient descent algorithm that balances data fidelity and deep prior regularization while supporting self-supervised learning.
  • Empirical benchmarks show that ShaRP outperforms traditional plug-and-play and diffusion models in tasks like MRI reconstruction and super-resolution.

Stochastic Deep Restoration Priors (ShaRP) constitute a class of variational priors for imaging inverse problems, leveraging ensembles of deep restoration networks instead of traditional Gaussian denoisers. This approach generalizes plug-and-play regularization by drawing on minimum mean square error (MMSE) restoration operators trained for a variety of degradation models, providing improved recovery quality for data suffering from structured artifacts as well as supporting self-supervised learning from corrupted measurements. ShaRP unifies recent perspectives on score-based, denoiser-based, and restoration-operator-based inference, and sits at the intersection of plug-and-play optimization and stochastic differential approaches for scientific and medical image restoration (Hu et al., 2024, Zhang et al., 2024).

1. Mathematical Foundation

Consider the canonical linear inverse problem in imaging,

$$y = A x + \eta,\quad \eta \sim \mathcal{N}(0, \sigma_\eta^2 I)$$

with $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$ denoting the ground truth image. Conventional variational inference posits

$$\hat{x} = \underset{x \in \mathbb{R}^n}{\arg\min}\, g(x) + \lambda R(x)$$

where $g(x) = \frac{1}{2} \| A x - y \|_2^2$ enforces data fidelity and $R(x)$ regularizes with an implicit or explicit prior.
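As a small illustration of the data-fidelity term, the sketch below evaluates $g(x)$ and its gradient $A^\top(Ax - y)$ with NumPy on a random toy problem and compares the gradient with a finite difference. The problem sizes, the seed, and the function names are assumptions made for the example.

```python
import numpy as np

def data_fidelity(A, x, y):
    """g(x) = 0.5 * ||A x - y||_2^2."""
    r = A @ x - y
    return 0.5 * float(r @ r)

def data_fidelity_grad(A, x, y):
    """Gradient of g with respect to x: A^T (A x - y)."""
    return A.T @ (A @ x - y)

rng = np.random.default_rng(0)
m, n = 5, 8                                      # toy sizes (assumed)
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
y = A @ x_true + 0.01 * rng.standard_normal(m)   # y = A x + eta

# Central finite difference of g along a random direction d,
# compared with the analytic directional derivative <grad g(x0), d>.
x0 = rng.standard_normal(n)
d = rng.standard_normal(n)
eps = 1e-6
fd = (data_fidelity(A, x0 + eps * d, y) - data_fidelity(A, x0 - eps * d, y)) / (2 * eps)
an = float(data_fidelity_grad(A, x0, y) @ d)
```

Because $g$ is quadratic, the central difference `fd` and the analytic value `an` should agree up to floating-point rounding.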

ShaRP defines $R(x)$ through an ensemble of MMSE restoration operators $\{T_{\theta_k}\}_{k=1}^b$. Each $T_{\theta_k}(z)$ approximates $\mathbb{E}[x \mid z, \phi_k]$, the conditional mean of the clean image given its degraded observation $z$ under a specific degradation $\phi_k$. Tweedie’s formula relates the MMSE estimator to the score function of the neural prior. The ShaRP regularizer is thus an expectation over the restoration tasks, indexed by $k$, and over injected random noise, which supplies the stochasticity.

Its gradient simplifies to the expected residual of restoration over the random degradations and the injected noise.

These developments mean ShaRP optimizes

$$\min_{x \in \mathbb{R}^n}\; g(x) + \lambda R(x)$$

with a prior enforcing global consistency with the distribution of clean images implicitly modeled by the restoration ensemble (Hu et al., 2024).
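For reference, the classical Gaussian-denoising case of Tweedie's formula, which the restoration-operator setting generalizes, can be written as follows. The symbols $\sigma$, $\varepsilon$, and $p_\sigma$ are notation chosen for this note rather than taken from the source.

```latex
z = x + \varepsilon,\quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)
\;\Longrightarrow\;
\mathbb{E}[x \mid z] = z + \sigma^2 \nabla_z \log p_\sigma(z),
\qquad
z - \mathbb{E}[x \mid z] = -\sigma^2 \nabla_z \log p_\sigma(z).
```

Here $p_\sigma$ is the density of the noisy variable $z$. The denoising residual is a scaled negative score, which is the sense in which a restoration residual can play the role of a regularizer gradient.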

2. Optimization and Algorithmic Scheme

ShaRP employs a biased stochastic-gradient descent on the objective $g(x) + \lambda R(x)$:

  • At each iteration, sample a degradation index $k$ and i.i.d. noise.
  • Compute the degraded, noise-perturbed input $z$ from the current iterate.
  • Form a stochastic gradient that combines the data-fidelity gradient $\nabla g(x)$ with the restoration residual of $T_{\theta_k}$ at $z$, and update the iterate by a gradient step whose step size is chosen according to the gradient-Lipschitz constant of the objective.


This stochastic descent framework efficiently alternates between data consistency and application of non-Gaussian deep priors, accommodating arbitrary or task-dependent degradations (Hu et al., 2024).
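The iteration described in this section can be sketched on a toy problem. The sketch assumes a stochastic gradient of the form $\nabla g(x) + (\lambda/\sigma^2)\, H_k^\top H_k\,(x - T(z))$ with $z = H_k x + \varepsilon$, which should be checked against (Hu et al., 2024). The Gaussian toy prior, the random-mask degradations $H_k$, the closed-form MMSE restorer standing in for a trained network, the tail averaging, and all parameter values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 12                        # toy sizes (assumed)
sigma, lam, p_keep = 1.0, 1.0, 0.5   # restoration noise level, prior weight, mask density (assumed)

A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = rng.standard_normal(n)      # toy prior: x ~ N(0, I)
y = A @ x_true + 0.01 * rng.standard_normal(m)

def restore(z, mask):
    """Exact MMSE restorer E[x | z, mask] for x ~ N(0, I) and
    z = mask * x + N(0, sigma^2 I); stands in for a trained network."""
    return mask * z / (1.0 + sigma**2)

def stochastic_grad(x):
    mask = (rng.random(n) < p_keep).astype(float)   # sample a degradation (diagonal 0/1 mask)
    z = mask * x + sigma * rng.standard_normal(n)   # degrade the iterate and add noise
    # Assumed prior term: H^T H (x - T(z)) / sigma^2, with H = diag(mask).
    prior_term = mask * (x - restore(z, mask)) / sigma**2
    return A.T @ (A @ x - y) + lam * prior_term     # data-fidelity gradient + prior term

gamma, iters = 0.01, 4000            # step size and iteration count (assumed)
x = np.zeros(n)
tail = []
for t in range(iters):
    x = x - gamma * stochastic_grad(x)
    if t >= iters // 2:
        tail.append(x)
x_hat = np.mean(tail, axis=0)        # averaging the late iterates damps gradient noise

# In this Gaussian toy the expected prior term is c * x, so the objective is
# quadratic and its minimizer is available in closed form for comparison.
c = lam * p_keep / (1.0 + sigma**2)
x_star = np.linalg.solve(A.T @ A + c * np.eye(n), A.T @ y)
rel_err = np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star)
```

The closed-form comparison is only possible because the toy prior is Gaussian; with a learned restoration network there is no such reference solution, and `restore` would be replaced by a forward pass of the network selected by the sampled degradation.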

3. Theoretical Guarantees

The principal theoretical results can be summarized as:

  • When the restoration operators $T_{\theta_k}$ are exact MMSE maps, ShaRP’s update direction is an unbiased stochastic gradient of the objective $g(x) + \lambda R(x)$. Convergence theorems borrow from non-convex stochastic optimization.
  • For approximate operators with bounded estimation bias and variance, and under mild regularity (Lipschitz gradient, finite lower bound), the expected gradient norm of the objective at a random iterate from the trajectory is bounded. This quantifies descent to an approximate stationary point, justifying the use of inexact (real-world) deep restorers (Hu et al., 2024).

4. Self-Supervised Restoration Priors

A significant feature of ShaRP is the capacity for self-supervised training, which is critical for domains lacking ground-truth clean data. For example, in compressed sensing MRI, restoration networks $T_{\theta_k}$ are trained by mapping between reconstructions from disjoint sampling masks of the same underlying slice.

Training on such pairs with a “Noise2Noise”-style loss enables learning approximate MMSE restoration without access to fully sampled data. The resulting networks can directly serve as priors in ShaRP, with a random degradation sampled at each gradient step (Hu et al., 2024).
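To make the pairing idea concrete, the sketch below splits the acquired k-space samples of one simulated 1D "slice" into two disjoint masks and evaluates a squared-error loss between the restored input view and the target view. This is a minimal illustration under stated assumptions, not the loss of (Hu et al., 2024): the 1D signal, the random sampling pattern, the identity `network` placeholder, and the choice to compare zero-filled reconstructions are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = rng.standard_normal(n)                  # unknown slice, used only to simulate measurements
kspace = np.fft.fft(x, norm="ortho")

acquired = rng.random(n) < 0.6              # acquired k-space locations (assumed pattern)
to_input = rng.random(n) < 0.5              # random split of the acquired samples
mask_in = acquired & to_input               # samples shown to the network
mask_tgt = acquired & ~to_input             # disjoint samples used as the training target

def zero_filled(mask):
    """Zero-filled reconstruction from the k-space samples selected by mask."""
    return np.fft.ifft(np.where(mask, kspace, 0.0), norm="ortho")

def network(z):
    """Hypothetical placeholder for a trainable restoration network."""
    return z

def pair_loss(net, mask_a, mask_b):
    """Mean squared error between the restored input view and the target view."""
    diff = net(zero_filled(mask_a)) - zero_filled(mask_b)
    return float(np.mean(np.abs(diff) ** 2))

loss = pair_loss(network, mask_in, mask_tgt)
```

Neither view uses fully sampled data: both are built from subsets of the same acquired measurements, which is what allows training without a clean reference.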

A plausible implication is that similar strategies may extend to other inverse problems where only corrupted samples are available, broadening the scope of restoration priors beyond denoising and standard supervised pipelines.

5. Empirical Performance and Benchmarks

ShaRP outperforms the compared baselines in the reported benchmarks on several representative tasks:

Compressed Sensing MRI (CS-MRI):

Method                    PSNR (dB)   SSIM
PnP-FISTA                 ≈35.88      ≈0.938
Diffusion-DNS (DDS)       ≈35.21      ≈0.937
ShaRP (supervised)        37.59       0.963

Self-Supervised MRI:

Method                    PSNR (dB)   SSIM
SPICER                    ≈31.87      ≈0.901
ShaRP (self-supervised)   33.87       0.909

Single-Image Super-Resolution (SISR) (blur kernel σ=1.25, noiseless):

Method                    PSNR (dB)   SSIM
DPIR                      28.10
DiffPIR                   28.92
DRP                       29.28
ShaRP                     30.09       0.891

For high noise and stronger blur (e.g., kernel σ=1.5, noisy), ShaRP continues to outperform diffusion-based and denoiser-based priors without additional fine-tuning for novel measurement operators or noise levels. Qualitatively, ShaRP preserves fine textures and suppresses ringing even under structured degradations (Hu et al., 2024).
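For readers unfamiliar with the metric, PSNR in dB is conventionally computed from the mean squared error as sketched below. The function name and the `data_range` default of 1.0 are assumptions; the source does not state the normalization used for the reported numbers.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return float(10.0 * np.log10(data_range**2 / mse))

ref = np.zeros((4, 4))
est = ref + 0.1                      # uniform error of 0.1, so MSE = 0.01
value = psnr(ref, est)               # 10 * log10(1 / 0.01) = 20 dB
```

On this scale a 1 dB gain corresponds to roughly a 21% reduction in mean squared error, which gives a sense of the size of the gaps in the tables.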

6. Relationship to Stochastic Restoration Priors in Other Domains

Recent work in Digital Elevation Model (DEM) restoration exemplifies the “Stochastic Deep Restoration Prior” paradigm in a distinct mathematical guise (Zhang et al., 2024). DEM-SDE represents terrain degradation and restoration as a mean-reverting Itô SDE, and learns the reverse drift and score via deep networks conditioned on learnable terrain priors. Although the functional form and application domains are different—imaging versus geospatial DEM super-resolution and inpainting—both ShaRP and DEM-SDE rely on stochastic sampling through deep restoration operators, attention to structured artifacts, and conditioning on learned non-Gaussian priors.

A plausible implication is that ShaRP-like frameworks may be extensible to any domain where structured data degradations are amenable to synthetic corruption and self-supervised deep restoration learning, such as hyperspectral imaging, 3D volumetric reconstruction, or beyond.

7. Significance and Future Directions

ShaRP advances the state of deep regularization for inverse problems by:

  • Generalizing variational plug-and-play priors from Gaussian denoisers to ensembles of restoration operators,
  • Theoretically grounding the approach via stochastic approximation and score-divergence regularization,
  • Supporting self-supervised training pipelines crucial in scientific and biomedical imaging,
  • Demonstrating robust gains in image quality over established diffusion-model and plug-and-play restoration methods.

Emergent research directions include expanding the space of admissible degradations for restoration priors, automatic adaptation of noise and regularization levels, and cross-domain generalization using domain-specific restoration architectures (Hu et al., 2024, Zhang et al., 2024). This suggests a shift from hand-crafted image priors toward flexible, data-driven architectures that explicitly model and invert realistic, structured data corruptions.
