ShaRP: Stochastic Deep Restoration Priors

Updated 10 June 2026

The paper introduces an ensemble-based variational prior using MMSE restoration operators to improve recovery of images with structured artifacts.
It proposes a stochastic gradient descent algorithm that balances data fidelity and deep prior regularization while supporting self-supervised learning.
Empirical benchmarks show that ShaRP outperforms traditional plug-and-play and diffusion models in tasks like MRI reconstruction and super-resolution.

Stochastic Deep Restoration Priors (ShaRP) constitute a class of variational priors for imaging inverse problems, leveraging ensembles of deep restoration networks instead of traditional Gaussian denoisers. This approach generalizes plug-and-play regularization by drawing on minimum mean square error (MMSE) restoration operators trained for a variety of degradation models, providing improved recovery quality for data suffering from structured artifacts as well as supporting self-supervised learning from corrupted measurements. ShaRP unifies recent perspectives on score-based, denoiser-based, and restoration-operator-based inference, and sits at the intersection of plug-and-play optimization and stochastic differential approaches for scientific and medical image restoration (Hu et al., 2024, Zhang et al., 2024).

1. Mathematical Foundation

Consider the canonical linear inverse problem in imaging,

$y = A x + \eta,\quad \eta \sim \mathcal{N}(0, \sigma_\eta^2 I)$

with $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$ denoting the ground truth image. Conventional variational inference posits

$\hat{x} = \underset{x \in \mathbb{R}^n}{\arg\min}\, g(x) + \lambda R(x)$

where $g(x) = \frac{1}{2} \| A x - y \|_2^2$ enforces data fidelity and $R(x)$ regularizes with an implicit or explicit prior.
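As a minimal sketch of the data-fidelity term, the NumPy snippet below evaluates $g(x)$ and its gradient $A^\top (A x - y)$ on a small random problem and checks the gradient against central finite differences. The problem sizes, the seed, and the function names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def data_fidelity(A, x, y):
    """g(x) = 0.5 * ||A x - y||_2^2."""
    r = A @ x - y
    return 0.5 * float(r @ r)

def data_fidelity_grad(A, x, y):
    """Gradient of g with respect to x: A^T (A x - y)."""
    return A.T @ (A @ x - y)

rng = np.random.default_rng(0)
m, n = 5, 8                                   # hypothetical problem size (m < n)
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
y = A @ x_true + 0.01 * rng.standard_normal(m)

x = rng.standard_normal(n)                    # arbitrary evaluation point
grad = data_fidelity_grad(A, x, y)

# Central finite differences along each coordinate direction.
h = 1e-6
fd = np.array([
    (data_fidelity(A, x + h * e, y) - data_fidelity(A, x - h * e, y)) / (2 * h)
    for e in np.eye(n)
])
max_err = float(np.max(np.abs(grad - fd)))
print("max |analytic - finite difference| =", max_err)
```

Because $m < n$ here, $g$ alone does not determine $x$ uniquely, which is why the regularizer $R(x)$ is needed.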

ShaRP defines $R(x)$ through an ensemble of MMSE restoration operators $\{T_{\theta_k}\}_{k=1}^b$. Each $T_{\theta_k}(z)$ approximates $\mathbb{E}[x|z, \phi_k]$ for a specific degradation $\phi_k$, treated here as a linear operator that produces the degraded observation $z = \phi_k x + \sigma \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. Tweedie’s formula relates the MMSE estimator to the score function $\nabla_z \log p(z|\phi_k)$ of the density $p(z|\phi_k)$ of degraded observations, namely $\nabla_z \log p(z|\phi_k) = \frac{1}{\sigma^2}\left(\phi_k\, \mathbb{E}[x|z, \phi_k] - z\right)$. The ShaRP regularizer is thus:

$R(x) = \mathbb{E}_{k,\epsilon}\left[ -\log p(\phi_k x + \sigma \epsilon \,|\, \phi_k) \right]$

where $k$ indexes restoration tasks and $\epsilon$ injects stochasticity.

The gradient simplifies to

$\nabla R(x) = \frac{1}{\sigma^2}\, \mathbb{E}_{k,\epsilon}\left[ \phi_k^\top \phi_k \left( x - T_{\theta_k}(\phi_k x + \sigma \epsilon) \right) \right]$

which is the expected residual of restoration over random degradations $\phi_k$ and noise $\epsilon$.

These developments mean ShaRP optimizes

$\underset{x \in \mathbb{R}^n}{\min}\; f(x) = g(x) + \lambda R(x)$

with a prior enforcing global consistency with the distribution of clean images implicitly modeled by the restoration ensemble (Hu et al., 2024).
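The link between the regularizer gradient and the restoration residual can be checked numerically in a toy case where everything is available in closed form. The sketch below assumes a Gaussian image prior $x \sim \mathcal{N}(0, s^2 I)$ and a single linear degradation $H$ (both are illustrative assumptions, not the paper's setting), so that the density of $z = Hx + \sigma\epsilon$ is $\mathcal{N}(0, C)$ with $C = s^2 H H^\top + \sigma^2 I$ and the MMSE restorer is $s^2 H^\top C^{-1} z$. It compares a Monte Carlo estimate of $\frac{1}{\sigma^2}\mathbb{E}_\epsilon[H^\top H (x - \mathbb{E}[x|z, H])]$ with the exact gradient $H^\top C^{-1} H x$ of $\mathbb{E}_\epsilon[-\log p(Hx + \sigma\epsilon \,|\, H)]$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3
sigma, s = 0.5, 1.0                            # prior-noise level, image std (assumed)
H = rng.standard_normal((m, n)) / np.sqrt(n)   # hypothetical linear degradation
x = rng.standard_normal(n)                     # point at which the gradient is evaluated

# Covariance of z = H x' + sigma * eps when x' ~ N(0, s^2 I).
C = s**2 * H @ H.T + sigma**2 * np.eye(m)
C_inv = np.linalg.inv(C)

def mmse_restore(Z):
    """Exact E[x | z, H] for the Gaussian prior; each row of Z is one observation z."""
    return s**2 * Z @ C_inv @ H

# Monte Carlo estimate of (1/sigma^2) E[ H^T H (x - E[x | z, H]) ].
num = 200_000
Z = x @ H.T + sigma * rng.standard_normal((num, m))
mean_residual = x - mmse_restore(Z).mean(axis=0)
grad_mc = H.T @ H @ mean_residual / sigma**2

# Exact gradient of E_eps[ -log N(H x + sigma * eps; 0, C) ] with respect to x.
grad_exact = H.T @ C_inv @ H @ x

max_dev = float(np.max(np.abs(grad_mc - grad_exact)))
print("max deviation between Monte Carlo and exact gradient:", max_dev)
```

The two gradients agree up to Monte Carlo error, which is the property that lets a restoration operator stand in for the score of the degraded-image density.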

2. Optimization and Algorithmic Scheme

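ShaRP employs stochastic gradient descent on $f(x) = g(x) + \lambda R(x)$; the gradient estimate is biased whenever the learned operators only approximate the MMSE maps:

At iteration $t$, sample a degradation index $k_t \in \{1, \dots, b\}$ and i.i.d. noise $\epsilon_t \sim \mathcal{N}(0, I)$.
Compute the degraded observation $z_t = \phi_{k_t} x_t + \sigma \epsilon_t$, where $\sigma$ is the noise level of the restoration prior.
Form a stochastic gradient

$\hat{\nabla} f(x_t) = A^\top (A x_t - y) + \frac{\lambda}{\sigma^2}\, \phi_{k_t}^\top \phi_{k_t} \left( x_t - T_{\theta_{k_t}}(z_t) \right)$

and update via

$x_{t+1} = x_t - \gamma\, \hat{\nabla} f(x_t)$

where $\gamma$ is chosen $0 < \gamma \le 1/L$ for gradient-Lipschitz constant $L$.

Illustrative pseudocode:

    inputs: y, A, pairs (phi_k, T_k) for k = 1..b, sigma, lambda, gamma, x_0, N
    for t = 0, 1, ..., N-1:
        sample k from {1, ..., b} and eps ~ N(0, I)
        z = phi_k x_t + sigma * eps
        d = A^T (A x_t - y) + (lambda / sigma^2) * phi_k^T phi_k (x_t - T_k(z))
        x_{t+1} = x_t - gamma * d
    return x_N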

This stochastic descent framework efficiently alternates between data consistency and application of non-Gaussian deep priors, accommodating arbitrary or task-dependent degradations (Hu et al., 2024).
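A minimal runnable sketch of this iteration is given below, under several stated assumptions: degradations are small dense matrices, the index is sampled uniformly, and the trained networks are replaced by closed-form MMSE restorers for a Gaussian prior $x \sim \mathcal{N}(0, I)$. The names `sharp_reconstruct` and `gaussian_mmse_restorer` are hypothetical and do not come from the paper's code.

```python
import numpy as np

def sharp_reconstruct(y, A, degradations, restorers, sigma, lam, gamma, num_iters, rng):
    """Stochastic iteration: one randomly chosen degradation/restorer pair per step."""
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        k = rng.integers(len(degradations))           # assumption: uniform sampling
        H = degradations[k]
        z = H @ x + sigma * rng.standard_normal(H.shape[0])
        grad_g = A.T @ (A @ x - y)                     # data-fidelity gradient
        grad_r = H.T @ (H @ (x - restorers[k](z))) / sigma**2   # restoration residual
        x = x - gamma * (grad_g + lam * grad_r)
    return x

def gaussian_mmse_restorer(H, sigma, s=1.0):
    """Closed-form E[x | z, H] for x ~ N(0, s^2 I); a stand-in for a trained network."""
    W = s**2 * H.T @ np.linalg.inv(s**2 * H @ H.T + sigma**2 * np.eye(H.shape[0]))
    return lambda z: W @ z

rng = np.random.default_rng(0)
n, m, b = 8, 6, 3                                      # hypothetical sizes
sigma, lam = 0.5, 0.02
A = rng.standard_normal((m, n)) / np.sqrt(n)
x_true = rng.standard_normal(n)
y = A @ x_true + 0.01 * rng.standard_normal(m)

degradations = [rng.standard_normal((5, n)) / np.sqrt(n) for _ in range(b)]
restorers = [gaussian_mmse_restorer(H, sigma) for H in degradations]

# Upper bound on the gradient-Lipschitz constant of this toy objective.
L = np.linalg.norm(A, 2) ** 2 + lam / sigma**2 * max(
    np.linalg.norm(H, 2) ** 2 for H in degradations
)
gamma = 1.0 / L

x_hat = sharp_reconstruct(y, A, degradations, restorers, sigma, lam, gamma, 500, rng)
residual_init = float(np.linalg.norm(y))               # residual at the start x = 0
residual_final = float(np.linalg.norm(A @ x_hat - y))
print("data residual:", residual_init, "->", residual_final)
```

In practice each restorer would be a trained network and each degradation an operator such as a blur or a subsampling mask; only the two calls `H @ x` and `H.T @ (...)` would need to change.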

3. Theoretical Guarantees

The principal theoretical results can be summarized as:

When the restoration operators $T_{\theta_k}$ are exact MMSE maps, ShaRP’s update direction is an unbiased stochastic gradient of the objective $f(x) = g(x) + \lambda R(x)$. Convergence theorems borrow from non-convex stochastic optimization.
For approximate $T_{\theta_k}$ whose gradient estimate has bias bounded in norm by $\delta$ and variance bounded by $\nu^2$, and under mild regularity (an $L$-Lipschitz gradient, a finite lower bound $f^*$, and a step size $0 < \gamma \le 1/L$), the iterates satisfy

$\mathbb{E}\left[ \| \nabla f(\bar{x}) \|_2^2 \right] \le \frac{2\,(f(x_0) - f^*)}{\gamma t} + \gamma L \nu^2 + \delta^2$

where $\bar{x}$ is a random iterate drawn uniformly from the first $t$ points of the trajectory. This quantifies descent to an approximate stationary point with error floor $\gamma L \nu^2 + \delta^2$, justifying the use of inexact (real-world) deep restorers (Hu et al., 2024).
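To make the shape of such a guarantee concrete, the sketch below evaluates the right-hand side of a standard biased-SGD bound, $\frac{2(f(x_0) - f^*)}{\gamma t} + \gamma L \nu^2 + \delta^2$, for increasing iteration counts $t$. This form follows from the descent lemma with step size $0 < \gamma \le 1/L$, bias norm at most $\delta$, and variance at most $\nu^2$; it is assumed here for illustration and its constants may differ from those in the paper's theorem. All numerical values are hypothetical.

```python
def stationarity_bound(f0, f_star, gamma, L, nu, delta, t):
    """Right-hand side of the assumed bound on the expected squared gradient norm."""
    if not 0 < gamma <= 1.0 / L:
        raise ValueError("the bound assumes a step size 0 < gamma <= 1/L")
    return 2.0 * (f0 - f_star) / (gamma * t) + gamma * L * nu**2 + delta**2

# Hypothetical constants chosen only to show the behaviour of the bound.
f0, f_star, gamma, L, nu, delta = 10.0, 0.0, 0.1, 5.0, 0.2, 0.05
floor = gamma * L * nu**2 + delta**2          # error floor that iterations cannot remove
bounds = [stationarity_bound(f0, f_star, gamma, L, nu, delta, t)
          for t in (10, 100, 1000, 10000)]

for t, value in zip((10, 100, 1000, 10000), bounds):
    print(f"t = {t:>5d}: bound = {value:.4f} (floor = {floor:.4f})")
```

The first term decays as $1/t$, while the floor depends only on the step size, the variance, and the restorer bias, so a more accurate restorer or a smaller step size lowers the attainable stationarity level.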

4. Self-Supervised Restoration Priors

A significant feature of ShaRP is the capacity for self-supervised training, which is critical for domains lacking ground-truth clean data. For example, in compressed sensing MRI, restoration networks $T_{\theta}$ are trained by mapping between reconstructions from disjoint sampling masks $M_i$ and $M_j$ of the same underlying slice:

$\mathcal{L}(\theta) = \mathbb{E}\left[ \| T_{\theta}(\hat{x}_{M_i}) - \hat{x}_{M_j} \|_2^2 \right]$

where $\hat{x}_{M}$ denotes the reconstruction obtained from the measurements acquired with mask $M$. This “Noise2Noise” loss enables learning approximate MMSE restoration without access to fully sampled ground truth $x$. The resulting networks can directly serve as priors in ShaRP, with a random mask $M_i$ sampled at each gradient step (Hu et al., 2024).

A plausible implication is that similar strategies may extend to other inverse problems where only corrupted samples are available, broadening the scope of restoration priors beyond denoising and standard supervised pipelines.
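The mask-pairing idea can be sketched in a few lines of NumPy. The example below builds two disjoint k-space line masks, forms zero-filled reconstructions from each, and evaluates a mean-squared Noise2Noise-style loss between a restorer's output on one and the other. The single-coil Cartesian setup, the random test image, the identity "restorer", and all function names are assumptions made for illustration; the paper's actual training loss and data pipeline may differ.

```python
import numpy as np

def disjoint_line_masks(num_lines, width, frac, rng):
    """Two disjoint masks over k-space lines (rows), each keeping `frac` of the lines."""
    if not 0 < frac <= 0.5:
        raise ValueError("frac must be in (0, 0.5] so the two masks can be disjoint")
    keep = int(frac * num_lines)
    perm = rng.permutation(num_lines)
    mask_a = np.zeros((num_lines, width), dtype=bool)
    mask_b = np.zeros((num_lines, width), dtype=bool)
    mask_a[perm[:keep]] = True
    mask_b[perm[keep:2 * keep]] = True
    return mask_a, mask_b

def zero_filled(image, mask):
    """Zero-filled reconstruction from masked Fourier samples of `image`."""
    return np.fft.ifft2(mask * np.fft.fft2(image))

def noise2noise_loss(restorer, x_in, x_target):
    """Mean squared error between the restored input and the paired target."""
    return float(np.mean(np.abs(restorer(x_in) - x_target) ** 2))

rng = np.random.default_rng(0)
# Stand-in slice, used only to simulate measurements; real training never sees it.
image = rng.standard_normal((32, 32))
mask_i, mask_j = disjoint_line_masks(32, 32, 0.4, rng)
x_i, x_j = zero_filled(image, mask_i), zero_filled(image, mask_j)

identity = lambda x: x                        # placeholder for a trainable network
loss = noise2noise_loss(identity, x_i, x_j)
print("lines per mask:", int(mask_i[:, 0].sum()), "loss:", loss)
```

A training loop would minimize this loss over the network parameters while drawing a fresh mask pair for each slice.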

5. Empirical Performance and Benchmarks

ShaRP matches or exceeds the reported baselines on several representative tasks:

Compressed Sensing MRI (CS-MRI):

Method	PSNR (dB)	SSIM
PnP-FISTA	≈35.88	≈0.938
Diffusion-DNS (DDS)	≈35.21	≈0.937
ShaRP (supervised)	37.59	0.963

Self-Supervised MRI:

Method	PSNR (dB)	SSIM
SPICER	≈31.87	≈0.901
ShaRP^self	33.87	0.909

Single-Image Super-Resolution (SISR): (Deblurring, σ=1.25, noiseless)

Method	PSNR (dB)	SSIM
DPIR	28.10	–
DiffPIR	28.92	–
DRP	29.28	–
ShaRP	30.09	0.891

For high noise and stronger blur (e.g., kernel σ=1.5, noisy), ShaRP continues to outperform diffusion-based and denoiser-based priors without additional fine-tuning for novel measurement operators or noise levels. Qualitatively, ShaRP preserves fine textures and suppresses ringing even under structured degradations (Hu et al., 2024).
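For reference, the PSNR values in the tables above follow the usual definition $\mathrm{PSNR} = 10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$. The helper below is a minimal sketch that assumes images scaled to $[0, 1]$ (so `peak=1.0`); the normalization and any cropping used in the paper's evaluation are not specified here and may differ.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum intensity."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")                   # identical images
    return float(10.0 * np.log10(peak**2 / mse))

# Toy check: a constant error of 0.1 gives MSE = 0.01.
ref = np.zeros((4, 4))
est = np.full((4, 4), 0.1)
value = psnr(ref, est)
print(f"PSNR = {value:.2f} dB")
```

On this scale, a 1 dB gain corresponds to roughly a 21% reduction in mean squared error.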

6. Relationship to Stochastic Restoration Priors in Other Domains

Recent work in Digital Elevation Model (DEM) restoration exemplifies the “Stochastic Deep Restoration Prior” paradigm in a distinct mathematical guise (Zhang et al., 2024). DEM-SDE represents terrain degradation and restoration as a mean-reverting Itô SDE, and learns the reverse drift and score via deep networks conditioned on learnable terrain priors. Although the functional form and application domains are different—imaging versus geospatial DEM super-resolution and inpainting—both ShaRP and DEM-SDE rely on stochastic sampling through deep restoration operators, attention to structured artifacts, and conditioning on learned non-Gaussian priors.
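To illustrate what a mean-reverting Itô SDE looks like in practice, the sketch below simulates the generic scalar process $dx = \theta(\mu - x)\,dt + \sigma\,dW$ with the Euler–Maruyama scheme. This is a textbook Ornstein–Uhlenbeck example with hypothetical parameter values; it is not the drift, diffusion, or conditioning used by DEM-SDE.

```python
import numpy as np

def simulate_mean_reverting_sde(x0, mu, theta, sigma, T, steps, paths, rng):
    """Euler-Maruyama simulation of dx = theta * (mu - x) dt + sigma dW."""
    dt = T / steps
    x = np.full(paths, float(x0))
    for _ in range(steps):
        dw = np.sqrt(dt) * rng.standard_normal(paths)   # Brownian increments
        x = x + theta * (mu - x) * dt + sigma * dw
    return x

rng = np.random.default_rng(0)
x_T = simulate_mean_reverting_sde(
    x0=3.0, mu=1.0, theta=2.0, sigma=0.5, T=5.0, steps=500, paths=2000, rng=rng
)
print("terminal mean:", float(x_T.mean()), "terminal std:", float(x_T.std()))
```

Regardless of the starting value, the paths settle around $\mu$ with standard deviation close to $\sigma / \sqrt{2\theta}$, which is the sense in which the forward process reverts to a known reference state that a learned reverse process can then start from.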

A plausible implication is that ShaRP-like frameworks may be extensible to any domain where structured data degradations are amenable to synthetic corruption and self-supervised deep restoration learning, such as hyperspectral imaging, 3D volumetric reconstruction, or beyond.

7. Significance and Future Directions

ShaRP advances the state of deep regularization for inverse problems by:

Generalizing variational plug-and-play priors from Gaussian denoisers to ensembles of restoration operators,
Theoretically grounding the approach via stochastic approximation and score-divergence regularization,
Supporting self-supervised training pipelines crucial in scientific and biomedical imaging,
Demonstrating robust gains in image quality over established diffusion-model and plug-and-play restoration methods.

Emergent research directions include expanding the space of admissible degradations for restoration priors, automatic adaptation of noise and regularization levels, and cross-domain generalization using domain-specific restoration architectures (Hu et al., 2024, Zhang et al., 2024). This suggests a shift from hand-crafted image priors toward flexible, data-driven architectures that explicitly model and invert realistic, structured data corruptions.


References

Stochastic Deep Restoration Priors for Imaging Inverse Problems (2024)

Efficient Terrain Stochastic Differential Equations for Multipurpose Digital Elevation Model Restoration (2024)
