
Regularization by Denoising (RED)

Updated 6 October 2025
  • Regularization by Denoising (RED) is a framework that reformulates inverse imaging problems by explicitly embedding state-of-the-art denoisers into the optimization cost.
  • It offers a clear energy functional which integrates denoising algorithms directly, yielding theoretical guarantees of convexity and convergence under certain conditions.
  • RED demonstrates superior reconstruction performance in tasks like deblurring and super-resolution, and supports flexible optimization strategies such as gradient descent and ADMM.

Regularization by Denoising (RED) is a class of methodologies that reformulate inverse problems—particularly in imaging—by using modern denoising algorithms as the foundation for the regularizer in a unified optimization framework. RED distinguishes itself from traditional variational regularization by embedding state-of-the-art denoisers directly into the definition of the prior, promoting reconstructions that reside near the “fixed-point set” of the denoiser. The RED approach thereby shifts the role of the denoiser from a mere post-processing operator to a central component of the optimization cost. This has enabled significant advances in inverse imaging, owing to the strong empirical performance and adaptability of learned and model-based denoisers.

1. Core Methodology and Mathematical Framework

The principal idea in RED is to encode the prior via the denoising engine itself. Let $f(\cdot)$ be a denoiser. RED defines the regularization penalty as

$$\rho(x) = \tfrac{1}{2}\, x^\top \bigl(x - f(x)\bigr)$$

where $x \in \mathbb{R}^n$ is the image variable. Under mild conditions on $f$—notably, local homogeneity and strong passivity—this functional is convex and Fréchet differentiable, yielding the gradient

$$\nabla \rho(x) = x - f(x)$$

as the regularization force. The full objective for a generic linear inverse problem $y = Hx + \eta$ (linear operator $H$, measurements $y$, noise $\eta$) is formulated as

$$E(x) = \ell(y, x) + \lambda \rho(x) = \ell(y, x) + \frac{\lambda}{2}\, x^\top \bigl(x - f(x)\bigr)$$

where $\ell(y, x)$ is the data-fidelity term, typically quadratic for AWGN (e.g., $\frac{1}{2\sigma^2}\|Hx - y\|_2^2$), and $\lambda$ is the regularization parameter. The stationary points are characterized by solving

$$\nabla \ell(y, x) + \lambda \bigl(x - f(x)\bigr) = 0.$$

The explicit formulation of both the objective and its gradient enables seamless integration with general-purpose optimization algorithms.
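
As a concrete illustration, the following minimal NumPy sketch evaluates the RED penalty, its gradient, and the gradient of the full objective. The Gaussian-filter denoiser, the callables H and Ht, and the noise level sigma are illustrative placeholders, not part of the original formulation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def denoiser(x):
    # Placeholder denoiser f(x); in practice this could be BM3D, TNRD, or a CNN.
    return gaussian_filter(x, sigma=1.0)

def red_penalty(x, f=denoiser):
    # rho(x) = (1/2) x^T (x - f(x))
    return 0.5 * np.sum(x * (x - f(x)))

def red_gradient(x, f=denoiser):
    # Under local homogeneity and a symmetric Jacobian, grad rho(x) = x - f(x).
    return x - f(x)

def objective_gradient(x, y, H, Ht, sigma, lam, f=denoiser):
    # grad E(x) = (1/sigma^2) H^T (H x - y) + lam * (x - f(x))
    # for the quadratic AWGN fidelity term (1/(2 sigma^2)) ||Hx - y||^2.
    return Ht(H(x) - y) / sigma**2 + lam * red_gradient(x, f)
```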

2. Advantages over Implicit Plug-and-Play Schemes

Unlike the Plug-and-Play Priors (PnP) approaches—which use a denoiser “plugged into” an algorithm such as ADMM without an explicit underlying cost—RED constructs an explicit regularization functional. The prior is therefore not implicit but is wholly defined in closed form, greatly improving the transparency and theoretical justification of the resulting Bayesian or variational model. This explicitness facilitates the analysis of convexity, uniqueness, and convergence properties (under suitable conditions), and opens access to acceleration techniques or alternative optimizers beyond ADMM (Romano et al., 2016).

3. Optimization Strategies

A broad range of iterative schemes are enabled by the explicit RED objective:

  • Gradient Descent: Immediate application using $\nabla E(x)$.
  • Conjugate Gradient / Subspace Methods: Exploitation of problem structure (e.g., circulant $H$).
  • ADMM / Variable Splitting: The data-fidelity and prior terms can be split across variables, with the denoising step kept explicit in the update.
  • Fixed-Point Iteration: Solving $\nabla \ell(y, x) + \lambda(x - f(x)) = 0$ directly.
  • Weighted Proximal Methods (WPM): Generalization to variable-metric updates, leading to quasi-Newton acceleration of RED-type objectives (Hong et al., 2019).

The explicit form of the cost function and its gradient allows exploitation of any optimization procedure compatible with differentiable objectives, visibility into convergence rates, and the flexibility to accelerate inversion with vector extrapolation or adaptive step-size selection (Hong et al., 2018, Hong et al., 2019).
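
As a sketch of the simplest of these strategies, the loop below runs plain gradient descent on $E(x)$ with the quadratic AWGN fidelity; it reuses the hypothetical denoiser from the earlier sketch, and the step size, $\lambda$, and iteration count are illustrative rather than recommended settings.

```python
def red_gradient_descent(y, H, Ht, sigma, lam, f, step=1e-3, n_iters=200):
    # Minimize E(x) = (1/(2 sigma^2)) ||Hx - y||^2 + (lam/2) x^T (x - f(x))
    # by plain gradient descent; x is initialized from the back-projected data.
    x = Ht(y).astype(float)
    for _ in range(n_iters):
        grad = Ht(H(x) - y) / sigma**2 + lam * (x - f(x))
        x = x - step * grad
    return x

# Example usage for a pure denoising problem (H = identity):
# x_hat = red_gradient_descent(y, H=lambda v: v, Ht=lambda v: v,
#                              sigma=0.1, lam=0.05, f=denoiser)
```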

4. The Role and Requirements of the Denoiser

For the RED framework to retain its theoretical guarantees, the denoising operator must satisfy specific properties:

  • Local Homogeneity: $f(cx) \approx c\, f(x)$ for $c$ near 1.
  • Strong Passivity (or Nonexpansivity): The spectral radius (or the Lipschitz constant) of the Jacobian $J_f$ is at most 1.
  • Symmetric Jacobian: For some theoretical results (gradient simplification, convexity proofs), $J_f = J_f^\top$ is required (Reehorst et al., 2018); a rough numerical check of these conditions is sketched below.

In practice, high-performance denoisers such as TNRD, BM3D, or CNN-based architectures can be effective even if these conditions are only approximately satisfied. When these properties do not strictly hold, explicit cost-function interpretations may fail, but the fixed-point formulation and empirical effectiveness remain robust (Reehorst et al., 2018). Recent research has extended the framework to denoisers that are contractive or averaged operators, facilitating rigorous convergence even for complex, learned denoisers (Nair et al., 2022).
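
The following probe is a heuristic numerical check (not a proof) of local homogeneity and Jacobian symmetry for a given denoiser, using finite-difference Jacobian-vector products; the perturbation size eps and the random test directions are illustrative choices.

```python
import numpy as np

def check_denoiser_properties(f, x, eps=1e-3, seed=0):
    # Probe local homogeneity and Jacobian symmetry of a denoiser f at image x.
    rng = np.random.default_rng(seed)
    fx = f(x)
    # Local homogeneity: f((1 + eps) x) should be close to (1 + eps) f(x).
    homogeneity_err = (np.linalg.norm(f((1 + eps) * x) - (1 + eps) * fx)
                       / np.linalg.norm(fx))
    # Jacobian symmetry: u^T (J_f v) should equal v^T (J_f u);
    # approximate the Jacobian-vector products by finite differences.
    u = rng.standard_normal(x.shape)
    v = rng.standard_normal(x.shape)
    Jv = (f(x + eps * v) - fx) / eps
    Ju = (f(x + eps * u) - fx) / eps
    symmetry_err = abs(np.sum(u * Jv) - np.sum(v * Ju)) / (abs(np.sum(u * Jv)) + 1e-12)
    return homogeneity_err, symmetry_err
```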

5. Applications and Empirical Performance

RED has demonstrated strong performance in a variety of inverse problems in imaging, including:

  • Image Deblurring: State-of-the-art results on uniform and Gaussian blur kernels with additive noise. RED restores sharpness and reduces noise beyond competing schemes such as NCSR or IDD-BM3D. Results show improved PSNR and favorable visual quality (Romano et al., 2016).
  • Super-Resolution: RED achieves reconstructions of high visual fidelity and quantitative SNR, especially when a strong denoiser is used in the regularizer.
  • Other Modalities: The explicit framework supports adaptation to compressed sensing, MRI, tomography, and non-imaging inverse problems, paving the way for applications across scientific imaging domains (Romano et al., 2016).
  • Comparison to PnP: RED often matches or outperforms PnP-ADMM approaches, with the additional benefit of a well-defined energy functional and more tractable parameter selection due to the clarity of the underlying objective.

6. Convergence Properties

Convexity and global optimality in RED are guaranteed when both the fidelity term and the denoiser-induced regularizer are convex—which is ensured by local homogeneity and strong passivity (or nonexpansivity) of the denoiser, together with convexity of $\ell(y,x)$ (Romano et al., 2016). In these regimes, iterative minimization schemes converge to a global solution (unique when the objective is strictly convex). The decoupling of prior and solution strategy increases overall robustness. By contrast, in PnP schemes with implicit priors, convergence may only be to a fixed point, with no guarantee of global minimization.
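
A brief worked step makes this concrete (assuming $f$ is differentiable with a symmetric Jacobian $J_f$, so that $\nabla \rho(x) = x - f(x)$): differentiating the gradient once more gives

$$\nabla^2 \rho(x) = I - J_f(x), \qquad v^\top \bigl(I - J_f(x)\bigr) v \;\ge\; \bigl(1 - \lambda_{\max}(J_f(x))\bigr)\,\|v\|_2^2 \;\ge\; 0 \quad \text{whenever } \lambda_{\max}(J_f(x)) \le 1,$$

so the penalty is convex, and $E(x) = \ell(y,x) + \lambda\rho(x)$ inherits convexity whenever the fidelity term is convex.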

7. Research Directions and Extensions

Future research avenues highlighted in foundational RED papers include:

  • Generalization to broader classes of denoisers: Leveraging deep learning-based denoisers under controlled regularity conditions may yield further advances.
  • Adaptive and data-driven parameter selection: Dynamically adjusting denoiser parameters (e.g., the noise level assumed by $f(\cdot)$) during optimization remains largely unexplored.
  • Higher-order and composite regularization: Penalizing higher moments of the residual $x - f(x)$ or introducing composite formulations for structured priors (e.g., group sparsity) is a promising direction.
  • Application to new problem domains: Tomographic reconstruction, segmentation, optical flow, and more general inverse problems may benefit from RED’s explicit cost and flexibility.
  • Handling non-differentiable or less regular denoisers: Theoretical advances continue to analyze RED under relaxed smoothness or symmetry conditions (Nair et al., 2022, Reehorst et al., 2018).
  • Parameter tuning methodologies: Empirical and theoretically motivated schemes for automatic $\lambda$ selection, possibly using Stein's Unbiased Risk Estimate (SURE), are of practical importance (Romano et al., 2016).

Table: Comparison of RED and Plug-and-Play (P³) Frameworks

| Aspect | RED | PnP (P³) |
|---|---|---|
| Regularizer | Explicit, closed-form cost function | Implicit, no explicit objective |
| Prior incorporation | Direct, via the energy term $\rho(x)$ | Indirect, via the denoising step (usually within ADMM) |
| Optimization | Any gradient-based method | Primarily ADMM; other solvers less interpretable |
| Theoretical properties | Convexity and convergence to a global optimum (under conditions) | Fixed-point convergence; global optimality unclear |
| Parameter selection | More transparent (via $\lambda$ in the objective) | More heuristic, less clear |

Summary

The RED framework provides a mathematically principled and engineering-flexible pathway for integrating learned and analytical denoisers into the solution of inverse problems. By leveraging the explicit structure of the regularizer defined through a denoising operator, RED enables convergence guarantees, adaptability across modalities, and the potential for further advances as new denoisers and optimization strategies are developed.
