Provably Convergent Linear Plug-and-Play Denoisers

Updated 30 June 2025
  • Provably convergent linear plug-and-play denoisers are iterative schemes that embed linear denoising operators within optimization frameworks to guarantee stable convergence in inverse imaging tasks.
  • They leverage operator theory, spectral analysis, and strong convexity proofs to achieve linear convergence rates and unique solution guarantees under practical conditions.
  • By representing denoising operations as scaled proximal operators, these methods offer efficient algorithmic blueprints for applications like deblurring, inpainting, and superresolution.

Provably convergent linear plug-and-play denoisers are algorithmic constructs in which linear denoising operators are embedded within iterative optimization frameworks (notably ADMM, ISTA/FISTA, or variants), yielding explicit or implicit regularization that is provably stable and convergent across a range of inverse problems. These denoisers, including both symmetric and nonsymmetric (kernel-based) forms, can typically be cast as scaled proximal operators of convex quadratic regularizers. The theoretical underpinning for convergence is firmly established through a combination of operator theory, spectral analysis, contraction mapping principles, and—especially in recent work—strong convexity proofs for the resulting optimization problem. This body of research provides both transparent criteria for algorithm design and practical blueprints for application in high-dimensional image reconstruction tasks.

1. Strong Convexity and Its Role in Convergence

A central theoretical achievement is the demonstration that, for a broad class of linear denoisers, the optimization problems associated with plug-and-play algorithms are strongly convex. Consider the minimization problem

$$\mathcal{P}: \quad \min_{x \in \mathbb{R}^n} \ \ell(x) + \lambda\, \varphi(x),$$

where $\ell(x)$ is the data-fidelity term (typically least squares on a forward model) and $\varphi(x)$ is the regularizer associated with the linear denoiser $W$.

For symmetric linear denoisers ($W \in S^n$, $\sigma(W) \subset [0,1]$), the regularizer has the closed form

$$\varphi_W(x) = \tfrac{1}{2}\, x^\top W^\dagger (I - W)\, x + \iota_{R(W)}(x),$$

and, crucially, $W = \mathrm{prox}_{\varphi_W}$.
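
This proximal-map identity can be checked numerically on a toy example. The following NumPy sketch (a randomly generated symmetric $W$, purely for illustration and not code from the cited works) evaluates $\mathrm{prox}_{\varphi_W}$ in the eigenbasis of $W$ and confirms that it coincides with the action of $W$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Random symmetric W with eigenvalues in [0, 1] (one of them exactly zero).
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.concatenate(([0.0], rng.uniform(0.05, 1.0, n - 1)))
W = U @ np.diag(lam) @ U.T

y = rng.standard_normal(n)

# prox_{phi_W}(y) in the eigenbasis of W: minimize
# 0.5*||x - y||^2 + 0.5 * sum_i ((1 - lam_i)/lam_i) * x_i^2 over coordinates
# with lam_i > 0, and force x_i = 0 where lam_i = 0 (indicator of R(W)).
y_eig = U.T @ y
x_eig = np.where(lam > 0, y_eig / (1.0 + (1.0 - lam) / np.maximum(lam, 1e-12)), 0.0)
prox_y = U @ x_eig

print(np.allclose(prox_y, W @ y))   # expected: True
```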

The strong convexity of $\mathcal{P}$ is proved under the necessary and sufficient condition

$$N(A) \cap \mathrm{fix}(W) = \{0\},$$

where $N(A)$ is the null space of the forward operator $A$ and $\mathrm{fix}(W)$ is the set of fixed points of $W$. The strong convexity index is characterized by the spectrum of the composite operator $Q = A^\top A + \lambda\, W^\dagger(I - W)$ restricted to the subspace $R(W)$.
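
Both the intersection condition and the strong convexity index are easy to inspect numerically at small scale. The sketch below uses a toy, inpainting-like sampling operator and a random symmetric denoiser (both hypothetical) to test whether $N(A) \cap \mathrm{fix}(W) = \{0\}$ and to compute the smallest eigenvalue of $Q$ restricted to $R(W)$, following the characterization above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam_reg = 8, 0.5

# Symmetric denoiser with spectrum in [0, 1]; fix(W) = eigenspace of eigenvalue 1.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.concatenate(([1.0, 0.0], rng.uniform(0.1, 0.9, n - 2)))
W = U @ np.diag(lam) @ U.T

# Forward operator: random row sampling (an inpainting-like mask).
rows = rng.permutation(n)[: n // 2]
A = np.eye(n)[rows]

# N(A) ∩ fix(W) = {0}  iff  A restricted to fix(W) has trivial null space.
fix_W = U[:, lam >= 1.0 - 1e-12]                  # basis of fix(W)
condition_holds = np.linalg.matrix_rank(A @ fix_W) == fix_W.shape[1]

# Strong convexity index: smallest eigenvalue of Q restricted to R(W).
Q = A.T @ A + lam_reg * np.linalg.pinv(W) @ (np.eye(n) - W)
R_W = U[:, lam > 1e-12]                           # basis of R(W)
mu_index = np.linalg.eigvalsh(R_W.T @ Q @ R_W).min()

print(condition_holds, mu_index > 0)              # the two booleans agree
```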

For kernel (nonsymmetric) denoisers of the form $W = D^{-1} K$, analogous strong convexity results are established in the inner-product space $(\mathbb{R}^n, \langle\cdot,\cdot\rangle_D)$ with appropriately scaled regularizers. The result covers classic and high-dimensional inverse imaging scenarios such as inpainting, deblurring, and superresolution, where the operational structure of $A$ and the irreducibility of $W$ typically guarantee the intersection condition.
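
For concreteness, the following sketch builds a small kernel denoiser of this form from a Gaussian kernel (a hypothetical stand-in for NLM-type weights) and verifies that, although $W$ is not symmetric, its spectrum is real and contained in $[0,1]$, reflecting its self-adjointness in the $D$-inner product:

```python
import numpy as np

rng = np.random.default_rng(2)
n, h = 16, 0.5

z = rng.standard_normal(n)                                  # stand-in for patch features
K = np.exp(-(z[:, None] - z[None, :]) ** 2 / (2 * h ** 2))  # symmetric kernel, entries >= 0
D = np.diag(K.sum(axis=1))                                  # normalization matrix
W = np.linalg.solve(D, K)                                   # W = D^{-1} K, row-stochastic

eigvals = np.linalg.eigvals(W)
print(np.allclose(eigvals.imag, 0.0))                       # real spectrum
print(eigvals.real.min() >= -1e-8, eigvals.real.max() <= 1 + 1e-8)
```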

Strong convexity ensures uniqueness of the minimizer and provides quantitative rates for convergence of the associated algorithms.
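
For example, because the objective in $\mathcal{P}$ is quadratic, the mechanism is the textbook one (a standard bound, not specific to the surveyed works): if the Hessian restricted to the relevant subspace has spectrum in $[m, L]$ with $m > 0$, then gradient descent with step size $\gamma$ satisfies

$$\|x_{k+1} - x^\star\| \;\le\; \max\left(|1 - \gamma m|,\, |1 - \gamma L|\right) \|x_k - x^\star\|,$$

and the contraction factor is strictly below one for any $0 < \gamma < 2/L$. This is how the spectrum of $Q$ on $R(W)$ translates into an explicit linear rate.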

2. Spectral Contractivity and Linear Convergence

The proof of convergence for plug-and-play algorithms with linear denoisers leverages a combination of spectral criteria and averaged operator theory. In the prototypical PnP-ISTA or PnP-ADMM schemes, the update operator takes the form

$$x_{k+1} = W\!\left(x_k - \gamma\, \nabla \ell(x_k)\right),$$

where $W$ is the linear denoiser.
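
For a least-squares data term $\ell(x) = \tfrac{1}{2}\|Ax - b\|^2$, this update is only a few lines of code. The sketch below (with placeholder `A`, `b`, `W`; not code from the cited papers) makes the structure explicit:

```python
import numpy as np

def pnp_ista(A, b, W, gamma, num_iters=200, x0=None):
    """Linear PnP-ISTA: x_{k+1} = W (x_k - gamma * A^T (A x_k - b))."""
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)        # gradient of 0.5 * ||A x - b||^2
        x = W @ (x - gamma * grad)      # gradient step followed by linear denoising
    return x
```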

For symmetric denoisers, global linear convergence is obtained when the spectral norm $\|P\|_2 < 1$ for the iteration matrix $P = W(I - \gamma A^\top A)$, with $0 < \gamma < 2$. This follows from the contraction mapping theorem. For nonsymmetric kernel denoisers (such as standard nonlocal means, NLM), contractivity must instead be checked in the $D$-norm, $\|x\|_D = \sqrt{x^\top D x}$, where $D$ is the normalization matrix of the kernel.

Recent contributions provide explicit upper bounds on the contraction factor (and hence the convergence rate) in terms of key spectral quantities: the spectral gap $(1 - \lambda_2)$, where $\lambda_2$ is the second-largest eigenvalue of $W$; the sampling rate $\mu$; and, for PnP-ADMM, the largest modulus among the eigenvalues of the operator $V = 2W - I$ that lie away from unity.
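
The quantities entering these bounds are all computable for explicit $A$, $W$, and $D$. The helper functions below are an illustrative sketch (the bounds themselves are stated in the cited works): they evaluate the norm of the iteration matrix $P = W(I - \gamma A^\top A)$ in either the Euclidean or the $D$-geometry, the spectral gap of $W$, and the relevant eigenvalue modulus of $V = 2W - I$:

```python
import numpy as np

def contraction_factor(A, W, gamma, D=None):
    """Operator norm of the PnP-ISTA iteration matrix P = W (I - gamma A^T A).

    For symmetric W this is the Euclidean spectral norm; for a kernel denoiser
    W = D^{-1} K it is the norm in the D-inner product, obtained by conjugating
    P with D^{1/2}.
    """
    n = W.shape[0]
    P = W @ (np.eye(n) - gamma * A.T @ A)
    if D is None:
        return np.linalg.norm(P, 2)
    d_half = np.sqrt(np.diag(D)) if D.ndim == 2 else np.sqrt(D)
    return np.linalg.norm((d_half[:, None] * P) / d_half[None, :], 2)

def spectral_quantities(W, tol=1e-10):
    """Spectral gap 1 - lambda_2 of W, and max |eigenvalue| of V = 2W - I away from 1."""
    eigs = np.sort(np.real(np.linalg.eigvals(W)))[::-1]
    gap = 1.0 - eigs[1]
    v_eigs = 2.0 * eigs - 1.0
    away = np.abs(v_eigs[np.abs(v_eigs - 1.0) > tol])
    return gap, (away.max() if away.size else 0.0)
```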

These guarantees are shown to be robust for classical linear inverse problems in imaging, and they hold for both symmetric (e.g., DSG-NLM, GMM) and kernel (nonsymmetric NLM) denoisers, including cases where prior theory required symmetric, doubly stochastic denoisers for convergence.

3. Proximal Map Representation and Algorithmic Implications

A key structural property is that many linear denoisers, including non-symmetric ones, can be represented as (possibly scaled) proximal operators of convex quadratic functionals, albeit in weighted inner-product spaces. This link makes classical convex optimization theory directly applicable and ensures that plug-and-play algorithms, if appropriately modified to account for the geometry induced by the denoiser, minimize a well-defined sum of convex functions.

For symmetric denoisers, the standard PnP update suffices. For non-symmetric (kernel) denoisers, rigorous theory and empirical results show that scaled PnP algorithms, in which gradient and proximal steps are computed with respect to the denoiser-induced inner product and the denoiser and guide image are frozen prior to the main iterations, guarantee convergence to the minimum of the sum objective $f + g$.
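
As a deliberately simplified illustration, one scaled PnP-ISTA step for a kernel denoiser $W = D^{-1}K$ might look as follows, assuming the data-term gradient is taken in the $D$-inner product ($\nabla_D \ell(x) = D^{-1}\nabla\ell(x)$) and that $K$ and $D$ are frozen before the iterations; the exact scaled updates are specified in the corresponding papers:

```python
import numpy as np

def scaled_pnp_ista_step(x, A, b, K, D_diag, gamma):
    """One scaled PnP-ISTA step for W = D^{-1} K (K and D frozen beforehand)."""
    grad = A.T @ (A @ x - b)      # Euclidean gradient of 0.5 * ||A x - b||^2
    grad_D = grad / D_diag        # gradient with respect to the D-inner product
    z = x - gamma * grad_D        # scaled gradient step
    return (K @ z) / D_diag       # apply the kernel denoiser W = D^{-1} K
```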

Algorithmically, this leads to practical, efficient methods such as PnP-FISTA, PnP-ADMM, and their scaled variants, with convergence that is independent of initialization and robust to practical numerical issues.

4. Scaling and Regularization Interpretation

Recent work introduces explicit scaling mechanisms (e.g., denoiser scaling, Tweedie scaling, spectral filtering) to adjust the strength of regularization imparted by the denoiser. This is particularly critical for deep learning-based denoisers, which lack an explicit regularization parameter.

In the linear case, spectral filtering or scaling allows fine-grained tuning of the implicit quadratic regularizer

$$J_\tau(x) = \tfrac{1}{2} \left\langle x,\; \left(h_\tau(D_\sigma)^{-1} - I\right) x \right\rangle,$$

where $h_\tau$ is a spectral filter applied to the denoiser's spectrum. As the scaling parameter is varied, the scheme interpolates between no regularization and full denoiser strength. This not only provides interpretability, relating scaling to denoiser quality and regularizer strength, but also enables practical parameter-choice strategies and convergence proofs in the spirit of classical Tikhonov regularization.
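
A minimal sketch of this interpolation for a symmetric linear denoiser (written here as $W$, matching earlier sections) is given below. The linear filter $h_\tau(d) = (1-\tau) + \tau d$ is a hypothetical choice used purely for illustration; the cited works consider specific mechanisms such as Tweedie scaling or spectral filtering, but this simple one already exhibits the endpoint behavior, with $\tau = 0$ giving the identity (no regularization) and $\tau = 1$ recovering the original denoiser:

```python
import numpy as np

def scaled_denoiser(W, tau):
    """Spectrally filtered denoiser h_tau(W) for symmetric W with spectrum in [0, 1]."""
    d, U = np.linalg.eigh(W)
    h = (1.0 - tau) + tau * np.clip(d, 0.0, 1.0)   # hypothetical linear filter
    return U @ np.diag(h) @ U.T                    # tau = 0: identity; tau = 1: W itself

def implicit_regularizer(W, tau, x):
    """Evaluate J_tau(x) = 0.5 * <x, (h_tau(W)^{-1} - I) x> for the filtered denoiser."""
    d, U = np.linalg.eigh(W)
    h = np.maximum((1.0 - tau) + tau * np.clip(d, 0.0, 1.0), 1e-12)
    c = U.T @ x
    return 0.5 * np.sum((1.0 / h - 1.0) * c ** 2)
```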

5. Applications, Stability, and Limitations

Provably convergent linear plug-and-play denoisers have been successfully applied to a broad array of linear inverse imaging problems: inpainting, deblurring, superresolution, and compressive sensing, with extensions to larger-scale scenarios using scalable ADMM variants. They provide guarantees of uniqueness and stability—both under fixed noise and as part of convergent regularization schemes (solutions approach the exact inverse as noise vanishes and regularization is reduced).

Empirical work demonstrates that the theoretical contraction rates and convergence bounds match observed performance, and that the methods are robust to initialization, non-ideal kernels, and variable sampling rates. Practical limitations include:

  • Reduced expressiveness compared to nonlinear or adaptively guided denoisers (e.g., CNN-based models),
  • The need to compute or estimate spectral properties for parameter tuning,
  • The necessity of freezing kernel or guide images for linearity,
  • The fact that the theory currently addresses the linear plug-and-play case, with extensions to nonlinear or learned denoisers remaining an active research direction.

6. Future Directions

Ongoing research is directed at:

  • Generalizing convergence theory to nonlinear, possibly learned, nonexpansive (or even just bounded) denoisers,
  • Exploiting spectral and contraction analysis for automatic denoiser and parameter selection,
  • Integrating spectral filtering and regularization scaling into large-scale or distributed PnP systems,
  • Extending convergence guarantees to more general inverse problem structures and to non-Euclidean settings,
  • Developing sharper or adaptive bounds for the contraction factors that predict convergence rates in practical settings,
  • Bridging theoretical results with "black-box" deep learning methods where explicit regularization structure is unknown.