Convergent Regularization Methods

  • Convergent regularization methods are systematic frameworks that stabilize ill-posed inverse problems by enforcing that reconstructions continuously depend on observed, noisy data.
  • They integrate classical Tikhonov regularization with modern data-driven, plug-and-play, and proximal algorithms to achieve convergence to critical points even in nonconvex settings.
  • Practical implementations employ variational schemes, spectral filters, and hybrid regularizers to balance strong and weak convexity, ensuring robust error rates and computational efficiency.

A convergent regularization method is a systematic mathematical or algorithmic framework that stabilizes the solution of ill-posed problems (typically inverse problems or nonconvex, high-dimensional optimization) so that the reconstruction or estimate depends continuously on the observed, possibly noisy data and converges to a meaningful solution as the noise or the regularization vanishes. Modern developments encompass classical variational formulations, operator-theoretic generalizations, adaptive data-driven approaches, and plug-and-play/proximal-based algorithms, with rigorous convergence guarantees extending from global minimizers to critical points, even for nonconvex and learned regularizers.

1. Formal Definition and Classical Regularization Theory

Let $A : X \to Y$ be a bounded linear or nonlinear operator between Banach or Hilbert spaces, with the inverse problem of recovering $x^\dagger$ from noisy data $y^\delta = A x^\dagger + e$, $\|e\| \leq \delta$. When $A$ is not continuously invertible, i.e. the problem is ill-posed in Hadamard's sense, regularization introduces a parameterized family of reconstruction maps

$$R_\alpha : Y \to X$$

together with a parameter choice rule $\alpha = \alpha(\delta)$ such that

$$R_{\alpha(\delta)}(y^\delta) \to x^\dagger \quad \text{as} \quad \delta \to 0,$$

with convergence understood in the strong or weak topology and for all exact solutions $x^\dagger$. The canonical example is Tikhonov regularization,

$$x_\alpha = \arg\min_{x}\ \|A x - y^\delta\|^2 + \alpha\, R(x),$$

where $R(x)$ is a convex, lower semicontinuous penalty. Under suitable conditions on $R$ and the choice of $\alpha(\delta)$, this yields stability and convergence as $\delta \to 0$ (Ebner et al., 2022, Hauptmann et al., 2023, Kabri et al., 2022).
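
As a concrete illustration of the variational scheme above, the following sketch solves a small synthetic Tikhonov problem with the quadratic penalty $R(x) = \|x\|^2$; the forward matrix, noise level, and parameter value are illustrative assumptions, not a setup taken from the cited papers.

```python
import numpy as np

def tikhonov(A, y_delta, alpha):
    """Minimizer of ||A x - y_delta||^2 + alpha * ||x||^2, via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y_delta)

# Synthetic ill-conditioned forward operator (a Vandermonde matrix) -- illustrative only.
rng = np.random.default_rng(0)
A = np.vander(np.linspace(0.0, 1.0, 40), 20, increasing=True)
x_true = rng.standard_normal(20)

delta = 1e-3
e = rng.standard_normal(40)
y_delta = A @ x_true + delta * e / np.linalg.norm(e)   # noise scaled so that ||e|| <= delta

x_alpha = tikhonov(A, y_delta, alpha=delta)            # a priori choice alpha(delta) ~ delta
print("reconstruction error:", np.linalg.norm(x_alpha - x_true))
```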

2. Convergence in Nonconvex, Data-driven, and Weakly Convex Settings

The classical framework requires strong convexity, or at least lower semicontinuity, of $R$. However, with the adoption of learned or nonconvex regularizers (e.g., neural networks), global minimization becomes intractable. The modern theory therefore relaxes this to convergence to critical points:

  • Critical point regularization: Stationary points $x_\alpha^\delta$ satisfying $0 \in \partial J_\alpha(x_\alpha^\delta)$, where $J_\alpha$ is the regularized objective, are shown to be sufficient for convergence rates in Bregman or symmetric Bregman distances, even for weakly convex and nonconvex $R$ (Obmann et al., 2023, Shumaylov et al., 2024).
  • Weak convexity: If $f(x) + \frac{\rho}{2}\|x\|^2$ is convex for some $\rho \geq 0$, then $f$ is $\rho$-weakly convex. This condition guarantees the existence and stability of critical points, enables the use of generalized prox-type algorithms, and supports universal approximation via input weakly convex neural networks (IWCNNs) (Goujon et al., 2023, Shumaylov et al., 2023, Shumaylov et al., 2024); a minimal numerical check of the definition appears after this list.
  • Strongly convex–weakly convex (CNC) regularization: Jointly combining a strongly convex component with a weakly convex (even nonconvex, data-driven) component gives structurally stable regularization with theoretical convergence guarantees for stationary points (Shumaylov et al., 2023).
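
As a minimal numerical check of the weak-convexity definition above, the sketch below evaluates the minimax concave penalty (MCP), a standard $(1/\gamma)$-weakly convex function, and verifies discrete convexity of $f(x) + \tfrac{\rho}{2}x^2$ on a grid; the penalty and parameter values are illustrative choices, not those used in the cited works.

```python
import numpy as np

def mcp(x, lam=1.0, gamma=2.0):
    """Minimax concave penalty: a standard example of a (1/gamma)-weakly convex function."""
    return np.where(np.abs(x) <= gamma * lam,
                    lam * np.abs(x) - x**2 / (2.0 * gamma),
                    gamma * lam**2 / 2.0)

rho = 1.0 / 2.0                      # weak-convexity modulus 1/gamma for gamma = 2
x = np.linspace(-5.0, 5.0, 2001)
g = mcp(x) + 0.5 * rho * x**2        # should be convex if mcp is rho-weakly convex

# Discrete convexity check: nonnegative second differences on the uniform grid.
second_diff = g[:-2] - 2.0 * g[1:-1] + g[2:]
print("f + (rho/2)|x|^2 convex on grid:", bool(np.all(second_diff >= -1e-12)))
```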

3. Frameworks and Algorithms: Variational, Plug-and-Play, and Proximal Methods

Classical and modern convergent regularization schemes are implemented via:

  • Variational schemes: Direct minimization of composite functionals $J_\alpha(x)$, with a convexity or weak-convexity/strong-convexity decomposition (Goujon et al., 2023, Shumaylov et al., 2023, Shumaylov et al., 2024).
  • Plug-and-Play (PnP) methods: Replace proximal or gradient steps with a denoiser $D_\sigma$, potentially nonlinear or learned. Convergence is established for PnP iterations under contractiveness and suitable parameter selection (Ebner et al., 2022, Hauptmann et al., 2023); a minimal sketch follows this list.
  • RED and monotone RED: Regularization by denoising leverages an implicit denoiser-based prior, with monotonic convergence ensured via an Armijo-type line search, removing the need for the denoiser to be nonexpansive (Hu et al., 2022).
  • Proximal/DRS/PGD with weakly convex or learnable proximal denoisers: By constraining the denoiser to the exact proximal of a weakly convex (possibly nonconvex) objective, one guarantees algorithmic convergence without requiring convexity of the regularizer or restrictive parameter settings (Hurault et al., 2022, Hurault et al., 2023).
  • Newton, hybrid, and higher-order regularization: Matrix-regularized Newton-type and trust-region methods with polynomial or elastic-net penalties, offering rigorous global and local convergence rates for convex and convex-composite objectives (Yamakawa et al., 2024, Cristofari, 13 Jun 2025).
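
The sketch below, referenced from the PnP bullet above, implements a plug-and-play proximal-gradient iteration in which soft-thresholding stands in for a learned denoiser $D_\sigma$ (so the loop reduces to ISTA); the measurement operator, sparsity level, and step size are illustrative assumptions.

```python
import numpy as np

def pnp_pgd(A, y, denoiser, tau, iters=300):
    """Plug-and-play proximal gradient: a gradient step on (1/2)||A x - y||^2,
    followed by a denoiser in place of the proximal map of the regularizer."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = denoiser(x - tau * (A.T @ (A @ x - y)))
    return x

# Soft-thresholding is the exact prox of lam*||.||_1; here it stands in for a learned denoiser.
def soft(z, thresh):
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)

rng = np.random.default_rng(1)
A = rng.standard_normal((60, 100)) / np.sqrt(60)
x_true = np.zeros(100)
x_true[rng.choice(100, size=8, replace=False)] = 1.0
y = A @ x_true + 0.01 * rng.standard_normal(60)

tau = 1.0 / np.linalg.norm(A, 2) ** 2                  # step size 1/L with L = ||A||_2^2
x_hat = pnp_pgd(A, y, lambda z: soft(z, 0.05 * tau), tau)
print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```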

4. Rates, Source Conditions, and Quantitative Behavior

Convergence rates in Bregman distance, residual, or objective values are established under source conditions:

  • Source condition: Existence of an auxiliary element linking a subgradient of the regularizer at the true solution to the range of $A^*$ (Obmann et al., 2023). This underpins precise error rates; a standard instance is written out after this list.
  • Convergence rates: In the convex or polyconvex case, the canonical rates $O(\delta)$ or $O(\delta^{2\nu/(2\nu+1)})$ are preserved (Kirisits et al., 2016, Palumbo et al., 26 May 2025).
  • In nonconvex and weakly convex regimes, the rates are established in terms of Bregman or symmetric Bregman distances to account for possible non-uniqueness and non-global minima (Obmann et al., 2023, Shumaylov et al., 2024).
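
Written out for the convex case (a standard formulation from the classical theory, stated here for concreteness rather than quoted from a specific reference above), the source condition and the resulting Bregman-distance rate read:

```latex
% Source condition: a subgradient of R at the exact solution lies in the range of A^*.
\[
  \exists\, w \in Y : \quad \xi = A^{*} w \in \partial R(x^{\dagger}).
\]
% Under this condition and the a priori choice \alpha \sim \delta, the Tikhonov
% minimizer x_\alpha^\delta satisfies the Bregman-distance rate
\[
  D_{R}^{\xi}\bigl(x_{\alpha}^{\delta}, x^{\dagger}\bigr)
    = R(x_{\alpha}^{\delta}) - R(x^{\dagger})
      - \langle \xi,\, x_{\alpha}^{\delta} - x^{\dagger} \rangle
    = O(\delta).
\]
```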

5. Specialized and Data-driven Convergent Regularization

  • Spectral and filtered regularizers: For linear inverse problems, data-driven spectral or Fourier-domain filters provide convergent regularization by learning optimal filter families from data distributions, with explicit formulas and convergence guarantees (Kabri et al., 2022, Hauptmann et al., 2023); a basic filter-based reconstruction is sketched after this list.
  • Convergent plug-and-play with linear/learned proximal denoisers: By constructing denoisers as explicit proximal operators (linear or gradient-based neural nets), provable convergence to fixed-points or stationary points is retained, and the effect of tuning regularization parameters is quantified (Hurault et al., 2022, Hurault et al., 2023).
  • Neural network–parametrized solution manifolds: Tikhonov functionals constrained to increasing families of neural networks with bounded parameter norms yield convergent regularization schemes, again achieving canonical rates if the parameterized class is dense (Palumbo et al., 26 May 2025).
  • Polyconvex regularization: Polyconvex integrands, relevant in imaging and elasticity, allow for weak solution formulations and rate proofs in terms of generalized Bregman distances beyond the range of convex functionals (Kirisits et al., 2016).
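
The SVD-based sketch below illustrates the filter viewpoint from the first bullet: a reconstruction $x_\alpha = \sum_i g_\alpha(\sigma_i)\,\langle y^\delta, u_i\rangle\, v_i$ with the classical Tikhonov and truncated-SVD filters. A data-driven method would instead learn the filter family $g_\alpha$ from a training distribution, which is not shown here, and the test problem is an illustrative assumption.

```python
import numpy as np

def filtered_reconstruction(A, y_delta, g):
    """Spectral regularization: apply a filter g(sigma) to the singular values of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ (g(s) * (U.T @ y_delta))

alpha = 1e-3
tikhonov_filter = lambda s: s / (s**2 + alpha)                          # classical filter
tsvd_filter = lambda s: np.where(s > np.sqrt(alpha), 1.0 / s, 0.0)      # truncated SVD

rng = np.random.default_rng(2)
A = np.vander(np.linspace(0.0, 1.0, 40), 20, increasing=True)           # ill-conditioned
x_true = rng.standard_normal(20)
y_delta = A @ x_true + 1e-3 * rng.standard_normal(40)

for name, g in [("Tikhonov", tikhonov_filter), ("TSVD", tsvd_filter)]:
    x_hat = filtered_reconstruction(A, y_delta, g)
    print(name, "error:", np.linalg.norm(x_hat - x_true))
```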

6. Algorithmic and Practical Considerations

  • Choice of regularization parameter: Rules linking $\alpha$ to $\delta$ are essential, typically enforcing that $\alpha(\delta) \to 0$ and $\delta^m/\alpha(\delta) \to 0$ for some $m$ determined by the fidelity term (Shumaylov et al., 2023, Palumbo et al., 26 May 2025); see the sketch after this list.
  • Model adaptation and scalability: Krylov-subspace Newton-type algorithms, mixed-precision solvers, and block-structure–preserving variants enable efficient, scalable implementations for large-scale and high-dimensional problems (Cornelis et al., 2021, Cornelis et al., 2019).
  • Convergence guarantees extend to near-minimizers: It is sufficient to approximately solve the variational subproblem to a specified tolerance, preserving rates, which is crucial in large-scale nonconvex or learned settings (Obmann et al., 2023).
  • Empirical observations: Data-driven regularizers, especially those based on weakly convex or IWCNN-based networks, deliver state-of-the-art performance while ensuring theoretical stability, e.g., in sparse-view or limited-angle CT, MRI, and classical deblurring (Goujon et al., 2023, Shumaylov et al., 2023).
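
To make the parameter coupling in the first bullet concrete, the loop below applies plain Tikhonov regularization with the a priori rule $\alpha(\delta) = \delta$ (so $\alpha \to 0$ and $\delta^2/\alpha \to 0$, the standard coupling for a quadratic fidelity term) over decreasing noise levels; the test problem is again an illustrative assumption, and the solver is re-defined so the snippet is self-contained.

```python
import numpy as np

def tikhonov(A, y_delta, alpha):
    """Minimizer of ||A x - y_delta||^2 + alpha * ||x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y_delta)

rng = np.random.default_rng(3)
A = np.vander(np.linspace(0.0, 1.0, 40), 15, increasing=True)
x_true = rng.standard_normal(15)

# A priori rule alpha(delta) = delta: alpha -> 0 and delta^2 / alpha = delta -> 0,
# matching the coupling condition for the squared-norm data fidelity (m = 2).
for delta in [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]:
    e = rng.standard_normal(40)
    y_delta = A @ x_true + delta * e / np.linalg.norm(e)
    x_ad = tikhonov(A, y_delta, alpha=delta)
    print(f"delta={delta:.0e}  error={np.linalg.norm(x_ad - x_true):.3e}")
```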

7. Connections, Implications, and Theoretical Extensions

Convergent regularization methods unify classical Tikhonov/variational frameworks with modern data-driven, plug-and-play, and deep learning–derived approaches under a shared theoretical umbrella:

  • Critical point theory justifies the convergence of algorithms for learned regularizers whose global minimizers are impractical to compute.
  • Weak convexity and related regularity frameworks provide the minimal structural conditions to guarantee existence, stability, and convergence of regularized solutions.
  • Spectral and functional-analytic extensions ensure that convergent regularization is deeply connected to classical operator theory and admits precise characterizations and quantitative predictions for the convergence of regularized inverses (Hauptmann et al., 2023, González-Sanz et al., 2024).
  • Universal approximation by IWCNNs suggests that, in principle, arbitrary regularization behavior can be attained with carefully designed neural-network functionals while retaining provable convergence properties (Shumaylov et al., 2024).
  • Practical freedom in parameter tuning: Recent plug-and-play and nonconvex-proximal results remove the need for restrictive parameter bounds, restoring flexibility for practitioners and ensuring that practical performance does not come at the expense of theoretical guarantees (Hurault et al., 2023).

Collectively, these developments establish convergent regularization as the rigorous mathematical foundation underpinning modern, stable, and adaptive solution strategies for inverse and high-dimensional optimization problems, bridging classical theory, algorithmic implementation, and data-driven advancements.
