
Regularized Kaczmarz Method

Updated 30 August 2025
  • The Regularized Kaczmarz Method is an iterative projection algorithm that incorporates penalization to stabilize solutions in inconsistent or ill-posed linear systems.
  • It extends the classic Kaczmarz method by integrating regularization techniques, including Tikhonov, block, and sparse approaches, to improve convergence and robustness.
  • Its practical applications in image reconstruction, compressed sensing, and high-dimensional data analysis highlight its effectiveness in addressing noise and underdetermined systems.

The Regularized Kaczmarz Method refers to a broad class of iterative projection algorithms for solving (usually overdetermined and possibly inconsistent or ill-posed) linear systems and, more generally, convex-regularized inverse problems. Unlike the classical Kaczmarz method, which relies on sequential orthogonal projections onto single hyperplanes defined by the rows of $A$, regularized variants introduce penalization or structural constraints to obtain stable, often unique or physically meaningful solutions, even under data inconsistency, ill-conditioned matrices, or underdetermined systems. Modern regularized Kaczmarz methods appear in forms ranging from simple Tikhonov-type penalization at each projection step to sophisticated block, randomized, and tensor-structured algorithms, including algorithms for sparse and low-rank recovery, and enjoy extensive theoretical and numerical analysis for convergence and robustness.

1. Conceptual Foundation and Classical Formulations

The traditional Kaczmarz method for $Ax = b$ with $A \in \mathbb{R}^{m\times n}$ iteratively updates the solution estimate $x^{(k)}$ by projecting onto the solution space of a single row:

$$x^{(k+1)} = x^{(k)} + \frac{b_i - \langle a_i, x^{(k)}\rangle}{\|a_i\|^2}\, a_i,$$

where $a_i$ is the $i$-th row of $A$, and $i$ is chosen either cyclically or, in the randomized variant, with probability proportional to $\|a_i\|^2$.

Regularization augments each Kaczmarz update, generally seeking to mitigate the adverse effects of noise, data inconsistency, or poor conditioning. A canonical regularized step solves, at each iteration,

$$x^{(k+1)} = \arg\min_x \left\{ \frac{1}{2}\bigl(b_i - \langle a_i, x\rangle\bigr)^2 + \frac{\lambda}{2}\|x - x^{(k)}\|^2 \right\},$$

yielding the explicit update

$$x^{(k+1)} = x^{(k)} + \frac{b_i - \langle a_i, x^{(k)}\rangle}{\|a_i\|^2 + \lambda}\, a_i.$$

Here $\lambda$ is the regularization parameter; when $\lambda = 0$ this reduces to the classical Kaczmarz step (Ferreira et al., 5 Jan 2024).

This formulation is a local, per-iteration analogue of Tikhonov (ridge) regularization and can be viewed as enforcing stability by preventing large, noise-amplifying corrections, particularly in directions with small singular values.
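As a concrete illustration, the following Python sketch implements the damped per-row update above for a dense system. The matrix, right-hand side, damping value, and iteration count are illustrative placeholders rather than parameters taken from any cited implementation.

```python
import numpy as np

def regularized_kaczmarz(A, b, lam=0.1, n_iters=5000, x0=None, seed=0):
    """Randomized Kaczmarz with a Tikhonov-type damping term lam.

    Each step projects toward the hyperplane of one row i, but the
    correction is scaled by ||a_i||^2 + lam instead of ||a_i||^2,
    which damps noise-amplifying updates. lam = 0 recovers the
    classical (randomized) Kaczmarz method.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    row_sq = np.einsum("ij,ij->i", A, A)          # squared row norms
    probs = row_sq / row_sq.sum()                 # sample rows with prob. ~ ||a_i||^2
    for _ in range(n_iters):
        i = rng.choice(m, p=probs)
        residual_i = b[i] - A[i] @ x
        x += (residual_i / (row_sq[i] + lam)) * A[i]
    return x

# Example usage on a small noisy overdetermined system
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 50))
x_true = rng.standard_normal(50)
b = A @ x_true + 0.01 * rng.standard_normal(200)
x_hat = regularized_kaczmarz(A, b, lam=0.5)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```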

2. Extensions to Inconsistent, Ill-posed, and Noisy Problems

In practical inverse problems and large-scale data science, the system $Ax = b$ is often inconsistent due to noise or modeling errors, or $A$ is ill-conditioned. The classical Kaczmarz method stalls within a "convergence horizon" determined by the error component in the orthogonal complement of the range of $A$ (Kang et al., 2022).

Regularized Kaczmarz methods address this by:

  • Explicit penalization in each step as above (Ferreira et al., 5 Jan 2024).
  • Incorporating auxiliary "error-removal" steps, as in the extended Kaczmarz (EK) and randomized extended Kaczmarz (REK) algorithms, which interleave projections onto the data's row and column spaces (or their tensor generalizations), enabling convergence to the least-squares solution even for inconsistent systems (Needell et al., 2014, Petra et al., 2015, Du et al., 2021); a minimal sketch follows this list.
  • Using deterministic control strategies, such as almost cyclic or maximal-residual index selection, to ensure all constraints are addressed adequately for robust convergence (Petra et al., 2015).
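The sketch below illustrates the standard randomized extended Kaczmarz scheme referenced above, in which an auxiliary vector tracks the component of the data outside the range of the matrix. Sampling rows and columns proportionally to their squared norms is an assumption of this sketch, not a requirement of the cited works.

```python
import numpy as np

def randomized_extended_kaczmarz(A, b, n_iters=20000, seed=0):
    """Randomized extended Kaczmarz for possibly inconsistent Ax = b.

    An auxiliary vector z approximates the component of b orthogonal to
    range(A); the row step then targets b - z, so the iterates approach
    the least-squares solution.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    z = b.astype(float).copy()
    row_sq = np.einsum("ij,ij->i", A, A)
    col_sq = np.einsum("ij,ij->j", A, A)
    p_row = row_sq / row_sq.sum()
    p_col = col_sq / col_sq.sum()
    for _ in range(n_iters):
        # Column step: remove the component of z along column j.
        j = rng.choice(n, p=p_col)
        z -= (A[:, j] @ z / col_sq[j]) * A[:, j]
        # Row step: Kaczmarz projection for the system A x = b - z.
        i = rng.choice(m, p=p_row)
        x += ((b[i] - z[i] - A[i] @ x) / row_sq[i]) * A[i]
    return x

# Example: an inconsistent overdetermined system
rng = np.random.default_rng(2)
A = rng.standard_normal((300, 40))
b = rng.standard_normal(300)  # generic b, not in range(A)
x_rek = randomized_extended_kaczmarz(A, b)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_rek - x_ls))
```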

Table 1 summarizes common Kaczmarz variants for inconsistent systems:

| Method | Handles Inconsistency | Regularization | Solution Type |
|---|---|---|---|
| Classical Kaczmarz | No | N/A | Arbitrary/lossy |
| Tikhonov-Kaczmarz | Yes | Tikhonov ($\lambda$) | Regularized LS |
| Extended Kaczmarz | Yes | Implicit | Least-squares |
| Sparse Kaczmarz | Yes (with modifications) | $\ell_1$ / $\ell_2$ | Sparse LS / basis pursuit |

3. Structured and Randomized Block Regularized Kaczmarz Methods

Block and randomized Kaczmarz methods are leveraged for parallelism, acceleration, and improved convergence on large-scale systems. Regularization is essential for stability when projecting onto subspaces defined by blocks of $A$:

  • The Regularized Block Kaczmarz (ReBlocK) method introduces a Tikhonov regularization in the pseudoinverse used for block projections:

$$x_{t+1} = x_t + A_{S_t}^\top \left(A_{S_t}A_{S_t}^\top + \lambda k I\right)^{-1}\left(b_{S_t} - A_{S_t}x_t\right)$$

where $S_t$ is the block of rows chosen at iteration $t$, $k$ is the block size, and $\lambda$ regularizes the inversion, providing control over the condition number and the variance of updates (Goldshlager et al., 2 Feb 2025); a minimal sketch appears after this list.

  • In randomized block settings, uniform sampling (without preconditioning or partitioning with well-bounded conditioning) generally leads to a weighted least-squares solution whose conditioning may deteriorate with poorly conditioned blocks; regularization mitigates this effect by bounding the condition number of the block weight matrix and suppressing high variance (Goldshlager et al., 2 Feb 2025).
  • Momentum and acceleration techniques (e.g., Kaczmarz++, CD++) further enhance convergence, especially when exploiting the singular value structure of $A$, by learning dominant subspaces and incorporating regularized projections (Dereziński et al., 20 Jan 2025).
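A minimal Python sketch of a regularized block step in this spirit is given below; the uniform block sampling, block size, and damping value are illustrative assumptions, not the configuration of the cited ReBlocK implementation.

```python
import numpy as np

def regularized_block_kaczmarz(A, b, block_size=10, lam=1e-2,
                               n_iters=2000, seed=0):
    """Block Kaczmarz with a Tikhonov-regularized block projection.

    Each iteration samples a block S of rows and applies
        x <- x + A_S^T (A_S A_S^T + lam * k * I)^{-1} (b_S - A_S x),
    where k is the block size. The lam*k*I term bounds the condition
    number of the small system solved at every step.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    k = block_size
    for _ in range(n_iters):
        S = rng.choice(m, size=k, replace=False)   # uniform block sampling
        A_S, b_S = A[S], b[S]
        r_S = b_S - A_S @ x
        # Solve the k-by-k regularized system rather than forming a pseudoinverse.
        y = np.linalg.solve(A_S @ A_S.T + lam * k * np.eye(k), r_S)
        x += A_S.T @ y
    return x
```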

4. Regularization for Structure: Sparsity and Low-rank Recovery

A major extension involves embedding structural priors into the regularized Kaczmarz framework:

  • Sparse Kaczmarz: To recover sparse solutions, the regularization function is taken as $f(x) = \lambda\|x\|_1 + \frac{1}{2}\|x\|_2^2$, and each update is followed by a soft-thresholding/shrinkage operator in the dual/primal variable (Schöpfer et al., 2016, Wang et al., 21 Jun 2024); a minimal sketch follows this list.

    • For instance, the update is:

    $$x_{k+1} = S_\lambda\!\left[x_{k} + \frac{b_i - \langle a_i, x_k \rangle}{\|a_i\|^2}\, a_i\right]$$

  • Surrogate Hyperplane Methods: Rather than projecting onto a single row-based hyperplane, projections utilize a residual-weighted aggregate, with the update direction $A^\top\eta_k$, where $\eta_k$ is typically chosen as the current residual. This can yield provable linear convergence rate advantages over both row-randomized and standard sparse Kaczmarz updates, especially when residual information is employed adaptively (Wang et al., 21 Jun 2024).
  • Tensor Regularized Kaczmarz: For tensor-valued $X$ in $A * X = B$, regularization can target sparsity ($\|\cdot\|_1$), tubal low-rankness (via a nuclear norm over t-SVD singular tubes), or composite objectives. Kaczmarz-like updates operate slice-wise via the t-product, and convergence is guaranteed under strong convexity and appropriate step sizes (Chen et al., 2021, Du et al., 2021, Henneberger et al., 2023).
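The sparse Kaczmarz bullet above admits a compact implementation. The sketch below uses the dual-variable (Bregman) form, in which the Kaczmarz correction is accumulated in an auxiliary variable and the iterate is obtained by soft thresholding; the problem sizes and parameter values are chosen purely for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Componentwise shrinkage S_lam(z) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_kaczmarz(A, b, lam=0.1, n_iters=20000, seed=0):
    """Randomized sparse (Bregman) Kaczmarz for sparse solutions of Ax = b.

    The Kaczmarz correction is applied to a dual variable z, and the
    primal iterate x = S_lam(z) is obtained by soft thresholding,
    matching the regularizer f(x) = lam*||x||_1 + 0.5*||x||_2^2.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    z = np.zeros(n)
    x = np.zeros(n)
    row_sq = np.einsum("ij,ij->i", A, A)
    probs = row_sq / row_sq.sum()
    for _ in range(n_iters):
        i = rng.choice(m, p=probs)
        z += ((b[i] - A[i] @ x) / row_sq[i]) * A[i]
        x = soft_threshold(z, lam)
    return x

# Example: recover a sparse vector from an underdetermined system
rng = np.random.default_rng(3)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 10, replace=False)] = 1.0
b = A @ x_true
print(np.linalg.norm(sparse_kaczmarz(A, b) - x_true))
```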

5. The Role of Randomization, Greedy Selection, and Adaptive Sampling

Randomized row/block selection is crucial to minimize the effects of equation correlation and stagnation. Regularized Kaczmarz variants further benefit from:

  • Residual-weighted sampling: Equations with larger residuals receive higher selection probability, accelerating error reduction and implicitly dampening error components associated with small singular values, thus exerting a regularization effect (in the sense of suppressing the influence of unstable directions) (Steinerberger, 2020); a brief sketch appears after this list.
  • Greedy/Maximal Residual Rule: Choosing at each step the equation with the largest residual (maximal correction) further exploits regularization by actively targeting (and thus quickly eliminating) the largest error modes (Petra et al., 2015, Steinerberger, 2020).
  • Johnson–Lindenstrauss Acceleration: RKJL (randomized Kaczmarz via the JL lemma) enables approximate best-row selection among a candidate set at almost no extra cost, by projecting inner products into a lower-dimensional space, thereby accelerating convergence and adaptively regularizing selection (Eldar et al., 2010).
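As a rough illustration of residual-weighted selection, the sketch below samples rows with probability proportional to squared residual magnitude and, for simplicity, refreshes the full residual only periodically; the refresh interval and exact weighting are simplifying assumptions, not prescriptions from the cited work.

```python
import numpy as np

def residual_weighted_kaczmarz(A, b, n_iters=5000, refresh=50, seed=0):
    """Kaczmarz iteration with rows sampled proportionally to squared residuals.

    The full residual r = b - A x is recomputed every `refresh` steps; a
    production implementation would maintain it incrementally. Larger
    residuals get higher selection probability, concentrating corrections
    on the worst-satisfied equations.
    """
    m, n = A.shape
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    row_sq = np.einsum("ij,ij->i", A, A)
    r = b - A @ x
    for t in range(n_iters):
        if t % refresh == 0:
            r = b - A @ x            # periodic exact refresh of the residual
        w = r ** 2
        if w.sum() == 0:
            break                    # all equations satisfied
        i = rng.choice(m, p=w / w.sum())
        x += ((b[i] - A[i] @ x) / row_sq[i]) * A[i]
        r[i] = 0.0                   # row i is now satisfied; other entries go stale until refresh
    return x
```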

6. Convergence Analysis and Theoretical Guarantees

Theoretical analyses for regularized Kaczmarz variants are typically expressed in terms of expected or per-iteration error decay:

$$\mathbb{E}\|x_k - x^*\|^2 \leq (1 - \theta)^k \|x_0 - x^*\|^2 + \text{(error floor)},$$

where $\theta$ reflects the contraction factor, determined by matrix conditioning, regularization parameters, block size, or structure constants (e.g., for sparse/tensor Kaczmarz, the geometry of the Bregman distance induced by the regularizer is central) (Schöpfer et al., 2016, Needell et al., 2014, Wang et al., 21 Jun 2024, Henneberger et al., 2023).
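For reference, in the consistent, unregularized setting (not one of the cited works above, but the standard baseline), the classical randomized Kaczmarz bound of Strohmer and Vershynin gives an explicit instance of this contraction factor with zero error floor, assuming rows are sampled with probability proportional to $\|a_i\|^2$:

$$\mathbb{E}\|x_k - x^*\|^2 \leq \left(1 - \frac{\sigma_{\min}^2(A)}{\|A\|_F^2}\right)^{k} \|x_0 - x^*\|^2, \qquad \theta = \frac{\sigma_{\min}^2(A)}{\|A\|_F^2},$$

where $\sigma_{\min}(A)$ denotes the smallest nonzero singular value of $A$.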

Regularization shifts the "error floor" lower (or to zero, if enough iterations or sufficient penalization is applied), and, in many settings, improves the contraction factor either by stabilizing block projections or through favorable step-size tuning (Goldshlager et al., 2 Feb 2025). For inconsistent or noisy problems, convergence is to the regularized least squares or basis pursuit solution, up to a noise-dependent tolerance.

7. Practical Performance, Applications, and Algorithmic Considerations

Robustness and scalability of regularized Kaczmarz methods are demonstrated on large-scale problems:

  • In computed tomography (CT) and image reconstruction, regularized iterative row-action methods reduce streak artifacts, converge rapidly, and handle inconsistent/view-limited data (Ferreira et al., 5 Jan 2024, Needell et al., 2014).
  • In compressed sensing and sparse recovery, regularized sparse Kaczmarz algorithms reconstruct sparse vectors or images efficiently, and are especially effective with randomized selection and/or surrogate hyperplane projections (Schöpfer et al., 2016, Wang et al., 21 Jun 2024).
  • Regularized block Kaczmarz methods outperform vanilla RBK and mini-batch SGD in neural (e.g., natural gradient) optimization and ill-posed systems with sharply decaying singular spectra, as the regularization keeps the variance and conditioning favorable, and enables rapid tail-averaged convergence (Goldshlager et al., 2 Feb 2025).
  • In high-order tensor problems (e.g., video inpainting, multiway data unmixing), log-sum or nuclear norm regularized tensor Kaczmarz forms unify stochastic updates with convex/nonconvex structural priors, delivering state-of-the-art recovery on synthetic and real data (Henneberger et al., 2023, Chen et al., 2021).

Algorithmic choices, such as the use of batching, block memoization, and adaptive regularization parameter selection, further accelerate convergence and can be tuned to exploit the singular value spectrum for beyond-Krylov convergence in ill-conditioned systems (Dereziński et al., 20 Jan 2025).


In sum, the Regularized Kaczmarz Method encompasses a family of highly tunable, theoretically grounded algorithms, adaptable through the choice of regularizer, row/block/tensor selection, and error-corrective strategy. Their proven convergence, variance control, and flexibility make them a central tool for modern large-scale, noisy, and structurally constrained inverse problems in computational mathematics, imaging, and data science.
