Least-Squares Minimal Residual (LSMR)

Updated 21 October 2025
  • LSMR is an iterative Krylov subspace method based on Golub–Kahan bidiagonalization that efficiently solves large-scale, sparse least-squares problems.
  • It guarantees a monotonic decrease in the dual residual, enabling stable early termination and effective regularization for ill-posed systems.
  • LSMR supports advanced preconditioning and is widely applied in MR imaging, inverse problems, and large-scale data fitting.

The Least-Squares Minimal Residual (LSMR) algorithm is an iterative Krylov subspace method designed to solve large, sparse or structured linear systems and least-squares problems of the form $\min_x \|Ax - b\|_2$, where $A$ is typically non-square and either sparse or a fast operator. The method is based on the Golub–Kahan bidiagonalization process and is mathematically equivalent to solving the normal equations $A^{\mathrm{T}}A x = A^{\mathrm{T}}b$ using the MINRES method. LSMR guarantees monotonic decrease of the dual residual $\|A^{\mathrm{T}} r_k\|$ (with $r_k = b - Ax_k$), which has important implications for stable early termination and algorithmic regularization. LSMR has emerged as a preferred iterative solver in computational inverse problems, regularized imaging, and large-scale data fitting across scientific domains.
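
For concreteness, the sketch below shows this problem setting with SciPy's reference implementation, scipy.sparse.linalg.lsmr; the problem size, sparsity, noise level, seed, and tolerances are arbitrary choices for illustration.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(0)

# Overdetermined sparse system: A is m x n with m > n.
m, n = 2000, 500
A = sparse_random(m, n, density=0.01, random_state=rng, format="csr")
x_true = rng.standard_normal(n)
b = A @ x_true + 1e-3 * rng.standard_normal(m)    # noisy right-hand side

# lsmr iteratively minimizes ||A x - b||_2, touching A only through A @ v and A.T @ u.
x, istop, itn, normr, normar = lsmr(A, b, atol=1e-8, btol=1e-8)[:5]

print(f"stop flag = {istop}, iterations = {itn}")
print(f"primal residual ||r||    = {normr:.3e}")
print(f"dual residual ||A^T r||  = {normar:.3e}")
```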

1. Algorithmic Foundations and Mathematical Structure

LSMR is fundamentally built upon the Golub–Kahan bidiagonalization. At each iteration $k$, the method constructs two orthonormal sequences $\{u_i\}$ and $\{v_i\}$ starting from $u_1 = b/\|b\|_2$ and $v_1 = A^{\mathrm{T}} u_1/\|A^{\mathrm{T}} u_1\|_2$, with subsequent recurrences:

$$A v_k = \alpha_k u_k + \beta_{k+1} u_{k+1}$$

$$A^{\mathrm{T}} u_{k+1} = \beta_{k+1} v_k + \alpha_{k+1} v_{k+1}$$

Two successive QR factorizations are applied to the resulting bidiagonal matrix at each step. This reduces the high-dimensional least-squares problem to a structured subproblem involving only the bidiagonal matrices. The update to the approximate solution $x_k$ is obtained by solving for $y_k$ in the projected subproblem and setting $x_k = V_k y_k$, where $V_k = [v_1, \dots, v_k]$ is the basis generated by the bidiagonalization.
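
The bidiagonalization itself is straightforward to sketch. The helper below is an illustrative, dense NumPy implementation (it stores all vectors and assumes no breakdown; a practical solver keeps only the most recent vectors) that runs the recurrences above and verifies the factorization relation $A V_k = U_{k+1} B_k$:

```python
import numpy as np

def golub_kahan(A, b, k):
    """k steps of Golub-Kahan bidiagonalization of A started from b.

    Returns U (m x (k+1)) and V (n x k) with orthonormal columns and the
    (k+1) x k lower-bidiagonal matrix B such that A @ V = U @ B.
    Assumes no breakdown (all alpha_j, beta_j nonzero).
    """
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k))
    alphas = np.zeros(k)
    betas = np.zeros(k + 1)

    betas[0] = np.linalg.norm(b)
    U[:, 0] = b / betas[0]

    for j in range(k):
        # alpha_j v_j = A^T u_j - beta_j v_{j-1}
        w = A.T @ U[:, j] - (betas[j] * V[:, j - 1] if j > 0 else 0.0)
        alphas[j] = np.linalg.norm(w)
        V[:, j] = w / alphas[j]

        # beta_{j+1} u_{j+1} = A v_j - alpha_j u_j
        p = A @ V[:, j] - alphas[j] * U[:, j]
        betas[j + 1] = np.linalg.norm(p)
        U[:, j + 1] = p / betas[j + 1]

    B = np.zeros((k + 1, k))
    for j in range(k):
        B[j, j] = alphas[j]
        B[j + 1, j] = betas[j + 1]
    return U, V, B

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
U, V, B = golub_kahan(A, b, k=8)
print(np.allclose(A @ V, U @ B))          # factorization relation A V_k = U_{k+1} B_k
print(np.allclose(V.T @ V, np.eye(8)))    # V has orthonormal columns
```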

LSMR is formulated to minimize the norm $\|A^{\mathrm{T}} r_k\|_2$, with the residual $r_k = b - Ax_k$. The connection to MINRES on the normal equations ensures that this quantity decreases monotonically. Explicit recurrence relations are employed for updating all relevant quantities in a matrix-free fashion (requiring only efficient computation of products with $A$ and $A^{\mathrm{T}}$).
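
Because only products with $A$ and $A^{\mathrm{T}}$ are required, LSMR can be driven by an abstract operator. A minimal sketch, assuming SciPy's LinearOperator interface and a toy first-difference operator chosen purely for illustration:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsmr

n = 1000

def matvec(x):
    # Forward map: first differences, D x has length n - 1.
    return x[1:] - x[:-1]

def rmatvec(y):
    # Adjoint map D^T y, length n.
    out = np.zeros(n)
    out[:-1] -= y
    out[1:] += y
    return out

D = LinearOperator((n - 1, n), matvec=matvec, rmatvec=rmatvec)

rng = np.random.default_rng(2)
b = rng.standard_normal(n - 1)

# LSMR never forms D explicitly; it only calls matvec and rmatvec.
x, istop, itn, normr, normar = lsmr(D, b, atol=1e-10, btol=1e-10)[:5]
print(f"istop = {istop}, iterations = {itn}, ||D^T r|| = {normar:.3e}")
```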

2. Relation to LSQR, CGME, and Other Krylov Methods

LSMR, LSQR, CGME, and MINRES share algorithmic ancestry in Krylov subspace techniques:

| Method | Equivalent Operator | Monotonic Quantities | Regularization Role |
| --- | --- | --- | --- |
| LSQR | CG on $A^{\mathrm{T}}A$ | $\lVert r_k \rVert$ | iteration count |
| LSMR | MINRES on $A^{\mathrm{T}}A$ | $\lVert A^{\mathrm{T}} r_k \rVert$ (and often $\lVert r_k \rVert$) | iteration count |
| CGME | CG on $AA^{\mathrm{T}}$ | no monotonic guarantee | iteration count |
| MINRES | symmetric $A^{\mathrm{T}}A$ | norm of residual in induced inner product | iteration count |

Unlike LSQR, which ensures monotonic reduction only in the primal residual $\|r_k\|$, LSMR ensures this for the dual residual $\|A^{\mathrm{T}} r_k\|$, and in practice for $\|r_k\|$ as well. Extensions of LSMR such as flexible modified variants (FMLSMR) (Yang et al., 29 Aug 2024) and hybrid LSMR (Yang, 13 Sep 2024) add further numerical and regularization robustness, including reduced cost per iteration and improved conditioning of the inner problems.
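
The contrast in monotonic behavior can be checked empirically. The sketch below is an illustration on a synthetic ill-conditioned matrix (the helper dual_residual_history and all parameters are arbitrary choices); it records $\|A^{\mathrm{T}} r_k\|$ for LSQR and LSMR by re-running each SciPy solver with an increasing iteration limit and the built-in stopping tests disabled:

```python
import numpy as np
from scipy.sparse.linalg import lsqr, lsmr

rng = np.random.default_rng(3)

# Moderately ill-conditioned test matrix with a prescribed singular spectrum.
U, _ = np.linalg.qr(rng.standard_normal((300, 80)))
V, _ = np.linalg.qr(rng.standard_normal((80, 80)))
A = (U * np.logspace(0, -5, 80)) @ V.T
b = rng.standard_normal(300)          # inconsistent right-hand side

def dual_residual_history(solver, iter_kw, kmax=40):
    """Return ||A^T r_k|| after exactly k iterations, for k = 1..kmax."""
    hist = []
    for k in range(1, kmax + 1):
        # Disable the built-in stopping tests so exactly k iterations are run.
        x = solver(A, b, atol=0.0, btol=0.0, conlim=0.0, **{iter_kw: k})[0]
        r = b - A @ x
        hist.append(np.linalg.norm(A.T @ r))
    return np.array(hist)

lsqr_hist = dual_residual_history(lsqr, "iter_lim")
lsmr_hist = dual_residual_history(lsmr, "maxiter")

# Theory: LSMR's ||A^T r_k|| decreases monotonically; LSQR's need not.
tol = 1e-10 * lsmr_hist[0]
print("LSMR dual residual monotone:", bool(np.all(np.diff(lsmr_hist) <= tol)))
print("LSQR dual residual monotone:", bool(np.all(np.diff(lsqr_hist) <= tol)))
```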

Sharp theoretical bounds established in multiple studies (Jia, 2016; Jia, 2018) show that for severely and moderately ill-posed problems (where the singular values of $A$ decay rapidly), LSMR approximates the best rank-$k$ truncation of $A$ at each step and its regularized solution at semi-convergence is as accurate as the best truncated SVD (TSVD) regularization. For mildly ill-posed problems, LSMR has only partial regularization and hybrid methods are recommended.
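
A small numerical illustration of this comparison, using a synthetic severely ill-posed problem with geometrically decaying singular values (all sizes, decay rates, and noise levels below are arbitrary choices for the sketch):

```python
import numpy as np
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(4)
n = 200

# Severely ill-posed synthetic problem: geometrically decaying singular values.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 0.7 ** np.arange(n)
A = (U * s) @ V.T

x_true = V @ np.sqrt(s)                       # smooth exact solution
b_exact = A @ x_true
noise = rng.standard_normal(n)
b = b_exact + 1e-4 * np.linalg.norm(b_exact) * noise / np.linalg.norm(noise)

# Best achievable TSVD error over all truncation levels k.
UTb = U.T @ b
tsvd_err = [
    np.linalg.norm(V[:, :k] @ (UTb[:k] / s[:k]) - x_true) / np.linalg.norm(x_true)
    for k in range(1, n + 1)
]

# Best LSMR error over the first 60 iterations (the semi-convergence minimum).
lsmr_err = [
    np.linalg.norm(lsmr(A, b, atol=0.0, btol=0.0, conlim=0.0, maxiter=k)[0] - x_true)
    / np.linalg.norm(x_true)
    for k in range(1, 61)
]

print(f"best TSVD relative error: {min(tsvd_err):.3e}")
print(f"best LSMR relative error: {min(lsmr_err):.3e}")
```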

3. Monotonicity, Stopping Criteria, and Stability

A critical feature of LSMR is the monotonic decrease of $\|A^{\mathrm{T}} r_k\|$, which translates into strong stability properties. The stopping criterion based on the backward error,

$$\|A^{\mathrm{T}} r_k\| < \mathrm{ATOL} \cdot \|A\| \, \|r_k\|,$$

is therefore safer and more predictable in LSMR than in LSQR. This property avoids overshooting and erratic early stopping, which can be particularly problematic in inconsistent or noisy systems.
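
In SciPy's implementation this criterion corresponds to the atol parameter: lsmr reports istop = 2 when its internal estimates satisfy the test above. A brief sketch (problem sizes, seed, and ATOL chosen arbitrarily):

```python
import numpy as np
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(5)
A = rng.standard_normal((500, 120))
b = rng.standard_normal(500)     # inconsistent system: the optimal residual is nonzero

ATOL = 1e-10
x, istop, itn, normr, normar, norma, conda, normx = lsmr(A, b, atol=ATOL, btol=0.0)

# istop == 2 signals that the backward-error test was met:
#   ||A^T r|| <= ATOL * ||A|| * ||r||,
# where norma and normr are the solver's internal estimates of ||A|| and ||r||.
print(f"istop = {istop}, iterations = {itn}")
print(f"||A^T r|| = {normar:.3e}  vs  ATOL*||A||*||r|| = {ATOL * norma * normr:.3e}")
```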

In practice, monotonic behavior is also typically observed in the primal residual $\|r_k\|$, even though strict monotonicity of $\|r_k\|$ is guaranteed theoretically only for LSQR. Backward error estimates computed within LSMR serve as reliable proxies for solution optimality and termination (Fong et al., 2010; Wood, 2022).

4. Regularization and Spectral Filtering Properties

LSMR plays a central role in regularizing discrete ill-posed problems, where the number of iterations $k$ acts as a regularization parameter. At each iteration, the algorithm projects $A^{\mathrm{T}}A$ onto a Krylov subspace, so that only dominant singular components (those associated with large singular values) are retained in the solution. This "spectral filtering" effect is mathematically quantified by sharp bounds on the rank-$k$ approximation error

$$\sigma_{k+1} \leq \gamma_k \leq \sqrt{1+\eta_k^2}\,\sigma_{k+1}$$

where $\gamma_k$ is the error of the projection at step $k$, and $\sigma_{k+1}$ is the $(k+1)$-st singular value of $A$. For severe and moderate ill-posedness, this filtering is optimal or near-optimal (Jia, 2016; Jia, 2018).

Noise propagation in the residuals is explicitly characterized via the coefficients of linear combinations of the left bidiagonalization vectors, drawing on formulae of the form

$$r_k^{(\mathrm{LSMR})} = \sum_{l=0}^{k} [\cdots]\, s_{l+1},$$

with coefficients weighted by amplification factors and recurrence parameters that reflect the degree to which each vector is contaminated by noise (Hnětynková et al., 2016). This structure allows for a detailed analysis of the regularization effect and of noise control across iterations.

5. Preconditioning and Extensions

Preconditioning is frequently used to accelerate convergence of LSMR, particularly in rank-deficient or highly ill-conditioned cases. The symmetric splitting approach (Morikuni, 2015) wraps stationary iterative steps (e.g., Jacobi, SSOR) around each LSMR iteration, using the splitting $A^{\mathrm{T}}A = M - N$ with iteration matrix $H = M^{-1}N$. Performing $\ell$ inner iterations raises the effective spectral radius to its $\ell$-th power, leading to the improved convergence estimate

$$\|\hat{r}_k\|_2 \leq \min\left\{ \nu(H)^{k\ell},\ 2\left[\frac{\sqrt{\kappa^{(\ell)}}-1}{\sqrt{\kappa^{(\ell)}}+1}\right]^{k} \right\} \|\hat{r}_0\|_2$$

with $\nu(H)$ the pseudo-spectral radius. This makes LSMR robust even for singular $A$, with theoretical guarantees for least-squares and minimum-norm solutions.
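
The inner–outer scheme of (Morikuni, 2015) is more elaborate than can be reproduced here; the sketch below instead illustrates the simpler, common case of right preconditioning, applying LSMR to $A M^{-1}$ with a Jacobi-type (column-norm) scaling as a stand-in preconditioner. All sizes and the choice of preconditioner are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import random as sparse_random, diags
from scipy.sparse.linalg import LinearOperator, lsmr

rng = np.random.default_rng(6)
m, n = 3000, 800

# Sparse test matrix with badly scaled columns (a simple source of ill-conditioning).
A = sparse_random(m, n, density=0.005, random_state=rng, format="csr")
A = A @ diags(np.logspace(0, -6, n))
b = rng.standard_normal(m)

# Jacobi-type right preconditioner M = diag(||a_j||_2); LSMR is applied to A M^{-1}.
col_norms = np.sqrt(np.asarray(A.power(2).sum(axis=0)).ravel())
col_norms[col_norms == 0] = 1.0
AMinv = LinearOperator(
    (m, n),
    matvec=lambda y: A @ (y / col_norms),
    rmatvec=lambda u: (A.T @ u) / col_norms,
)

y = lsmr(AMinv, b, atol=1e-10, btol=1e-10)[0]
x_prec = y / col_norms                       # map back: x = M^{-1} y
x_plain = lsmr(A, b, atol=1e-10, btol=1e-10)[0]

for label, x in [("right-preconditioned", x_prec), ("unpreconditioned", x_plain)]:
    r = b - A @ x
    print(f"{label:>20}: ||A^T r|| = {np.linalg.norm(A.T @ r):.3e}")
```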

Flexible and modified versions of LSMR (such as FMLSMR) (Yang et al., 29 Aug 2024) further reduce the cost per iteration, enabling the use of nonstationary or adaptive preconditioners: only one inner linear system involving $M = L^{\mathrm{T}} L$ is solved per iteration, and the preconditioner can vary adaptively from step to step.

Hybrid LSMR algorithms (Yang, 13 Sep 2024) incorporate explicit regularization operators $L$ (for derivative constraints, total variation, etc.), projecting $A$ onto Krylov subspaces, solving reduced regularized problems, and employing LSQR for the inner solves, whose conditioning systematically improves as the Krylov subspace dimension $k$ increases.
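
As a much-simplified, standard-form analogue (regularization operator $L = I$ rather than the general operators used by the hybrid methods), SciPy's lsmr exposes a damp parameter that solves the Tikhonov-damped problem $\min \|Ax - b\|_2^2 + \lambda^2 \|x\|_2^2$. The synthetic problem and damping values below are arbitrary choices.

```python
import numpy as np
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(7)
n = 200

# Ill-posed synthetic problem (geometrically decaying singular values) with noisy data.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 0.8 ** np.arange(n)
A = (U * s) @ V.T
x_true = V @ np.exp(-0.05 * np.arange(n))
b = A @ x_true
b = b + 1e-3 * np.linalg.norm(b) / np.sqrt(n) * rng.standard_normal(n)

# Sweep the damping parameter; lsmr minimizes ||A x - b||^2 + damp^2 * ||x||^2.
for damp in [0.0, 1e-4, 1e-3, 1e-2]:
    x = lsmr(A, b, damp=damp, atol=1e-12, btol=1e-12)[0]
    err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    print(f"damp = {damp:.0e}: relative error = {err:.3e}")
```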

6. Applications in Large-Scale Inverse Problems and Imaging

LSMR excels in large and sparse applications, including MR image reconstruction (Wood, 2022), where the imaging model naturally leads to large, severely ill-conditioned least-squares problems and direct inversion is impractical. LSMR leverages matrix-free operations and its favorable monotonicity properties to improve numerical stability and ensure lower image-space residuals, especially when techniques such as Toeplitz embedding for CG acceleration are unavailable or impractical.

In non-Cartesian MRI, 2D/3D data, and other scientific inverse problems, the ability of LSMR to avoid explicit formation of the normal equations (thereby preserving the condition number of $A$) and to facilitate robust early termination translates into significant computational and numerical advantages.
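
The condition-number point can be seen directly: explicitly forming $A^{\mathrm{T}}A$ squares the condition number, whereas LSMR only ever applies $A$ and $A^{\mathrm{T}}$. A quick check on a synthetic matrix (sizes and scaling arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((500, 100)) @ np.diag(np.logspace(0, -4, 100))

# Forming the normal equations squares the condition number;
# LSMR avoids this by operating on A and A^T directly.
print(f"cond(A)     = {np.linalg.cond(A):.2e}")
print(f"cond(A^T A) = {np.linalg.cond(A.T @ A):.2e}")
```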

Further, in numerical PDE discretizations, LSMR-type minimal-residual formulations can be enhanced by neural-controlled weights (Brevis et al., 2022), allowing stability and quasi-optimality in finite element contexts.

7. Comparative Analysis With Randomized Preconditioning and Direct Solvers

Compared to randomized preconditioned normal equations approaches (Ipsen, 24 Jul 2025), LSMR does not require explicit preconditioner construction but can suffer from slow convergence when $A$ is extremely ill-conditioned. Randomized preconditioning can reduce the effective condition number to near optimal ($\kappa(A_p) \approx 1$), with solution accuracy similar to QR-based solvers. In practice, LSMR remains competitive when preconditioners are expensive or impractical to compute, and supports adaptive accuracy control via regularization and Krylov filtering.

| Approach | Accuracy for ill-conditioned $A$ | Preconditioner cost | Suitability |
| --- | --- | --- | --- |
| LSMR (Krylov) | High, regularized by iteration | None (iterative) | Sparse, large, or operator-defined |
| Randomized precond. normal eq. | High with an effective preconditioner | Upfront (sampling) | Full-rank, dense; costly preconditioner |
| QR-based (direct) | Highest, but most expensive | None | Small, dense |

A plausible implication is that, for very large, ill-posed, or operator-defined problems where direct or randomized preconditioning is prohibitive, LSMR (and its flexible/hybrid variants) remains the method of choice due to minimal memory requirements, strong monotonicity properties, and robust error control.
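
For comparison, the sketch below illustrates the generic sketch-and-precondition idea (a simplified illustration, not the specific algorithm of Ipsen, 24 Jul 2025): a QR factorization of a Gaussian sketch $SA$ supplies a triangular right preconditioner $R^{-1}$ for LSMR. Sizes, sketch dimension, and tolerances are arbitrary assumptions.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular
from scipy.sparse.linalg import LinearOperator, lsmr

rng = np.random.default_rng(8)
m, n = 4000, 200

# Ill-conditioned dense test matrix (singular values spanning 8 orders of magnitude).
A = rng.standard_normal((m, n)) @ np.diag(np.logspace(0, -8, n))
b = rng.standard_normal(m)

# Sketch-and-precondition: QR of a Gaussian sketch S @ A yields the preconditioner R.
k_sketch = 4 * n
S = rng.standard_normal((k_sketch, m)) / np.sqrt(k_sketch)
_, R = qr(S @ A, mode="economic")

# Right-preconditioned operator A R^{-1}; its condition number is O(1), independent of cond(A).
ARinv = LinearOperator(
    (m, n),
    matvec=lambda y: A @ solve_triangular(R, y),
    rmatvec=lambda u: solve_triangular(R, A.T @ u, trans="T"),
)

y, _, itn_prec = lsmr(ARinv, b, atol=1e-12, btol=1e-12)[:3]
x = solve_triangular(R, y)                  # recover x = R^{-1} y
itn_plain = lsmr(A, b, atol=1e-12, btol=1e-12)[2]

print(f"LSMR iterations: preconditioned = {itn_prec}, unpreconditioned = {itn_plain}")
```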

References to Key Results

LSMR has demonstrated effective performance and theoretical guarantees as a general-purpose least squares solver, particularly for large-scale, sparse, or ill-posed systems where robust iterative regularization and stable early termination are essential.
