
Sketched GMRES: Randomized Krylov Solvers

Updated 8 July 2025
  • Sketched GMRES is a Krylov subspace method that employs randomized sketching to compress computations and mitigate the expense of full orthogonalization in large nonsymmetric linear systems.
  • It replaces exact orthogonalization with low-dimensional randomized projections, preserving key geometric and spectral features for efficient least-squares minimization.
  • Recent extensions include block, tensor, and recycling variants that boost practical scalability and maintain backward stability in high-dimensional scientific computing.

Sketched GMRES refers to a family of Krylov subspace methods for solving large (and often nonsymmetric) linear systems that combine the GMRES framework with randomized sketching or dimensionality reduction, yielding improved efficiency, lower memory cost, and potentially enhanced scalability compared to classical implementations. The core idea is to replace exact, often communication-intensive orthogonalization—and sometimes the least-squares problem at each outer iteration—with randomized projections or compressed computations that preserve essential geometric or spectral features while reducing arithmetic and communication overhead. Recent developments extend this approach to block, tensor, and recycling settings, with rigorous theoretical analyses on stability and convergence matching the needs of large-scale scientific computing environments.

1. Algorithmic Principles of Sketched GMRES

Sketched GMRES modifies the classical GMRES method by introducing a random sketching operator into the construction of the Krylov basis or the residual minimization. In standard GMRES, an orthonormal basis $V_m$ for the Krylov subspace is generated, and the update

$$x_m = x_0 + V_m y_m, \quad y_m = \arg \min_{y} \| A V_m y - r_0 \|,$$

is computed. Critically, this necessitates storing and orthogonalizing all basis vectors, which is prohibitively expensive for large $n$.

In sketched GMRES, an embedding (or sketching) matrix $S \in \mathbb{C}^{s \times n}$ is drawn so that for vectors $v$ in the search space,

$$(1-\epsilon)\|v\|^2 \leq \| S v \|^2 \leq (1+\epsilon)\|v\|^2$$

with $s \ll n$. The minimization becomes

$$y_m = \arg \min_y \| S(AV_m y - r_0) \|,$$

or, equivalently, the orthogonalization and solution steps can be performed entirely in the reduced (sketched) space (2111.00113, 2208.11447). This allows for the use of possibly nonorthogonal or partially orthogonal bases (e.g., from truncated Arnoldi), with sketching compensating for the weaker basis properties. Extensions exist for block methods, s-step variants, and settings involving matrix functions or tensor problems (2409.09471).
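The reduced minimization described above can be illustrated with a short, self-contained NumPy sketch (a hypothetical illustration under simplifying assumptions — $x_0 = 0$, a dense Gaussian embedding, a single cycle — not code from the cited works): a $k$-truncated Arnoldi loop builds a possibly nonorthogonal basis, and the least-squares problem is solved entirely through the sketched quantities $S A V_m$ and $S r_0$.

```python
import numpy as np

def sketched_gmres(A, b, m=30, k=2, s=None, seed=0):
    """One cycle of sketched GMRES (minimal illustrative sketch).

    Builds a Krylov basis with k-truncated Arnoldi (each new vector is
    orthogonalized only against the last k), then minimizes the *sketched*
    residual || S (A V y - b) || in the reduced space. Assumes x0 = 0.
    """
    n = b.size
    s = s or 4 * (m + 1)                             # embedding dimension, s << n
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((s, n)) / np.sqrt(s)     # dense Gaussian embedding

    V = np.zeros((n, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(1, m):
        w = A @ V[:, j - 1]
        for i in range(max(0, j - k), j):            # truncated orthogonalization
            w -= (V[:, i] @ w) * V[:, i]
        V[:, j] = w / np.linalg.norm(w)

    # Solve the small sketched least-squares problem in the reduced space.
    SAV, Sb = S @ (A @ V), S @ b
    y, *_ = np.linalg.lstsq(SAV, Sb, rcond=None)
    return V @ y

# Demo on an easy, well-conditioned nonsymmetric system.
rng = np.random.default_rng(1)
n = 200
A = np.eye(n) + 0.05 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
x = sketched_gmres(A, b, m=30)
rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
```

Even though the truncated basis is not orthonormal, the sketched least-squares solve recovers a small residual, since the embedding approximately preserves the geometry of the search space.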

2. Randomized Orthogonalization and Subspace Embeddings

A central computational bottleneck in classical GMRES is Gram–Schmidt orthogonalization, which is highly communication- and synchronization-intensive. Modern sketched GMRES variants leverage randomized Gram–Schmidt procedures, where the core operations (such as projections and norm computations) are performed using the sketched vectors $S w_j$ (2011.05090):

  • New basis candidates are sketched and orthogonalized in the compressed space.
  • Least-squares or QR factorizations are computed only in the reduced dimension, often with sketch size $s = O(d/\epsilon^2)$ for subspace dimension $d$.
  • This approach maintains nearly orthonormal "sketched" bases (i.e., $SQ$ is nearly orthonormal), which is sufficient for the convergence and stability of the global GMRES scheme.

Recent advances integrate sketching into block orthogonalization for s-step GMRES (2503.16717), resulting in orthogonality errors for each block bounded by machine precision (as long as the blocks are numerically full rank). Specific sketching schemes (e.g., dense Gaussian, sparse Count sketch, hybrid Count–Gaussian) can be used to reduce communication volume in distributed environments.
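As an illustration of a communication-light sketching scheme, the following hypothetical NumPy snippet implements a CountSketch-style operator: each coordinate is hashed into one of $s$ buckets with a random sign, so applying the sketch costs $O(n)$ rather than the $O(sn)$ of a dense Gaussian map. The function name and parameters are assumptions for this example, not an API from the cited works.

```python
import numpy as np

def count_sketch(n, s, seed=0):
    """Return a function applying an s x n CountSketch: each coordinate
    is hashed to one of s rows with a random sign, so S @ v costs O(n)."""
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, s, size=n)        # hash bucket for each coordinate
    signs = rng.choice([-1.0, 1.0], size=n)  # random sign flip per coordinate
    def apply(v):
        out = np.zeros(s)
        np.add.at(out, rows, signs * v)      # scatter-add into buckets
        return out
    return apply

# Sanity check: the sketch approximately preserves the norm of a vector.
rng = np.random.default_rng(1)
n, s = 10_000, 400
S = count_sketch(n, s)
v = rng.standard_normal(n)
ratio = np.linalg.norm(S(v)) / np.linalg.norm(v)
```

Because $\mathbb{E}\,\|Sv\|^2 = \|v\|^2$ for this construction, the norm ratio concentrates near 1 for moderate $s$, which is the embedding property the sketched solvers rely on.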

3. Numerical Stability and Error Analysis

Sketched methods fundamentally change the error propagation landscape. Classical GMRES’s backward stability analysis depends on the conditioning of the Krylov basis. For sketched GMRES, recent analysis demonstrates that:

  • Backward stability depends on the product $AB_{1:i}$: as long as $\kappa(AB_{1:i})$ is not too large (e.g., less than about $1/O(u)$), the method is backward stable, even if the Krylov basis itself is ill-conditioned (2503.19086).
  • Restarting can mitigate error buildup: Restarted sketched GMRES is shown to decouple error accumulation from the worst-case conditioning of the basis, maintaining small backward error cycle by cycle.
  • Sharper error bounds: In many observed cases, backward error is much smaller than the worst-case theory would predict, because sketching compresses the amplification of error due to an ill-conditioned basis.

Randomized orthogonalization can also yield nearly machine-precision orthogonality using one or two Gauss–Seidel iterations (2205.07805). In distributed-memory settings, randomized sketching enhances stability of block orthogonalization without measurably increasing execution time (2503.16717).

4. Theoretical and Practical Convergence

Convergence theory for sketched GMRES extends polynomial approximation theory from classical Krylov methods. Provided the embedding operator $S$ preserves the geometry of the subspace, the convergence rate of the compressed least-squares problem matches that of true GMRES up to a modest factor, often bounded by $(1+\epsilon)/(1-\epsilon)$ (2111.00113, 2208.11447). For example, for Stieltjes functions of positive real matrices, the error decays as

$$\|f(A)-\widetilde{f}_m\|_{A^H} \leq C_1\, C_\varepsilon\, (\sin\beta_0)^m,$$

with distortion due to sketching appearing only as a pre-factor (2208.11447).

For random matrix ensembles (e.g., Ginibre matrices), average-case sketched GMRES convergence can be characterized exactly, and worst-case bounds are established using pseudospectra and the numerical range (2303.02042). Empirical results confirm that, for well-designed sketches, practical convergence loss is modest compared to the dramatic gains in efficiency for large $n$.
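The modest-distortion claim can be checked numerically. The following hypothetical NumPy sketch compares the exact least-squares residual over a subspace (standing in for $AV_m$) with the residual obtained by solving only the sketched problem; for a Gaussian embedding with $s$ a few times the subspace dimension, the residual ratio stays close to 1, well within the $(1+\epsilon)/(1-\epsilon)$-type bound.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 2000, 20, 200
B = rng.standard_normal((n, d))                # stand-in for the basis A V_m
b = rng.standard_normal(n)
S = rng.standard_normal((s, n)) / np.sqrt(s)   # Gaussian subspace embedding

# Exact least squares vs. least squares in the sketched space.
y_exact, *_ = np.linalg.lstsq(B, b, rcond=None)
y_sketch, *_ = np.linalg.lstsq(S @ B, S @ b, rcond=None)

r_exact = np.linalg.norm(B @ y_exact - b)      # true minimal residual
r_sketch = np.linalg.norm(B @ y_sketch - b)    # residual of sketched solution
ratio = r_sketch / r_exact                     # >= 1, near 1 for good embeddings
```

The sketched coefficients are suboptimal for the original problem, but only by a small multiplicative factor, mirroring the convergence theory above.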

5. Enhanced Scalability and Parallel Performance

The primary motivation for sketched GMRES is improved efficiency and scalability. Key points include:

  • Reduced arithmetic complexity: Resource requirements drop from $O(nd^2)$ in classical GMRES to $O(d^3 + nd\log d)$ when solving the sketched least-squares problem (2111.00113). This reduction is particularly vital when orthogonalization dominates cost.
  • Communication minimization: By concentrating computation in the reduced or sketched space—and often batching communication events—parallel scalability on clusters or GPU-based systems is greatly enhanced (2503.16717, 2001.04886).
  • Memory footprint: Sketching enables the use of partially orthogonal or even unorthogonalized bases (e.g., via truncated Arnoldi), with the sketch itself providing sufficient conditioning for least-squares solves, significantly lowering storage costs.
  • Flexibility for recycling and tensor applications: Deflated and recycled Krylov spaces can be efficiently compressed and propagated in time-dependent or multi right-hand-side problems (e.g., GMRES-SDR, (2311.14206)). In high-dimensional tensor solvers, sketched GMRES efficiently combines incomplete orthogonalization and streaming tensor rounding (2409.09471).

6. Extensions: Mixed Precision, Preconditioning, and Recycling

Recent research demonstrates the compatibility of sketching with other algorithmic trends:

  • Mixed-precision algorithms: Preconditioners constructed using sketched QR factorizations in low (or mixed) precision can be integrated into GMRES-based iterative refinement for least-squares, with theoretical and empirical validation that they reach working-precision errors as long as conditioning permits (2410.06319).
  • Flexible and Nested Schemes: Flexible GMRES (FGMRES) can robustly wrap around a sketched GMRES inner solver, so that monotonic residual decrease and stability are guaranteed even when the inner solver stagnates or becomes unstable (2506.18408).
  • Block/Panel and Tensor Structures: Randomized sketching can be combined with s-step block methods and advanced tensor formats (such as TT/CP), using structured sketches and streaming rounding algorithms to maintain low rank or memory while leveraging the Krylov structure for convergence (2503.16717, 2409.09471).
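As a minimal illustration of the sketched-QR preconditioning idea (a sketch under stated assumptions, not the algorithm of the cited work), one can compute the QR factorization of the small sketched matrix $SA$ and use its triangular factor $R$ as a right preconditioner; the subspace-embedding property then keeps $\kappa(AR^{-1})$ close to 1 even when $A$ itself is severely ill-conditioned.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, s = 5000, 50, 400
# Tall, deliberately ill-conditioned matrix (condition number ~ 1e8).
A = rng.standard_normal((m, d)) @ np.diag(np.logspace(0, 8, d))
S = rng.standard_normal((s, m)) / np.sqrt(s)   # Gaussian embedding

# QR of the small sketched matrix; R acts as a right preconditioner.
_, R = np.linalg.qr(S @ A)
AR = A @ np.linalg.inv(R)        # explicit inverse only for this small demo

cond_A = np.linalg.cond(A)       # huge
cond_AR = np.linalg.cond(AR)     # close to 1 by the embedding property
```

In a practical solver one would apply $R^{-1}$ by triangular solves rather than forming the inverse; the point of the demo is that the preconditioned operator is well-conditioned, so an iterative method on $AR^{-1}$ converges rapidly regardless of $\kappa(A)$.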

7. Limitations and Comparisons with Classical Methods

Sketched GMRES achieves high efficiency at the potential cost of:

  • Loss of exact orthogonality: Inexact or truncated bases rely on the sketch as a geometric corrector, which can occasionally fail if the corresponding sketching dimension is too small or the basis becomes extremely ill-conditioned.
  • Parameter tuning: While recent theoretical advancements provide bounds and guidelines for embedding dimension and truncation, some parameter selection may still be required, especially for challenging or highly non-normal problems.
  • Accuracy tradeoff: Residual norms and solution accuracy are bounded to within a modest factor (e.g., ≤6 for ε = 1/√2) of the unsketched method, which may not always suffice in highly ill-conditioned applications (2111.00113).

However, these disadvantages are often offset by considerable speedups (10–100× in empirical tests), strong memory and communication savings, and robust backward stability for a wide range of applications.


In conclusion, sketched GMRES represents a well-founded and practically impactful paradigm for accelerating Krylov subspace solvers in the era of large-scale, memory- and communication-bound scientific computing. With algorithmic variants encompassing randomized orthogonalization, block and s-step variants, mixed-precision/sketched recycling, tensor-specific adaptations, and state-of-the-art numerical stability guarantees, sketched GMRES continues to broaden the envelope of feasible computations and maintain relevance for new generations of computational problems.