
Solving Dense Linear Systems Faster Than via Preconditioning (2312.08893v2)

Published 14 Dec 2023 in cs.DS, cs.LG, cs.NA, math.NA, and math.OC

Abstract: We give a stochastic optimization algorithm that solves a dense $n\times n$ real-valued linear system $Ax=b$, returning $\tilde x$ such that $\|A\tilde x-b\|\leq \epsilon\|b\|$ in time: $$\tilde O((n^2+nk^{\omega-1})\log 1/\epsilon),$$ where $k$ is the number of singular values of $A$ larger than $O(1)$ times its smallest positive singular value, $\omega < 2.372$ is the matrix multiplication exponent, and $\tilde O$ hides a poly-logarithmic in $n$ factor. When $k=O(n^{1-\theta})$ (namely, $A$ has a flat-tailed spectrum, e.g., due to noisy data or regularization), this improves on both the cost of solving the system directly, as well as on the cost of preconditioning an iterative method such as conjugate gradient. In particular, our algorithm has an $\tilde O(n^2)$ runtime when $k=O(n^{0.729})$. We further adapt this result to sparse positive semidefinite matrices and least squares regression. Our main algorithm can be viewed as a randomized block coordinate descent method, where the key challenge is simultaneously ensuring good convergence and fast per-iteration time. In our analysis, we use theory of majorization for elementary symmetric polynomials to establish a sharp convergence guarantee when coordinate blocks are sampled using a determinantal point process. We then use a Markov chain coupling argument to show that similar convergence can be attained with a cheaper sampling scheme, and accelerate the block coordinate descent update via matrix sketching.

Citations (5)

Summary

  • The paper introduces a stochastic algorithm that solves dense linear systems faster than both direct methods and preconditioned iterative solvers when the spectrum has a flat tail.
  • It casts the solver as a randomized block coordinate descent (block Kaczmarz-type) method whose convergence hinges on sampling coordinate blocks from a determinantal point process, with matrix sketching to speed up each update.
  • It achieves an $\tilde O(n^2)$ runtime, near-linear in the input size, whenever the number $k$ of large singular values is $O(n^{0.729})$, improving performance in machine learning and scientific computing workloads.

Overview of Faster Linear System Solving Techniques

The paper introduces a stochastic optimization algorithm that solves dense $n \times n$ real-valued linear systems $Ax = b$ faster than preconditioning-based approaches. The speedup is most pronounced for matrices whose spectrum contains only a small number $k$ of singular values that are much larger than the smallest positive singular value, i.e., a flat-tailed spectrum of the kind produced by noisy data or regularization.
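
To make the spectral condition concrete, here is a minimal Python sketch that counts $k$ for a given matrix. The constant C = 10 is an arbitrary stand-in for the paper's unspecified $O(1)$ factor, and the full SVD is used purely for illustration (computing it costs more than solving the system, so the algorithm never does this).

```python
import numpy as np

def count_large_singular_values(A, C=10.0):
    """Count k: the number of singular values of A exceeding C times
    its smallest positive singular value. C = 10 is an arbitrary
    stand-in for the paper's O(1) constant."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
    s_min = s[s > 1e-12 * s[0]].min()       # smallest positive singular value
    return int(np.sum(s > C * s_min))
```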

Algorithmic Innovations

The proposed algorithm can be viewed as a randomized block coordinate descent method, closely related to randomized block Kaczmarz. Its efficiency rests on two ingredients: coordinate blocks are sampled from a determinantal point process (DPP), which drives fast convergence, and each block update is accelerated with matrix sketching to keep the per-iteration cost low. A simplified version of the basic iteration is sketched below.
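
The following is a minimal Python/NumPy sketch of a plain randomized block Kaczmarz iteration, not the paper's algorithm: it samples row blocks uniformly and solves each block exactly via a pseudoinverse, whereas the paper samples blocks from a DPP-like distribution and replaces the exact block solve with a sketched update. The block size m and iteration cap are arbitrary illustrative choices.

```python
import numpy as np

def block_kaczmarz(A, b, m=64, iters=500, tol=1e-8, seed=None):
    """Plain randomized block Kaczmarz: at each step, project the iterate
    onto the solution set of a random block of m equations.
    (Uniform block sampling and the exact pseudoinverse solve are
    simplifications of the paper's DPP sampling and sketched updates.)"""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(A.shape[1])
    b_norm = np.linalg.norm(b)
    for _ in range(iters):
        S = rng.choice(n, size=m, replace=False)  # random row block
        r = b[S] - A[S] @ x                       # residual on the block
        x += np.linalg.pinv(A[S]) @ r             # x <- x + A_S^+ (b_S - A_S x)
        if np.linalg.norm(b - A @ x) <= tol * b_norm:
            break
    return x
```

Each update zeroes out the residual on the sampled block; how fast the overall residual $\|Ax - b\|$ shrinks depends on how the sampling distribution interacts with the spectrum of $A$, which is exactly where the DPP analysis enters.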

Analysis Strategy

The analysis combines two probabilistic tools. First, the theory of majorization for elementary symmetric polynomials yields a sharp convergence guarantee when coordinate blocks are sampled from a determinantal point process. Second, a Markov chain coupling argument shows that nearly the same convergence is attained by a much cheaper Markov chain Monte Carlo sampler, provided the system is first preprocessed with a randomized Hadamard transform (RHT); matrix sketching then reduces the cost of each block update. A sketch of the RHT preprocessing step follows.
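
Below is a minimal Python sketch of RHT preprocessing under simplifying assumptions: it requires the dimension n to be a power of 2 and forms the Hadamard matrix densely, whereas a practical implementation would pad the system and apply the fast Walsh-Hadamard transform, costing $O(n^2 \log n)$ for an $n \times n$ matrix. The function name and interface are illustrative, not the paper's.

```python
import numpy as np
from scipy.linalg import hadamard

def rht_preprocess(A, b, seed=None):
    """Randomized Hadamard transform: replace (A, b) with (HDA, HDb),
    where D is a random diagonal sign matrix and H an orthonormal
    Hadamard matrix. Since HD is orthogonal, the solution set of
    Ax = b is unchanged, but the transformed rows are "flattened",
    letting a cheap block sampler mimic the behavior of DPP sampling.
    Assumes A.shape[0] is a power of 2 (a simplification)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)  # random diagonal D
    H = hadamard(n) / np.sqrt(n)             # orthonormal Hadamard matrix
    return H @ (signs[:, None] * A), H @ (signs * b)
```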

Discussion of the Results

The headline result is an $\tilde O(n^2)$ runtime, near-linear in the size of the input matrix, whenever the number $k$ of large singular values satisfies $k = O(n^{0.729})$; even for larger $k$, the $\tilde O(n^2 + nk^{\omega-1})$ bound improves on both direct solvers and preconditioned iterative methods such as conjugate gradient. The paper also adapts the result to sparse positive semidefinite matrices and to least squares regression, with natural applications in machine learning, scientific computing, and statistics, where flat-tailed spectra arise from noisy data or regularization.
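
The $n^{0.729}$ exponent is simple arithmetic on the runtime bound: the term $nk^{\omega-1}$ stays within $\tilde O(n^2)$ exactly when $k \leq n^{1/(\omega-1)}$, and with $\omega < 2.372$ this gives the stated threshold. A one-line check:

```python
# n * k**(omega - 1) <= n**2  iff  k <= n**(1 / (omega - 1))
omega = 2.372                    # current bound on the matrix multiplication exponent
print(f"{1 / (omega - 1):.3f}")  # -> 0.729
```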