
Low-Rank Approximation Overview

Updated 6 February 2026
  • Low-rank approximation is the process of representing high-dimensional matrices or tensors with simplified structures that preserve essential features.
  • Techniques such as SVD, pivoted QR, and randomized algorithms effectively control approximation error under various norms.
  • These methods enable practical applications including data compression, noise reduction, and accelerated computations in scientific and machine learning contexts.

Low-rank approximation is the process of representing a high-dimensional matrix or tensor by another matrix or tensor of lower rank such that the approximation error (with respect to a chosen norm or metric) is minimized or suitably controlled. This concept is central across numerical linear algebra, data science, signal processing, scientific computing, machine learning, and high-dimensional model reduction. It enables compression, acceleration of computations, noise reduction, and extraction of salient latent structures.

1. Mathematical Formulation and Theoretical Foundations

Given a matrix $A \in \mathbb{R}^{m \times n}$ and a target rank $k < \min(m, n)$, the classical low-rank approximation problem seeks $B$ with $\mathrm{rank}(B) \leq k$ minimizing the residual

$$\min_{\mathrm{rank}(B) \leq k} \|A - B\|$$

for a chosen norm, commonly the spectral norm ($\|\cdot\|_2$), the Frobenius norm ($\|\cdot\|_F$), or an entrywise $\ell_p$ norm. The Eckart–Young–Mirsky theorem states that truncating the singular value decomposition (SVD) of $A$ to its top $k$ singular values yields an optimal solution for all unitarily invariant norms:

$$A_k = U_k \Sigma_k V_k^\top, \qquad \|A - A_k\|_F = \sqrt{\sum_{i=k+1}^{r}\sigma_i^2},$$

where $A = U \Sigma V^\top$ (Lu, 2024, Kumar et al., 2016).
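
To make the truncation concrete, here is a minimal NumPy sketch of the Eckart–Young–Mirsky construction; the matrix dimensions and the rank $k$ are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative matrix and target rank (arbitrary choices for this sketch).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 120))
k = 10

# Full SVD: A = U @ diag(s) @ Vt with singular values s in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncation A_k = U_k Sigma_k V_k^T.
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

# The Frobenius error equals the root-sum-of-squares of the discarded singular values.
err = np.linalg.norm(A - A_k, "fro")
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))
```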

Extensions of this theorem apply to infinite-dimensional Hilbert–Schmidt operators, with optimality and explicit expressions for continuous cases, as in kernel and continuous Dynamic Mode Decomposition (DMD) (Heas et al., 2018). For higher-order tensors, low-rank approximation generalizes to minimization over multilinear ranks, e.g., the best $(r_1, \ldots, r_d)$-approximation in the Frobenius norm (Friedland et al., 2014, Nouy, 2015).

2. Core Algorithms: Deterministic, Randomized, and Structured

Deterministic methods exploit algebraic decompositions:

  • SVD: Optimal, cubic cost, dense storage—gold standard for full accuracy.
  • Pivoted QR / Rank-Revealing QR (RRQR): Deterministic, approximates the leading singular subspace, typically used with column/row selection (Kumar et al., 2016); see the sketch after this list.
  • LU-based algorithms: Such as Spectrum-Revealing LU (SRLU) (Anderson et al., 2016) or Randomly Pivoted LU (RPLU), balancing accuracy, memory, and speed (Gilles et al., 29 Jan 2026).
  • Tensor methods: Higher-order SVD (HOSVD), CP, Tucker, Tensor-Train (TT), and hierarchical Tucker (HT) formats for tensors, with greedy, alternating-optimization, or Newton-type solvers (Friedland et al., 2014, Nouy, 2015).
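
As referenced in the RRQR item above, the following is a small sketch of a rank-$k$ approximation built from column-pivoted QR, assuming SciPy is available and using an illustrative synthetic matrix; the resulting error is typically close to, but in general weaker than, the SVD optimum.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 120))
k = 15

# Column-pivoted QR: A[:, piv] = Q @ R with the diagonal of R roughly decreasing.
Q, R, piv = qr(A, mode="economic", pivoting=True)

# Keep the leading k columns of Q and rows of R, then undo the column permutation.
B = np.empty_like(A)
B[:, piv] = Q[:, :k] @ R[:k, :]

print("pivoted-QR rank-%d error: %.3e" % (k, np.linalg.norm(A - B, "fro")))
```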

Randomized algorithms enable near-optimal approximations at substantially lower computational cost, employing sketching, random projections (SRHT/SRFT), and leverage-score sampling to quickly generate compressed orthogonal bases which approximate the range of AA (Kumar et al., 2016, Pan et al., 2016, Pan et al., 2019).
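
A minimal sketch of this randomized approach, in the spirit of the Gaussian-sketch range finder; the oversampling parameter `p` and the test problem are illustrative assumptions rather than choices taken from the cited papers.

```python
import numpy as np

def randomized_svd(A, k, p=10, rng=None):
    """Approximate top-k SVD of A via a Gaussian sketch with oversampling p."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    Omega = rng.standard_normal((n, k + p))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)            # orthonormal basis for an approximate range of A
    B = Q.T @ A                               # small (k + p) x n projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub[:, :k], s[:k], Vt[:k, :]

# Usage on a synthetic, numerically low-rank matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))
U, s, Vt = randomized_svd(A, k=20, rng=rng)
print(np.linalg.norm(A - (U * s) @ Vt, "fro"))
```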

Structured low-rank approximation incorporates additional constraints:

  • Structured matrices and weights: Optimize over linear or affine subspaces (e.g., Hankel, Toeplitz, Sylvester structures) and entrywise weightings, requiring algebraic-geometric or symbolic-numeric algorithms for exactness (Ottaviani et al., 2013, Rey, 2013).
  • $\ell_p$ and $\ell_0$ norms: NP-hard in general ($p \neq 2$); recent work gives bicriteria and sublinear algorithms for $\ell_0$ and constant-factor approximations for $\ell_p$ (Chierichetti et al., 2017, Bringmann et al., 2017).
  • Clustering-based methods: Combine clustering with low-rank projections to exploit heterogeneity, reducing reconstruction error over "vanilla" generalized low-rank models (Zhu et al., 20 Feb 2025).
| Method | Time Complexity | Error Guarantee |
|---|---|---|
| SVD | $O(mn^2)$ or $O(n^3)$ | $\sigma_{k+1}$ (optimal) |
| Randomized SVD | $O(mn(k+p))$ | $(1+\epsilon)\sigma_{k+1}$ |
| CUR/Cross/ACA | $O(k^2(m+n))$ | $(k+1)^2 \min_{B} \|A-B\|$ |
| LU-based (SRLU) | $O(pmn + (m+n)k^2)$ | $\gamma \sigma_{k+1}$ |
| RPLU | $O(k(n+m) + kM(A) + k^3)$ | $4^k \|A - A_k\|_F^2$ (in expectation) |

3. Advanced Algorithms: Sublinear, Structured, and Large-Scale Regimes

For extreme-scale scenarios and data with structure:

  • Sublinear/Recursive Cross-Approximation: Recursive sampling of rows/columns, with a few cross-approximation loops sufficing for high accuracy except on specially constructed "hard" inputs. Leverage-score integration further improves error bounds (Pan et al., 2019, Pan et al., 2016). A basic cross-approximation sketch follows this list.
  • RPLU and Low-Memory Algorithms: RPLU achieves geometric convergence in expectation when singular values decay rapidly. It excels in memory-constrained regimes (e.g., GPUs, massive matrices), reducing storage to $O(k^2+n+m)$ and using only matrix-vector products with $A$ and $A^\top$ (Gilles et al., 29 Jan 2026).
  • Structured Low-Rank Problems: Algebraic-geometric frameworks compute all critical points for structured and weighted low-rank constraints, enabling certification of global minimizers for, e.g., Hankel-structured approximations (Ottaviani et al., 2013). Weighted versions may admit finitely many global minima, with sensitivity to the weight distribution (Rey, 2013).
  • Analytic Kernel Regimes: For matrices sampled from kernels analytic in one variable, exponential singular value decay is proved via rational interpolation with Zolotarev functions, leading to efficient explicit algorithms matching the best SVD decay rate (Webb, 17 Sep 2025).
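
As a concrete reference point for the cross-approximation items above, here is a deliberately simple sketch that pivots on the largest entry of the full residual (so it is not sublinear); the practical variants cited above pivot on sampled rows and columns instead, but the rank-one cross update is the same.

```python
import numpy as np

def cross_approximation(A, max_rank, tol=1e-12):
    """Return C (m x r) and R (r x n) with A ~= C @ R, built from rank-1 cross updates."""
    E = np.array(A, dtype=float)              # residual, updated in place
    cols, rows = [], []
    for _ in range(max_rank):
        i, j = np.unravel_index(np.argmax(np.abs(E)), E.shape)  # pivot: largest residual entry
        if abs(E[i, j]) < tol:
            break
        cols.append(E[:, j] / E[i, j])        # scaled pivot column
        rows.append(E[i, :].copy())           # pivot row
        E -= np.outer(cols[-1], rows[-1])     # peel off the rank-1 cross
    return np.column_stack(cols), np.vstack(rows)

rng = np.random.default_rng(3)
A = rng.standard_normal((150, 30)) @ rng.standard_normal((30, 100))  # exact rank 30
C, R = cross_approximation(A, max_rank=30)
print(np.linalg.norm(A - C @ R, "fro"))
```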

4. Low-Rank Approximation for High-Dimensional Tensors

For high-order tensors, methods address the curse of dimensionality via:

  • Hierarchical Tensor Formats: CP, Tucker, HT, and TT formats exploit multilinear low-rank structure, reducing storage and computation from exponential to linear or polynomial in the order and rank (Nouy, 2015); a truncated HOSVD sketch follows this list.
  • Alternating Maximization and Newton Schemes: Nearly all practical solvers use block coordinate ascent (AMM/ALS); recently, Newton-type methods for the optimal multilinear projection subspaces offer quadratic local convergence in small or well-conditioned settings (Friedland et al., 2014).
  • Fixed-Point and Error Decay Theory: For operator equations on tensor product Hilbert spaces, constructive contraction/iteration theory proves algebraic decay of best rank-$r$ errors uniformly in dimension, guaranteeing that decay rates depend only on problem conditioning and low operator ranks, not on dimensionality (Kressner et al., 2014).
  • Greedy and Subspace-Based Algorithms: Proper generalized decompositions (PGD), greedy basis construction, and adaptive cross approximation yield scalable solvers for high-dimensional PDEs, parameter-dependent problems, and tensor completion (Nouy, 2015).
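
A minimal truncated-HOSVD sketch in NumPy, illustrating the Tucker-type truncation discussed above; the multilinear ranks and the test tensor are illustrative assumptions.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the remaining axes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated HOSVD: T ~= core x_1 U1 x_2 U2 ... with multilinear ranks `ranks`."""
    factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
               for m, r in enumerate(ranks)]
    core = T
    for mode, U in enumerate(factors):
        # Contract mode with U^T and move the new (smaller) axis back into place.
        core = np.moveaxis(np.tensordot(core, U, axes=(mode, 0)), -1, mode)
    return core, factors

def reconstruct(core, factors):
    T = core
    for mode, U in enumerate(factors):
        # Contract mode with U to expand the core back to the original dimensions.
        T = np.moveaxis(np.tensordot(T, U, axes=(mode, 1)), -1, mode)
    return T

rng = np.random.default_rng(4)
T = rng.standard_normal((20, 25, 30))
core, factors = hosvd(T, ranks=(5, 5, 5))
print(np.linalg.norm(T - reconstruct(core, factors)))
```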

5. Specialized Applications and Model Adaptation

Low-rank approximation methodology underpins a range of advanced applications:

  • Dynamical Systems and Operator Approximation: Closed-form optimal low-rank formulas for Hilbert–Schmidt operators provide a theoretical and algorithmic foundation for kernel and continuous DMD (Heas et al., 2018).
  • Convolutional Neural Network Compression: Low-rank decomposition of convolutional layers (via matrix or tensor methods such as SVD, Tucker, CP), with additional regularization strategies (e.g., DeepTwist) to find compression-friendly flat minima, can achieve high compression with negligible or improved accuracy when applied properly (Lee et al., 2019); see the sketch after this list.
  • Robust/Nonlinear Signal/Matrix Recovery: Algorithms for $\ell_p$ and $\ell_0$ low-rank approximation with provable error bounds accommodate applications sensitive to outliers and discrete/binary data, using bicriteria, randomized, and covering-based algorithms (Chierichetti et al., 2017, Bringmann et al., 2017).
  • Clustering and Data Structure Discovery: Joint clustering and low-rank codes (e.g., CGLRAM) merge GLRAM with clustering for matrix-structured data, often reducing reconstruction error relative to conventional techniques (Zhu et al., 20 Feb 2025).
  • Time-Dependent Simulation: Dynamical low-rank projection with operator splitting allows the time evolution of PDEs and kinetic equations within a low-rank manifold, yielding large memory and computational savings while preserving accuracy (Peng et al., 2019).
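
As referenced in the network-compression item above, the basic mechanism is to replace a weight matrix by a product of two thin factors; this sketch (shapes and rank chosen arbitrarily) illustrates the idea for a dense layer, with convolutional kernels handled analogously after reshaping to a matrix.

```python
import numpy as np

rng = np.random.default_rng(5)
d_out, d_in, r = 256, 512, 32                 # illustrative layer sizes and rank
W = rng.standard_normal((d_out, d_in))        # stand-in for a trained weight matrix

# SVD-based factorization W ~= W1 @ W2 with W1: (d_out, r), W2: (r, d_in).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W1 = U[:, :r] * s[:r]
W2 = Vt[:r, :]

# The layer y = W @ x becomes two thinner layers y = W1 @ (W2 @ x),
# cutting parameters from d_out*d_in to r*(d_out + d_in).
x = rng.standard_normal(d_in)
print("output error:", np.linalg.norm(W @ x - W1 @ (W2 @ x)))
print("params:", d_out * d_in, "->", r * (d_out + d_in))
```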

6. Limitations, Open Problems, and Future Directions

  • NP-hardness and Non-convexity: General low-rank approximation in $\ell_p$ ($p \neq 2$), weighted, or tensor settings is NP-hard; only partial guarantees exist for approximate algorithms (Chierichetti et al., 2017, Ottaviani et al., 2013, Rey, 2013).
  • Global vs. Local Minima: For many structured and weighted problems, multiple local (and even global) minima exist (Rey, 2013). No general algorithm finds all global solutions, but algebraic geometry yields certificates in some cases (Ottaviani et al., 2013).
  • Expressiveness in Model Adaptation: In neural adaptation (e.g., LoRA/LoKr/LoHA), the trade-off between parameter efficiency and expressiveness depends on the chosen structured low-rank scheme, rank allocation, and factorization type (Lu, 2024).
  • Error Bounds and Stopping Criteria in Approximate/Randomized Algorithms: The design of novel sparse or structured sampling matrices, fine-grained error control across classes of inputs, and robust stopping procedures remain active topics (Pan et al., 2016, Pan et al., 2019, Webb, 17 Sep 2025).
  • High-Order and High-Dimensional Scalability: For tensors, efficient contraction algorithms, format selection, and automated rank adaptation for dynamic scenarios are points of ongoing research (Nouy, 2015, Kressner et al., 2014).

7. Representative Applications and Benchmark Results

  • Dimensionality Reduction: PCA, latent semantic analysis, collaborative filtering, and compressed sensing.
  • Scientific Computing: Fast solvers for PDEs, kernel and integral equations, dynamical systems (DMD), parametric/stochastic model reduction.
  • ML/NLP Model Adaptation: Efficient fine-tuning of large transformer models with negligible loss in accuracy using low-rank adapters (LoRA/LoKr/LoHA) (Lu, 2024).
  • Compression and Denoising: Image, signal, and video compression (rank constraints filter noise; CUR and SVD underpin core compressors).
  • Big Data, Streaming, and Distributed Systems: Sublinear and streaming algorithms, enabled by sketching and randomized techniques, compress and approximate massive-scale matrices on distributed architectures (Kumar et al., 2016, Pan et al., 2016, Pan et al., 2019).

These methods remain at the core of algorithmic and applied linear algebra, enabling tractable solutions for otherwise intractable high-dimensional problems. State-of-the-art development continues in nearly all aforementioned axes, with special attention to structured, scalable, and application-tailored low-rank approximations.
