
Generalized CUR Matrix Approximation

Updated 10 December 2025
  • Generalized CUR matrix approximation is a framework that selects actual matrix rows and columns to construct interpretable low-rank approximations for single or multiple matrices.
  • It employs techniques such as DEIM, convex optimization, and randomized sampling to achieve accurate approximations with rigorous error bounds in both Frobenius and spectral norms.
  • The method extends to joint, adaptive, robust, and tensorial settings, supporting applications in bioinformatics, imaging, multi-view learning, and reduced-order modeling.

A generalized CUR matrix approximation is a broad class of algorithms and theoretical frameworks for approximating matrices (or collections of matrices) using actual subsets of columns and rows, thereby providing interpretable, low-rank surrogates. Unlike the classical CUR decomposition—which targets a single matrix—generalized CUR (GCUR) schemes encompass joint approximations of matrix pairs, triplets, or more general multiview settings, extensions involving additional convex or spectral constraints, as well as tensor analogues and robust, adaptive, or randomized methods. This synthesis reviews the foundational principles, error guarantees, algorithmic developments, and practical relevance of generalized CUR matrix approximation, with emphasis on the most influential developments of the last decade.

1. Generalized CUR: Conceptual Scope and Principal Models

Generalized CUR matrix approximation refers to constructions of the form

$$A \approx CUR,$$

where $A \in \mathbb{R}^{m \times n}$ is the data matrix, $C \in \mathbb{R}^{m \times c}$ collects selected columns, $R \in \mathbb{R}^{r \times n}$ collects selected rows, and $U \in \mathbb{R}^{c \times r}$ is a "core" linking block. In the classical setting, $U$ is taken as the Moore–Penrose pseudoinverse of the intersection $W = A_{I,J}$:

$$U = W^+ = (A_{I,J})^+.$$
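For concreteness, here is a minimal NumPy sketch of the classical construction, assuming the index sets `I` and `J` have already been produced by some selection rule (names are illustrative):

```python
import numpy as np

def classical_cur(A, I, J):
    """Classical CUR: core is the pseudoinverse of the cross block A[I, J]."""
    C = A[:, J]                           # selected columns
    R = A[I, :]                           # selected rows
    U = np.linalg.pinv(A[np.ix_(I, J)])   # U = W^+ = (A_{I,J})^+
    return C, U, R
```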

Generalizations arise when:

  • several matrices (pairs, triplets, or multiple data views) are approximated jointly over shared index sets;
  • the column/row selection is driven by convex penalties rather than combinatorial search;
  • columns and rows are chosen by randomized sampling or sketching;
  • spectral-norm rather than Frobenius-norm error control is required;
  • the cross block is adaptively oversampled to improve conditioning;
  • the data object is a higher-order tensor rather than a matrix.

The table below summarizes the main types:

| Generalization | Target(s) | Selection Mechanism | Error Objective |
|---|---|---|---|
| Matrix pair GCUR | $(A, B)$ | DEIM / GSVD | Joint Frobenius/$\ell_2$ |
| CUR via convex optimization | $A$ | Convex row/col penalties | Frobenius; feature selection |
| Randomized GCUR | $(A, B, G)$ | Random sampling, DEIM | Frobenius |
| Interlacing spectral CUR | $A$ (or $(A, B, C)$) | Interlacing polynomials | Spectral norm |
| CUR with adaptive oversampling | $A$ | Adaptive QR, cross blocks | Frobenius/$\ell_2$ |
| Tensor CUR | $\mathcal{A}$ | Mode-wise/fiberwise | Multiway Frobenius |

2. Core Algorithms and Selection Principles

Generalized CUR approximations are constructed via several algorithmic paradigms:

a) Discrete Empirical Interpolation (DEIM) and Generalized SVD (GSVD):

  • The DEIM method greedily selects the column and row indices maximizing the interpolation residual for the (generalized) left and right singular vectors, ensuring stable recovery and near-optimal interpolation (Sorensen et al., 2014, Cao et al., 2023, Gidisu et al., 2021); a minimal sketch is given after this list.
  • For GCUR, DEIM is applied to the generalized singular vectors obtained from GSVD or restricted SVD of matrix pairs or triplets, aligning the selected indices for coordinated approximation (Gidisu et al., 2021, Gidisu et al., 2022).
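As an illustration of the greedy rule above, here is a minimal NumPy sketch of DEIM index selection applied to the leading singular vectors; for GCUR one would instead pass the generalized singular vectors from a GSVD. This is a schematic, not a reference implementation from the cited papers:

```python
import numpy as np

def deim_indices(V):
    """Greedy DEIM: for each basis vector, pick the entry with the largest
    absolute interpolation residual given the indices chosen so far."""
    n, k = V.shape
    idx = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        # Coefficients interpolating column j on the current index set
        c = np.linalg.solve(V[idx][:, :j], V[idx, j])
        r = V[:, j] - V[:, :j] @ c            # interpolation residual
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# CUR via DEIM on the leading singular vectors (classical single-matrix case)
A = np.random.default_rng(0).standard_normal((100, 80))
k = 10
U_, _, Vt = np.linalg.svd(A, full_matrices=False)
I, J = deim_indices(U_[:, :k]), deim_indices(Vt.T[:, :k])
C, R = A[:, J], A[I, :]
core = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)  # or pinv(A[np.ix_(I, J)])
```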

b) Convex Optimization Approaches:

  • Convex penalties (e.g., $\ell_\infty$ row/column norms) on regularized least-squares problems select a prescribed number of important columns/rows via bisection over the regularization strength, yielding deterministic, task-adaptive CUR decompositions particularly suited to feature-selection contexts (Linehan et al., 21 May 2025); the bisection loop is sketched below.
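The bisection component can be sketched independently of the particular convex solver. Below, `selected_count(A, lam)` is a hypothetical oracle returning how many columns survive the penalized least-squares problem at strength `lam`, assumed nonincreasing in `lam`:

```python
def tune_penalty(A, target, selected_count, lam_hi=1.0, iters=50):
    """Bisect the penalty strength until the (hypothetical) selection
    oracle reports `target` surviving columns."""
    lam_lo = 0.0
    for _ in range(60):                  # grow lam_hi until strong enough
        if selected_count(A, lam_hi) <= target:
            break
        lam_hi *= 2.0
    for _ in range(iters):               # standard bisection
        lam = 0.5 * (lam_lo + lam_hi)
        if selected_count(A, lam) > target:
            lam_lo = lam                 # too weak: too many columns survive
        else:
            lam_hi = lam                 # strong enough: tighten from above
    return lam_hi
```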

c) Randomized and Sketching-based Methods:

  • Fast CUR constructions employ randomized range-finding, dual-set sparsification, and adaptive importance sampling to select columns/rows efficiently, with provable $(1+\epsilon)$ relative-error Frobenius-norm guarantees (Wang et al., 2012, Ye et al., 2016, Ye et al., 2019).
  • Sketching enables memory- and computation-efficient GCUR via subspace embeddings and approximate regression (Ye et al., 2019, Ye et al., 2016).
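A simplified sketch of leverage-score column/row sampling follows; here the scores come from an exact SVD, whereas the cited works replace this step with sketching or randomized range-finding for speed:

```python
import numpy as np

def leverage_scores(V):
    """Row leverage scores of an orthonormal basis V (n x k); they sum to k."""
    return np.sum(V ** 2, axis=1)

def randomized_cur(A, k, c, r, rng=np.random.default_rng(0)):
    """Sample c columns and r rows with probabilities proportional to
    rank-k leverage scores, then form the core U = C^+ A R^+."""
    U_, _, Vt = np.linalg.svd(A, full_matrices=False)
    p_col = leverage_scores(Vt[:k, :].T); p_col /= p_col.sum()
    p_row = leverage_scores(U_[:, :k]);   p_row /= p_row.sum()
    J = rng.choice(A.shape[1], size=c, replace=False, p=p_col)
    I = rng.choice(A.shape[0], size=r, replace=False, p=p_row)
    C, R = A[:, J], A[I, :]
    return C, np.linalg.pinv(C) @ A @ np.linalg.pinv(R), R
```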

d) Interlacing Polynomials and Spectral Norm Guarantees:

  • Generalized column/row subset selection can be controlled in the spectral norm by constructing a hierarchy of real-rooted polynomials with interlacing properties, yielding deterministic algorithms and the first tight spectral norm error bounds for GCUR and related problems (Cai et al., 2023, Cai et al., 7 Dec 2025).

e) Adaptive Cross/Oversampling:

  • Oversampling at the intersection of selected columns/rows adapts to local conditioning, stabilizing CUR decompositions in settings where access is costly or the canonical cross matrix is singular or ill-conditioned (Palkar et al., 25 Sep 2025).
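A minimal sketch of the oversampled core, assuming the augmented row index set `I_over` is given (the cited work chooses the extra indices adaptively):

```python
import numpy as np

def oversampled_cur(A, I_over, J):
    """With more rows than columns in the cross block (|I_over| > |J|),
    the pseudoinverse acts as a least-squares core, which is more stable
    when the square intersection A[I, J] would be ill-conditioned."""
    C, R = A[:, J], A[I_over, :]
    W = A[np.ix_(I_over, J)]           # tall cross block, shape (r', c)
    return C, np.linalg.pinv(W), R     # A ~ C @ pinv(W) @ R
```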

f) Tensorial Extensions:

  • Chidori and fiber CUR decompositions generalize CUR to higher-order tensors, selecting fibers and mode-wise subarrays with efficient sampling, QR, and pseudoinversion steps (Cai et al., 2021).

3. Theoretical Error Bounds and Guarantees

Generalized CUR schemes achieve the following key theoretical guarantees:

  • Frobenius norm: For properly selected $C, R$, and $U = C^+ A R^+$, with $c, r$ moderately larger than the target rank $k$,

$$\|A - CUR\|_F \leq (1+\epsilon)\, \|A - A_k\|_F$$

with high probability under leverage-score/subspace sampling and adaptive randomized schemes (0708.3696, Wang et al., 2012, Ye et al., 2019, Ye et al., 2016).
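A quick empirical sanity check of this behavior, reusing the `randomized_cur` sketch from Section 2 (a synthetic test, not an experiment from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((300, 40)) @ rng.standard_normal((40, 200))
A += 1e-3 * rng.standard_normal(A.shape)     # approximately rank-40 data
k = 40
C, U, R = randomized_cur(A, k, c=2 * k, r=2 * k)
s = np.linalg.svd(A, compute_uv=False)
best_k = np.sqrt(np.sum(s[k:] ** 2))          # ||A - A_k||_F
err = np.linalg.norm(A - C @ U @ R, "fro")
print(f"CUR error {err:.4f} vs optimal rank-k error {best_k:.4f}")
```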

  • Spectral norm: Recent advances via interlacing polynomials and generalized interlacing families furnish the first deterministic polynomial-time constructions with

$$\|A - CUR\|_2 \leq C(\boldsymbol{\sigma})\, \sigma_{k+1},$$

where $C(\boldsymbol{\sigma})$ depends on the singular spectrum and may improve upon prior $O(k^2(t-k)(d-k))$-type bounds (Cai et al., 2023, Cai et al., 7 Dec 2025).

  • Joint approximations: In GCUR for pairs/triplets, the error for each matrix is bounded proportionally to the decay of its generalized singular values; when $B$ is invertible, the GCUR error for $(A, B)$ is closely related to that of applying standard CUR to $AB^{-1}$ (Gidisu et al., 2021, Gidisu et al., 2022).
  • Robustness: Adaptive oversampling and convex selection improve robustness against ill-conditioning and noise, with negligible increase in sample complexity or computational cost (Palkar et al., 25 Sep 2025, Linehan et al., 21 May 2025).

The following table summarizes the analytical guarantees:

| CUR Variant | Error Bound | Norm | Key Assumptions |
|---|---|---|---|
| Subspace sampling | $(1+\epsilon)\,\|A-A_k\|_F$ | Frobenius | Leverage-score sampling, $c, r = O(k \log k / \epsilon^2)$ |
| Fast CUR / sketching | $(1+\epsilon)\,\|A-A_k\|_F$ | Frobenius | Randomized sketching, $s = O(k/\epsilon)$ |
| Interlacing polynomial | $C(\boldsymbol{\sigma})\,\|A-A_k\|_2$ | Spectral | Deterministic, interlacing roots |
| Convex optimization | No closed form; empirically near-optimal | Frobenius | Bisection on penalty / critical $\lambda$ |

4. Applications and Empirical Impact

Generalized CUR models have shaped multiple applied and theoretical domains:

  • High-dimensional data analysis: CUR and GCUR enable interpretable feature selection in bioinformatics (gene/protein selection), unsupervised document analysis, and subgroup discovery, outperforming SVD when interpretability is essential (Linehan et al., 21 May 2025, Cao et al., 2023, Gidisu et al., 2021).
  • Robust recovery: Joint GCUR decompositions offer superior robustness for recovering signal in data perturbed by correlated or structured noise, with applications in single-cell RNA-seq, image analysis, and sensor fusion (Cao et al., 2023, Gidisu et al., 2022).
  • Reduced-order modeling: Adaptive, cross-oversampled CUR decompositions permit efficient low-rank truncation in time-dependent nonlinear stochastic PDEs, ensuring stability despite nonlinearities and governing physical constraints (Palkar et al., 25 Sep 2025).
  • Multi-view and contrastive learning: Generalized CUR is applicable to extracting discriminative features across multiple data modalities, e.g., selecting subsets in multiple views that maximize cross-correlation or discriminativity (Gidisu et al., 2022).
  • Tensor data summarization: Mode-wise tensor CUR decompositions afford fast, interpretable summarization of multidimensional data cubes (hyperspectral imaging, psychometrics) (Cai et al., 2021).
  • Core numerical linear algebra: Generalized CUR forms the algorithmic basis for fast low-rank preconditioners in Krylov subspace methods, data-efficient kernel approximations (Nyström method), and memory-constrained matrix factorizations.

In large-scale datasets (e.g., tens of thousands of genes/samples), recent algorithms achieve comparable or superior feature separation and interpretability to PCA/SVD, while permitting deterministic or user-driven cardinality control (Linehan et al., 21 May 2025).

5. Related Interpretable Low-Rank Factorizations

Generalized CUR is part of a broader class of interpretable low-rank approximations, including:

  • Nyström method: CUR applied to SPSD matrices with $R = C^T$ (Kędzierski, 6 Jun 2024); a minimal sketch follows this list.
  • Generalized Wedderburn reductions: Projection-based strategies for reducing matrix rank, underlying CUR and meta-factorizations (Kędzierski, 6 Jun 2024).
  • Generalized matrix regression: CUR as a special case of matrix regression with sketched subspace embeddings (Ye et al., 2019, Ye et al., 2016).
  • Randomized and adaptive cross-approximation (ACA): Superfast sequential or iterative refinement CUR schemes for streaming and memory-constrained environments (Luan et al., 2019).
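As referenced in the Nyström item above, a minimal sketch of that special case for an SPSD matrix $K$, with the column indices `J` assumed already chosen:

```python
import numpy as np

def nystrom(K, J):
    """Nystroem approximation of an SPSD matrix: K ~ C W^+ C^T,
    i.e., CUR with R = C^T and core W = K[J, J]."""
    C = K[:, J]
    W = K[np.ix_(J, J)]
    return C @ np.linalg.pinv(W) @ C.T
```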

This unified perspective has catalyzed new algorithmic reductions, e.g., CUR-inspired block LU, fast multipole, and kernel methods (Pan, 2016).

6. Open Directions and Limitations

While the unified GCUR framework has established deep connections between matrix approximation, polynomial theory, and statistical learning, open problems remain:

  • Spectral norm optimality: While advances in interlacing-polynomial methods have yielded the first spectral norm CUR guarantees, tight constants remain an area of active inquiry, especially in the generalized setting (Cai et al., 7 Dec 2025).
  • Deterministic versus randomization tradeoffs: Deterministic algorithms now match or surpass randomized methods in some regimes, but practical choices may depend on data properties such as coherence or noise structure (Cai et al., 2023, Cai et al., 7 Dec 2025).
  • Robustness under adversarial noise: Convex and interlacing-based GCUR variants demonstrate differing degrees of resilience to outliers and colored noise (Linehan et al., 21 May 2025, Cai et al., 2023).
  • Scaling to multimodal/tensorial data: While fast tensor CUR algorithms have been proposed, theoretical analyses of mode-wise versus coupled sampling and robustness are ongoing (Cai et al., 2021).
  • Augmentation for streaming or distributed settings: Superfast iterative, sketching, and blockwise CUR remain key for resource-constrained and ever-growing datasets (Luan et al., 2019).

A plausible implication is that further development of deterministic polynomial-time algorithms with spectral norm optimality under minimal singular minor assumptions will be central to the next generation of interpretable low-rank matrix approximations.

7. References to Key Primary Literature

The works cited throughout this article collectively provide an authoritative theoretical and computational foundation for generalized CUR matrix approximation.
