
Generalized CUR Matrix Approximation

Updated 10 December 2025
  • Generalized CUR matrix approximation is a framework that selects actual matrix rows and columns to construct interpretable low-rank approximations for single or multiple matrices.
  • It employs techniques such as DEIM, convex optimization, and randomized sampling to achieve accurate approximations with rigorous error bounds in both Frobenius and spectral norms.
  • The method extends to joint, adaptive, robust, and tensorial settings, supporting applications in bioinformatics, imaging, multi-view learning, and reduced-order modeling.

A generalized CUR matrix approximation is a broad class of algorithms and theoretical frameworks for approximating matrices (or collections of matrices) using actual subsets of columns and rows, thereby providing interpretable, low-rank surrogates. Unlike the classical CUR decomposition—which targets a single matrix—generalized CUR (GCUR) schemes encompass joint approximations of matrix pairs, triplets, or more general multiview settings, extensions involving additional convex or spectral constraints, as well as tensor analogues and robust, adaptive, or randomized methods. This synthesis reviews the foundational principles, error guarantees, algorithmic developments, and practical relevance of generalized CUR matrix approximation, with emphasis on the most influential developments of the last decade.

1. Generalized CUR: Conceptual Scope and Principal Models

Generalized CUR matrix approximation refers to constructions of the form

$$A \approx CUR,$$

where $A \in \mathbb{R}^{m \times n}$ is the data matrix, $C \in \mathbb{R}^{m \times c}$ collects selected columns, $R \in \mathbb{R}^{r \times n}$ collects selected rows, and $U \in \mathbb{R}^{c \times r}$ is a "core" linking block. In the classical setting, $U$ is taken as the Moore–Penrose pseudoinverse of the intersection $W = A_{I,J}$:

$$U = W^+ = (A_{I,J})^+.$$
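For concreteness, here is a minimal NumPy sketch of the classical construction, assuming the index sets `I` and `J` have already been produced by some selection rule (names are illustrative):

```python
import numpy as np

def classical_cur(A, I, J):
    """Classical CUR: core is the pseudoinverse of the cross block A[I, J]."""
    C = A[:, J]                           # selected columns
    R = A[I, :]                           # selected rows
    U = np.linalg.pinv(A[np.ix_(I, J)])   # U = W^+ = (A_{I,J})^+
    return C, U, R
```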

Generalizations arise when:

  • several matrices (pairs, triplets, or multiple data views) are approximated jointly over shared index sets;
  • the column/row selection is driven by convex penalties rather than combinatorial search;
  • columns and rows are chosen by randomized sampling or sketching;
  • spectral-norm rather than Frobenius-norm error control is required;
  • the cross block is adaptively oversampled to improve conditioning;
  • the data object is a higher-order tensor rather than a matrix.

The table below summarizes the main types:

| Generalization | Target(s) | Selection Mechanism | Error Objective |
|---|---|---|---|
| Matrix pair GCUR | $(A, B)$ | DEIM / GSVD | Joint Frobenius/$\ell_2$ |
| CUR via convex optimization | $A$ | Convex row/col penalties | Frobenius; feature selection |
| Randomized GCUR | $(A, B, G)$ | Random sampling, DEIM | Frobenius |
| Interlacing spectral CUR | $A$ (or $(A, B, C)$) | Interlacing polynomials | Spectral norm |
| CUR with adaptive oversampling | $A$ | Adaptive QR, cross blocks | Frobenius/$\ell_2$ |
| Tensor CUR | $\mathcal{A}$ | Mode-wise/fiberwise | Multiway Frobenius |

2. Core Algorithms and Selection Principles

Generalized CUR approximations are constructed via several algorithmic paradigms:

a) Discrete Empirical Interpolation (DEIM) and Generalized SVD (GSVD):

  • The DEIM method greedily selects the column and row indices maximizing the interpolation residual for the (generalized) left and right singular vectors, ensuring stable recovery and near-optimal interpolation (Sorensen et al., 2014, Cao et al., 2023, Gidisu et al., 2021); a minimal sketch is given after this list.
  • For GCUR, DEIM is applied to the generalized singular vectors obtained from GSVD or restricted SVD of matrix pairs or triplets, aligning the selected indices for coordinated approximation (Gidisu et al., 2021, Gidisu et al., 2022).
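As an illustration of the greedy rule above, here is a minimal NumPy sketch of DEIM index selection applied to the leading singular vectors; for GCUR one would instead pass the generalized singular vectors from a GSVD. This is a schematic, not a reference implementation from the cited papers:

```python
import numpy as np

def deim_indices(V):
    """Greedy DEIM: for each basis vector, pick the entry with the largest
    absolute interpolation residual given the indices chosen so far."""
    n, k = V.shape
    idx = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        # Coefficients interpolating column j on the current index set
        c = np.linalg.solve(V[idx][:, :j], V[idx, j])
        r = V[:, j] - V[:, :j] @ c            # interpolation residual
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# CUR via DEIM on the leading singular vectors (classical single-matrix case)
A = np.random.default_rng(0).standard_normal((100, 80))
k = 10
U_, _, Vt = np.linalg.svd(A, full_matrices=False)
I, J = deim_indices(U_[:, :k]), deim_indices(Vt.T[:, :k])
C, R = A[:, J], A[I, :]
core = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)  # or pinv(A[np.ix_(I, J)])
```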

b) Convex Optimization Approaches:

  • Convex penalties (e.g., $\ell_\infty$ row/column norms) on regularized least-squares problems select a prescribed number of important columns/rows via bisection over the regularization strength, yielding deterministic, task-adaptive CUR decompositions particularly suited to feature-selection contexts (Linehan et al., 21 May 2025); the bisection loop is sketched below.
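The bisection component can be sketched independently of the particular convex solver. Below, `selected_count(A, lam)` is a hypothetical oracle returning how many columns survive the penalized least-squares problem at strength `lam`, assumed nonincreasing in `lam`:

```python
def tune_penalty(A, target, selected_count, lam_hi=1.0, iters=50):
    """Bisect the penalty strength until the (hypothetical) selection
    oracle reports `target` surviving columns."""
    lam_lo = 0.0
    for _ in range(60):                  # grow lam_hi until strong enough
        if selected_count(A, lam_hi) <= target:
            break
        lam_hi *= 2.0
    for _ in range(iters):               # standard bisection
        lam = 0.5 * (lam_lo + lam_hi)
        if selected_count(A, lam) > target:
            lam_lo = lam                 # too weak: too many columns survive
        else:
            lam_hi = lam                 # strong enough: tighten from above
    return lam_hi
```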

c) Randomized and Sketching-based Methods:

  • Fast CUR constructions employ randomized range-finding, dual-set sparsification, and adaptive importance sampling to select columns/rows efficiently, with provable $(1+\epsilon)$ relative-error Frobenius-norm guarantees (Wang et al., 2012, Ye et al., 2016, Ye et al., 2019).
  • Sketching enables memory- and computation-efficient GCUR via subspace embeddings and approximate regression (Ye et al., 2019, Ye et al., 2016).
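A simplified sketch of leverage-score column/row sampling follows; here the scores come from an exact SVD, whereas the cited works replace this step with sketching or randomized range-finding for speed:

```python
import numpy as np

def leverage_scores(V):
    """Row leverage scores of an orthonormal basis V (n x k); they sum to k."""
    return np.sum(V ** 2, axis=1)

def randomized_cur(A, k, c, r, rng=np.random.default_rng(0)):
    """Sample c columns and r rows with probabilities proportional to
    rank-k leverage scores, then form the core U = C^+ A R^+."""
    U_, _, Vt = np.linalg.svd(A, full_matrices=False)
    p_col = leverage_scores(Vt[:k, :].T); p_col /= p_col.sum()
    p_row = leverage_scores(U_[:, :k]);   p_row /= p_row.sum()
    J = rng.choice(A.shape[1], size=c, replace=False, p=p_col)
    I = rng.choice(A.shape[0], size=r, replace=False, p=p_row)
    C, R = A[:, J], A[I, :]
    return C, np.linalg.pinv(C) @ A @ np.linalg.pinv(R), R
```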

d) Interlacing Polynomials and Spectral Norm Guarantees:

  • Generalized column/row subset selection can be controlled in the spectral norm by constructing a hierarchy of real-rooted polynomials with interlacing properties, yielding deterministic algorithms and the first tight spectral norm error bounds for GCUR and related problems (Cai et al., 2023, Cai et al., 7 Dec 2025).

e) Adaptive Cross/Oversampling:

  • Oversampling at the intersection of selected columns/rows adapts to local conditioning, stabilizing CUR decompositions in settings where access is costly or the canonical cross matrix is singular or ill-conditioned (Palkar et al., 25 Sep 2025).
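A minimal sketch of the oversampled core, assuming the augmented row index set `I_over` is given (the cited work chooses the extra indices adaptively):

```python
import numpy as np

def oversampled_cur(A, I_over, J):
    """With more rows than columns in the cross block (|I_over| > |J|),
    the pseudoinverse acts as a least-squares core, which is more stable
    when the square intersection A[I, J] would be ill-conditioned."""
    C, R = A[:, J], A[I_over, :]
    W = A[np.ix_(I_over, J)]           # tall cross block, shape (r', c)
    return C, np.linalg.pinv(W), R     # A ~ C @ pinv(W) @ R
```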

f) Tensorial Extensions:

  • Chidori and fiber CUR decompositions generalize CUR to higher-order tensors, selecting fibers and mode-wise subarrays with efficient sampling, QR, and pseudoinversion steps (Cai et al., 2021).

3. Theoretical Error Bounds and Guarantees

Generalized CUR schemes achieve the following key theoretical guarantees:

  • Frobenius norm: For properly selected $C, R$, and $U = C^+ A R^+$, with $c, r$ moderately larger than the target rank $k$,

$$\|A - CUR\|_F \leq (1+\epsilon)\, \|A - A_k\|_F$$

with high probability under leverage-score/subspace sampling and adaptive randomized schemes (0708.3696, Wang et al., 2012, Ye et al., 2019, Ye et al., 2016).
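A quick empirical sanity check of this behavior, reusing the `randomized_cur` sketch from Section 2 (a synthetic test, not an experiment from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((300, 40)) @ rng.standard_normal((40, 200))
A += 1e-3 * rng.standard_normal(A.shape)     # approximately rank-40 data
k = 40
C, U, R = randomized_cur(A, k, c=2 * k, r=2 * k)
s = np.linalg.svd(A, compute_uv=False)
best_k = np.sqrt(np.sum(s[k:] ** 2))          # ||A - A_k||_F
err = np.linalg.norm(A - C @ U @ R, "fro")
print(f"CUR error {err:.4f} vs optimal rank-k error {best_k:.4f}")
```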

  • Spectral norm: Recent advances via interlacing polynomials and generalized interlacing families furnish the first deterministic polynomial-time constructions with

$$\|A - CUR\|_2 \leq C(\boldsymbol{\sigma})\, \sigma_{k+1},$$

where $C(\boldsymbol{\sigma})$ depends on the singular spectrum and may improve upon prior $O(k^2(t-k)(d-k))$-type bounds (Cai et al., 2023, Cai et al., 7 Dec 2025).

  • Joint approximations: In GCUR for pairs/triplets, the error for each matrix is bounded proportionally to the decay of its generalized singular values; when $B$ is invertible, the GCUR error for $(A, B)$ is closely related to that of applying standard CUR to $AB^{-1}$ (Gidisu et al., 2021, Gidisu et al., 2022).
  • Robustness: Adaptive oversampling and convex selection improve robustness against ill-conditioning and noise, with negligible increase in sample complexity or computational cost (Palkar et al., 25 Sep 2025, Linehan et al., 21 May 2025).

The following table summarizes the analytical guarantees:

| CUR Variant | Error Bound | Norm | Key Assumptions |
|---|---|---|---|
| Subspace sampling | $(1+\epsilon)\,\|A-A_k\|_F$ | Frobenius | Leverage-score sampling, $c, r = O(k \log k / \epsilon^2)$ |
| Fast CUR / sketching | $(1+\epsilon)\,\|A-A_k\|_F$ | Frobenius | Randomized sketching, $s = O(k/\epsilon)$ |
| Interlacing polynomial | $C(\boldsymbol{\sigma})\,\|A-A_k\|_2$ | Spectral | Deterministic, interlacing roots |
| Convex optimization | No closed form; empirically near-optimal | Frobenius | Bisection on penalty / critical $\lambda$ |

4. Applications and Empirical Impact

Generalized CUR models have shaped multiple applied and theoretical domains:

  • High-dimensional data analysis: CUR and GCUR enable interpretable feature selection in bioinformatics (gene/protein selection), unsupervised document analysis, and subgroup discovery, outperforming SVD when interpretability is essential (Linehan et al., 21 May 2025, Cao et al., 2023, Gidisu et al., 2021).
  • Robust recovery: Joint GCUR decompositions offer superior robustness for recovering signal in data perturbed by correlated or structured noise, with applications in single-cell RNA-seq, image analysis, and sensor fusion (Cao et al., 2023, Gidisu et al., 2022).
  • Reduced-order modeling: Adaptive, cross-oversampled CUR decompositions permit efficient low-rank truncation in time-dependent nonlinear stochastic PDEs, ensuring stability despite nonlinearities and governing physical constraints (Palkar et al., 25 Sep 2025).
  • Multi-view and contrastive learning: Generalized CUR is applicable to extracting discriminative features across multiple data modalities, e.g., selecting subsets in multiple views that maximize cross-correlation or discriminativity (Gidisu et al., 2022).
  • Tensor data summarization: Mode-wise tensor CUR decompositions afford fast, interpretable summarization of multidimensional data cubes (hyperspectral imaging, psychometrics) (Cai et al., 2021).
  • Core numerical linear algebra: Generalized CUR forms the algorithmic basis for fast low-rank preconditioners in Krylov subspace methods, data-efficient kernel approximations (Nyström method), and memory-constrained matrix factorizations.

In large-scale datasets (e.g., tens of thousands of genes/samples), recent algorithms achieve comparable or superior feature separation and interpretability to PCA/SVD, while permitting deterministic or user-driven cardinality control (Linehan et al., 21 May 2025).

5. Related Interpretable Low-Rank Factorizations

Generalized CUR is part of a broader class of interpretable low-rank approximations, including:

  • Nyström method: CUR applied to SPSD matrices with $R = C^T$ (Kędzierski, 6 Jun 2024); a minimal sketch follows this list.
  • Generalized Wedderburn reductions: Projection-based strategies for reducing matrix rank, underlying CUR and meta-factorizations (Kędzierski, 6 Jun 2024).
  • Generalized matrix regression: CUR as a special case of matrix regression with sketched subspace embeddings (Ye et al., 2019, Ye et al., 2016).
  • Randomized and adaptive cross-approximation (ACA): Superfast sequential or iterative refinement CUR schemes for streaming and memory-constrained environments (Luan et al., 2019).
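As referenced in the Nyström item above, a minimal sketch of that special case for an SPSD matrix $K$, with the column indices `J` assumed already chosen:

```python
import numpy as np

def nystrom(K, J):
    """Nystroem approximation of an SPSD matrix: K ~ C W^+ C^T,
    i.e., CUR with R = C^T and core W = K[J, J]."""
    C = K[:, J]
    W = K[np.ix_(J, J)]
    return C @ np.linalg.pinv(W) @ C.T
```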

This unified perspective has catalyzed new algorithmic reductions, e.g., CUR-inspired block LU, fast multipole, and kernel methods (Pan, 2016).

6. Open Directions and Limitations

While the unified GCUR framework has established deep connections between matrix approximation, polynomial theory, and statistical learning, open problems remain:

  • Spectral norm optimality: While advances in interlacing-polynomial methods have yielded the first spectral norm CUR guarantees, tight constants remain an area of active inquiry, especially in the generalized setting (Cai et al., 7 Dec 2025).
  • Deterministic versus randomization tradeoffs: Deterministic algorithms now match or surpass randomized methods in some regimes, but practical choices may depend on data properties such as coherence or noise structure (Cai et al., 2023, Cai et al., 7 Dec 2025).
  • Robustness under adversarial noise: Convex and interlacing-based GCUR variants demonstrate differing degrees of resilience to outliers and colored noise (Linehan et al., 21 May 2025, Cai et al., 2023).
  • Scaling to multimodal/tensorial data: While fast tensor CUR algorithms have been proposed, theoretical analyses of mode-wise versus coupled sampling and robustness are ongoing (Cai et al., 2021).
  • Augmentation for streaming or distributed settings: Superfast iterative, sketching, and blockwise CUR remain key for resource-constrained and ever-growing datasets (Luan et al., 2019).

A plausible implication is that further development of deterministic polynomial-time algorithms with spectral norm optimality under minimal singular minor assumptions will be central to the next generation of interpretable low-rank matrix approximations.

7. References to Key Primary Literature

The works cited throughout this article collectively provide an authoritative theoretical and computational foundation for generalized CUR matrix approximation.
