Sparse-TDA via QR Pivoting
- The paper presents a sparse TDA approach that leverages pivoted QR factorization to select informative pixels and reduce feature dimensions efficiently.
- It combines persistent homology, persistence images, and truncated SVD to convert complex topological data into actionable features for machine learning.
- Empirical results show that Sparse-TDA significantly lowers computation time while maintaining competitive accuracy compared to kernel-based TDA methods.
Sparse-TDA via QR Pivoting is a method for sparse realization of topological data analysis (TDA) in supervised multi-way classification settings. It combines persistent topological feature extraction with principled column selection via pivoted QR factorization, enabling efficient, information-preserving dimension reduction for subsequent machine learning tasks. Key components include the vectorization of persistence diagrams (PDs) into persistence images (PIs), low-rank approximation via truncated SVD, and sparse pixel selection using QR pivoting over the PI domain. Benchmark evaluations indicate that Sparse-TDA maintains competitive accuracy with significant computational savings compared to kernel-based TDA approaches (Guo et al., 2017).
1. Persistent Topological Feature Vectorization
Persistent homology summarizes topological features of data through PDs, representing birth–death pairs corresponding to the lifetimes of topological structures. To enable statistical learning, these diagrams are transformed into vector-valued features:
- Linear Coordinate Transformation: Each PD undergoes the map $T(b, d) = (b, d - b)$, so the coordinates represent birth time and persistence.
- Persistence Surfaces: Define
$$\rho(z) = \sum_{u \in T(D)} f(u)\, \phi_u(z),$$
where $f$ is a weighting (e.g., linear $f(u) = \operatorname{pers}(u)$ or nonlinear), and $\phi_u$ is a Gaussian with variance $\sigma^2$ centered at $u$.
- Persistence Images (PI): $\rho$ is discretized on a $p \times p$ grid; each cell's integral forms a component of the vector $x \in \mathbb{R}^m$ with $m = p^2$.
- Data Matrix Assembly: The PI representations for $n$ samples form the columns of $X \in \mathbb{R}^{m \times n}$.
This process yields a standardized, high-dimensional encoding of persistent topological information suitable for downstream processing.
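A minimal NumPy sketch of this vectorization follows. The grid bounds, resolution, variance $\sigma$, and linear persistence weighting are illustrative defaults (the paper tunes them per dataset), and the Gaussians are evaluated at pixel centers rather than integrated exactly over cells:

```python
import numpy as np

def persistence_image(diagram, grid=20, sigma=0.1, rng=((0.0, 1.0), (0.0, 1.0))):
    """Rasterize one persistence diagram into a persistence-image vector.

    diagram: iterable of (birth, death) pairs.
    Returns a flat vector of length grid*grid (m = p^2).
    """
    # Linear coordinate change T(b, d) = (b, d - b): (birth, persistence).
    pts = np.asarray(diagram, dtype=float)
    pts = np.column_stack([pts[:, 0], pts[:, 1] - pts[:, 0]])

    # Pixel centers of the p x p grid over the (birth, persistence) domain.
    xs = np.linspace(*rng[0], grid)
    ys = np.linspace(*rng[1], grid)
    gx, gy = np.meshgrid(xs, ys)

    img = np.zeros((grid, grid))
    for b, p in pts:
        w = p  # linear persistence weighting f(u) = pers(u)
        # Isotropic Gaussian bump centered at (b, p) with variance sigma^2.
        img += w * np.exp(-((gx - b) ** 2 + (gy - p) ** 2) / (2 * sigma ** 2))
    return img.ravel()  # component vector x in R^m
```

Stacking these vectors column-wise over all samples yields the data matrix $X$.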
2. Low-Rank Approximation via SVD
Due to the invariance and redundancy in the PI representations, the data matrix admits a low-rank structure:
- Economy-Size SVD: Compute $X \approx U_r \Sigma_r V_r^{\top}$, where $U_r \in \mathbb{R}^{m \times r}$, $V_r \in \mathbb{R}^{n \times r}$, and $\Sigma_r \in \mathbb{R}^{r \times r}$ holds the top-$r$ singular values.
- Rank Selection: The optimal rank $r$ is determined by the Gavish–Donoho rule: retain all singular values $\sigma_i$ exceeding $\omega(\beta)$ times the median singular value, where $\beta = \min(m, n)/\max(m, n)$ is the matrix aspect ratio, providing a near-minimax approximation in the presence of Gaussian noise.
This step isolates the dominant variance directions, yielding a subspace with maximal topological signal representation.
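The truncation rule can be sketched as follows; the cubic polynomial is the standard Gavish–Donoho approximation to $\omega(\beta)$ for an unknown noise level:

```python
import numpy as np

def truncated_svd_gavish_donoho(X):
    """Economy SVD of X with rank chosen by the Gavish-Donoho median rule."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    m, n = X.shape
    beta = min(m, n) / max(m, n)
    # Approximate optimal coefficient omega(beta) for unknown noise level
    # (Gavish & Donoho, 2014); threshold = omega(beta) * median singular value.
    omega = 0.56 * beta**3 - 0.95 * beta**2 + 1.82 * beta + 1.43
    tau = omega * np.median(s)
    r = max(1, int(np.sum(s > tau)))
    return U[:, :r], s[:r], Vt[:r], r
```

On a signal-plus-noise matrix, singular values below the threshold are discarded as noise, leaving only the dominant topological modes in $U_r$.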
3. QR Pivoting for Sparse Pixel Selection
Sparse-TDA achieves drastic reduction in feature dimensionality via column subset selection:
- Pivoted QR Algorithm:
  - Input $U_r$ and target sparsity $q$ (number of pixels to retain, $q \ge r$).
  - Compute the pivoted QR of $U_r^{\top}$: $U_r^{\top} P = QR$, with the permutation matrix $P$ determined by Businger–Golub norm-maximal pivoting.
  - The first $q$ pivots identify the columns (pixels) most informative for the dominant subspace.
- For each sample, PI vectors are restricted to these elements: $\tilde{x}_i = [x_i]_{\mathcal{I}} \in \mathbb{R}^q$, where $\mathcal{I}$ indexes the first $q$ pivots.
This mechanism provides a principled, data-driven way to select a subset of PI grid locations most critical for information retention.
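Using SciPy, whose `qr` routine implements Businger–Golub column pivoting, the selection step can be sketched as:

```python
import numpy as np
from scipy.linalg import qr

def select_pixels(Ur, q):
    """Pick q informative pixel indices via pivoted QR of Ur^T.

    Ur: m x r matrix of dominant left singular vectors (pixels x modes).
    Returns the indices of the first q pivot columns.
    """
    # scipy's qr with pivoting=True uses Businger-Golub column-norm pivoting;
    # the permutation orders the columns of Ur.T (i.e., the pixels) by how
    # much new information each adds to the dominant subspace.
    _, _, piv = qr(Ur.T, mode='economic', pivoting=True)
    return piv[:q]

# Restrict each sample's PI vector (columns of X) to the selected pixels:
# X_sparse = X[select_pixels(Ur, q), :]
```

Note that only the pivot indices are needed downstream; the $Q$ and $R$ factors themselves are discarded.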
4. Computational Complexity
The overall efficiency of Sparse-TDA is determined by:
| Stage | Dominant Complexity | Remarks |
|---|---|---|
| Feature Extraction (PD→PI) | $O(N^3)$ | $N$: simplices per data item |
| Truncated SVD | $O(mnr)$ | Efficient via randomized methods |
| Pivoted QR | $O(mr^2)$ | Businger–Golub algorithm |
| SVM Training | $O(n^2 q)$ | Kernel method, feature dim. $q$ |
The SVD and QR stages dominate the pipeline. Sparse-TDA delivers computational savings primarily by reducing the SVM input dimensionality, typically setting $q \ll m$.
5. Multi-Way Classification Procedure
Once the sparse features are extracted, classification proceeds as follows:
- Sparse Feature Assembly: All training PIs are projected to $\mathbb{R}^q$ via the QR-selected indices.
- RBF-kernel SVM Training: Use $\{(\tilde{x}_i, y_i)\}$ to train a one-versus-rest SVM; the RBF kernel is $k(x, x') = \exp(-\gamma \|x - x'\|^2)$.
- Parameter Selection: Grid search and cross-validation are performed for the SVM cost $C$ and kernel width $\gamma$.
- Decision Function:
$$f(\tilde{x}) = \operatorname{sign}\Big(\sum_i \alpha_i y_i\, k(\tilde{x}_i, \tilde{x}) + b\Big),$$
where $\alpha_i$ and $b$ are the learned dual and bias parameters.
This pipeline integrates topological feature selection and statistical classification within a unified, sparse framework.
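The classification stage can be sketched with scikit-learn (an assumption; the original work does not prescribe a specific SVM library), using an illustrative hyperparameter grid:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

def train_sparse_tda_classifier(X_sparse, y):
    """Grid-search an RBF-kernel SVM on QR-selected sparse PI features.

    X_sparse: n_samples x q matrix of pixel-restricted PI vectors.
    """
    # One-vs-rest multiclass RBF SVM; C and gamma chosen by cross-validated
    # grid search, mirroring the parameter-selection step (grid is illustrative).
    grid = {'C': [0.1, 1, 10, 100], 'gamma': [1e-3, 1e-2, 1e-1, 1]}
    search = GridSearchCV(SVC(kernel='rbf', decision_function_shape='ovr'),
                          grid, cv=3)
    search.fit(X_sparse, y)
    return search.best_estimator_
```

At test time, unseen PIs are restricted to the same $q$ pivot indices before calling `predict`, so train and test features live in the same sparse coordinate system.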
6. Empirical Evaluation and Benchmarking
Sparse-TDA was validated on human posture and texture image classification problems:
- SHREC'14 Synthetic: 15 classes, 300 meshes; Sparse-TDA (NW) accuracy: 94.0%, SVM runtime: 25.6s.
- SHREC'14 Real: 400 meshes; Sparse-TDA (LW) accuracy: 68.8%, SVM runtime: 82.2s.
- Outex Texture: 24 classes, 480 images; Sparse-TDA (NW) accuracy: 66.0%, SVM runtime: 120s.
Key findings:
- Sparse-TDA achieves runtime reductions of one to two orders of magnitude compared to kernel-based TDA, with a modest loss in accuracy.
- Sparse-TDA consistently outperforms L1-SVM in accuracy, often with equal or better training times.
| Dataset | L1-SVM Accuracy (%) | Sparse-TDA (LW) | Sparse-TDA (NW) | Kernel-TDA |
|---|---|---|---|---|
| SHREC Synthetic | 89.6 | 91.5 | 94.0 | 97.8 |
| SHREC Real | 63.9 | 68.8 | 67.8 | 65.3 |
| Outex Texture | 55.1 | 62.6 | 66.0 | 69.2 |
| Dataset | L1-SVM Time (s) | Sparse-TDA | Kernel-TDA |
|---|---|---|---|
| SHREC Synthetic | 35.4 | 25.6 | 1182 |
| SHREC Real | 305 | 82.2 | 92.3 |
| Outex Texture | 106 | 120 | 5457 |
Performance reflects mean ± std over multiple data splits; the PI grid resolution, SVD rank $r$, and QR pixel count $q$ were dataset-specific (Guo et al., 2017).
7. Theoretical Properties and Stability
- Persistence Image Stability: The PD-to-PI transform is 1-Wasserstein stable: small input perturbations induce bounded changes in the resulting PI (Adams et al., 2017).
- Optimal SVD Truncation: The Gavish–Donoho median threshold guarantees near-optimal rank selection for signal-plus-Gaussian-noise matrices.
- QR Pivot Conditioning: The Businger–Golub strategy ensures the selected columns form a numerically well-conditioned basis; the associated interpolation error is bounded in terms of the growth factor of the first $q$ pivots (Drmač & Gugercin, 2015).
These theoretical properties guarantee robust, stable, and near-minimax realization of topological feature sampling, ensuring generalization and numerical reliability across diverse datasets (Guo et al., 2017).
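The conditioning statement can be made concrete with the standard oblique-projection bound from the DEIM literature (here stated for $q = r$, with $C$ the hypothetical row-selection operator picking the pivot pixels):

```latex
\| x - U_r (C U_r)^{-1} C x \|_2
  \;\le\; \| (C U_r)^{-1} \|_2 \; \| (I - U_r U_r^{\top}) x \|_2 .
```

The second factor is the unavoidable projection error onto the dominant subspace; pivoted QR keeps the amplification factor $\| (C U_r)^{-1} \|_2$ small by selecting well-conditioned pixel rows.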