Product of Projections in Multi-View Subspace Clustering
- Product of projections in multi-view subspace clustering is a technique that replaces full affinity matrices with low-rank, landmark-based approximations to capture intrinsic subspace structures.
- SDSNet leverages encoder-decoder architectures and scalable self-expressiveness modules to efficiently process high-dimensional data while preserving subspace integrity.
- Empirical results show that these projection methods significantly reduce computational complexity and memory requirements, enabling rapid clustering of tens of thousands of samples.
Deep Scalable Subspace Clustering (SDSNet) refers to a class of deep learning frameworks that address the scalability bottlenecks inherent in traditional and deep subspace clustering by replacing full affinity or self-expressive matrices with scalable, often low-rank, approximations within end-to-end architectures. Unlike classical approaches that require full-batch optimization of an affinity graph over all $N$ samples, SDSNet designs leverage mini-batch training, factorization schemes, and landmark-based approximations, enabling efficient clustering of tens of thousands of high-dimensional samples while retaining fidelity to subspace structure (Zhang et al., 2018, Mrabah et al., 24 Dec 2025).
1. Problem Formulation and Motivation
The objective of subspace clustering is to partition a dataset $X = [x_1, \dots, x_N] \in \mathbb{R}^{D \times N}$, whose columns are presumed to lie close to the union of $K$ (possibly non-linear) subspaces $\{S_k\}_{k=1}^{K}$, into groups corresponding to those subspaces. Classical self-expressiveness models, notably Sparse Subspace Clustering (SSC) and Low-Rank Representation (LRR), express each data point as a linear combination of the others:
$$X = XC + E,$$
with $C \in \mathbb{R}^{N \times N}$ as the coefficient (affinity) matrix and $E$ as the residual. Deep Subspace Clustering Networks (DSC-Net) (Ji et al., 2017) extend this model into a non-linear latent space via an auto-encoder, then employ a self-expressive layer $Z = ZC$ on the latent embedding $Z$, with $C$ as the learnable $N \times N$ matrix.
However, these approaches entail $O(N^2)$ memory (for $C$) and at least quadratic, typically $O(N^3)$, computational cost (for spectral clustering on $C$), making them impractical for large $N$ (Ji et al., 2017, Zhang et al., 2018, Mrabah et al., 24 Dec 2025). SDSNet was developed to eliminate this scaling bottleneck while preserving or improving clustering accuracy.
2. Architectural Innovations
SDSNet architectures share three critical design choices:
- Encoder-decoder backbone: A convolutional auto-encoder parametrizes an encoder that maps inputs to latent codes and a decoder that maps the codes back to input space. The encoder produces the latent embedding $Z$ on which self-expressiveness is imposed.
- Scalable self-expressiveness module:
  - In (Zhang et al., 2018), each latent point $z_i$ is assigned to one of $K$ learnable low-dimensional subspaces, avoiding the need for a global $N \times N$ affinity.
  - In (Mrabah et al., 24 Dec 2025), the coefficient matrix $C$ is replaced by a low-rank factorization into two thin factors. Instead of the full self-expression $Z = ZC$, SDSNet enforces self-expression with respect to a selected landmark matrix.
- Efficient clustering head: Spectral clustering is performed in the low-dimensional space spanned by the low-rank factor or the subspace assignment matrix, with all operations costing $O(N)$ when the number of subspaces, the number of anchors, and the latent dimensions are fixed.
A comparison of key self-expressive strategies appears below:
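| Model | Self-Expressive Representation | Parameter Count |
|---|---|---|
| DSC-Net | Full coefficient matrix, $Z = ZC$ | $N^2$ |
| SDSNet (2018) | Per-point assignment to one of $K$ learnable subspaces | $K$ subspace bases plus $N$ assignments |
| SDSNet (2025) | Low-rank, landmark-based factorization of $C$ | Two thin factors sized by the number of landmarks |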
3. Optimization Objectives and Training Algorithms
SDSNet jointly optimizes auto-encoder reconstruction and subspace-preserving self-expressiveness losses.
In (Zhang et al., 2018), alternating minimization updates the subspace assignments, the subspace bases (via SVD or Grassmannian gradient), and the network parameters by mini-batch stochastic gradient descent (Adam).
In (Mrabah et al., 24 Dec 2025), the two low-rank factors are optimized via Procrustes SVD or least squares, alternating with back-propagation for the encoder/decoder parameters. The affinity matrix is driven toward block-diagonal structure by optimizing the representations during training.
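To make the structure of such a joint objective concrete, a schematic form is sketched here. It is an illustration under stated assumptions, not the exact loss of either cited paper: squared Frobenius norms are assumed, and the symbols $f_\theta$ (encoder), $g_\phi$ (decoder), $A$ (landmark matrix), $Q$ (coefficient factor), and the weights $\lambda_1, \lambda_2$ are notation introduced for this sketch.

$$\min_{\theta,\,\phi,\,Q}\; \|X - g_\phi(f_\theta(X))\|_F^2 \;+\; \lambda_1 \|Z - AQ\|_F^2 \;+\; \lambda_2 \|Q\|_F^2, \qquad Z = f_\theta(X).$$

With $Z$ and $A$ held fixed, the $Q$-subproblem under these assumptions is a ridge least-squares problem with the closed-form solution

$$Q^\star = \left(A^\top A + \tfrac{\lambda_2}{\lambda_1} I\right)^{-1} A^\top Z,$$

which is one way a least-squares update can alternate with gradient steps on $\theta$ and $\phi$.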
4. Landmark-Based and Subspace-Assignment Approximations
A defining feature of SDSNet (Mrabah et al., 24 Dec 2025) is its replacement of the full self-expressiveness matrix $C$ with a landmark-based factorization. $M$ anchors are selected via $k$-means++ or random sampling in latent space. The resulting coefficient matrix encodes how each data point relates to the landmarks. All affinity computations and subsequent spectral clustering occur in the $M$-dimensional anchor space, yielding computation and memory that grow linearly in $N$ for a fixed $M$.
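The landmark idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a ridge least-squares fit of each latent point to the anchors, a hand-picked anchor set standing in for $k$-means++ or random sampling, and a spectral embedding taken from the singular vectors of the normalized coefficient matrix. The function names and the `ridge` parameter are hypothetical.

```python
import numpy as np

def landmark_coefficients(Z, anchors, ridge=1e-2):
    """Ridge least-squares codes Q (M x N) such that Z is approximately anchors @ Q.

    Z: (m, N) latent embeddings, one column per sample.
    anchors: (m, M) landmark matrix (assumed to be given).
    """
    M = anchors.shape[1]
    gram = anchors.T @ anchors + ridge * np.eye(M)   # (M, M), independent of N
    return np.linalg.solve(gram, anchors.T @ Z)      # (M, N)

def spectral_embedding_from_codes(Q, n_clusters):
    """Spectral embedding of the implicit affinity W = |Q|^T |Q| without forming it."""
    B = np.abs(Q).T                                  # (N, M)
    deg = B @ B.sum(axis=0)                          # row sums of W, computed in O(N M)
    B_norm = B / np.sqrt(np.maximum(deg, 1e-12))[:, None]
    # Left singular vectors of B_norm are eigenvectors of the normalized affinity.
    U, _, _ = np.linalg.svd(B_norm, full_matrices=False)
    return U[:, :n_clusters]

# Toy data: two orthogonal 2-D subspaces in a 6-D latent space, 50 points each.
rng = np.random.default_rng(0)
basis1, basis2 = np.eye(6)[:, 0:2], np.eye(6)[:, 2:4]
Z = np.hstack([basis1 @ rng.normal(size=(2, 50)),
               basis2 @ rng.normal(size=(2, 50))])

# Hand-picked anchors (five per subspace) as a stand-in for k-means++ selection.
anchors = Z[:, [0, 1, 2, 3, 4, 50, 51, 52, 53, 54]]

Q = landmark_coefficients(Z, anchors)
U = spectral_embedding_from_codes(Q, n_clusters=2)
```

In this sketch the largest matrix ever stored has $N \times M$ entries; running $k$-means on the rows of `U` would complete the clustering step.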
In (Zhang et al., 2018), explicit assignment variables reduce clustering to assigning each point to one of $K$ subspaces; the subspaces are updated via SVD of the assigned embeddings.
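The assign-then-update loop described above can be sketched as a plain $k$-subspaces iteration on fixed embeddings. This is a simplified illustration, not the method of (Zhang et al., 2018): the encoder is left out, outlier discarding and Grassmannian updates are omitted, and the initialization (true bases plus small noise) is a hypothetical choice that keeps the toy example deterministic.

```python
import numpy as np

def assign_to_subspaces(Z, bases):
    """Assign each column of Z (m, N) to the subspace with the smallest residual.

    bases: list of K orthonormal (m, d) matrices. Minimizing ||z - U U^T z||^2
    is equivalent to maximizing the projected energy ||U^T z||^2.
    """
    proj_energy = np.stack([np.sum((U.T @ Z) ** 2, axis=0) for U in bases])  # (K, N)
    return np.argmax(proj_energy, axis=0)

def update_bases(Z, labels, bases, d):
    """Refit each basis as the top-d left singular vectors of its assigned points."""
    new_bases = []
    for k, U in enumerate(bases):
        Zk = Z[:, labels == k]
        if Zk.shape[1] < d:          # too few points: keep the previous basis
            new_bases.append(U)
            continue
        Uk, _, _ = np.linalg.svd(Zk, full_matrices=False)
        new_bases.append(Uk[:, :d])
    return new_bases

# Toy embeddings: two orthogonal 2-D subspaces in 6-D, 50 points each.
rng = np.random.default_rng(0)
m, d, n_per = 6, 2, 50
true_bases = [np.eye(m)[:, 0:2], np.eye(m)[:, 2:4]]
Z = np.hstack([U @ rng.normal(size=(d, n_per)) for U in true_bases])

# Hypothetical initialization: true bases plus small noise, re-orthonormalized.
bases = [np.linalg.qr(U + 0.1 * rng.normal(size=U.shape))[0] for U in true_bases]

for _ in range(5):
    labels = assign_to_subspaces(Z, bases)
    bases = update_bases(Z, labels, bases, d)
```

Each pass touches every point once per subspace, so the cost per iteration grows linearly with the number of points when the number and dimension of the subspaces are fixed.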
5. Computational Complexity and Scalability
SDSNet substantially alleviates the cubic and quadratic scaling issues of prior methods:
- Classical affinity-based deep subspace clustering: $O(N^2)$ memory and $O(N^3)$ time (building, storing, and decomposing $C$) (Ji et al., 2017).
- SDSNet (Zhang et al., 2018): Each epoch comprises encoding $N$ samples, computing assignments, and SVD/Grassmann updates, all linear in $N$ when the number and dimension of the subspaces are constant.
- SDSNet (Mrabah et al., 24 Dec 2025): Each iteration comprises encoder/decoder updates and updates of the two low-rank factors, followed by final spectral clustering in the anchor space.
Empirically, SDSNet clusters 70,000 MNIST images in under 8 minutes using about 2 GB of GPU memory; equivalent affinity-based approaches require about 37 GB just for storing the affinity matrix (Zhang et al., 2018). On synthetic datasets with $N$ up to 10,000, computational time is observed to grow linearly (Mrabah et al., 24 Dec 2025).
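The storage cost of a dense affinity matrix can be checked with simple arithmetic. The sketch below assumes 8-byte (double-precision) entries, and the anchor count `M = 1000` is a hypothetical value chosen only for comparison.

```python
def dense_affinity_bytes(n_samples, bytes_per_entry=8):
    """Bytes needed to store a dense n x n affinity matrix."""
    return n_samples * n_samples * bytes_per_entry

def landmark_factor_bytes(n_samples, n_anchors, bytes_per_entry=8):
    """Bytes needed to store an n x M coefficient factor."""
    return n_samples * n_anchors * bytes_per_entry

N = 70_000   # MNIST sample count cited in the text
M = 1_000    # hypothetical number of anchors

dense_gib = dense_affinity_bytes(N) / 2**30
landmark_gib = landmark_factor_bytes(N, M) / 2**30

print(f"dense N x N matrix: {dense_gib:.1f} GiB")
print(f"N x M factor:       {landmark_gib:.2f} GiB")
```

Under these assumptions the dense matrix needs roughly 36.5 GiB, while the $N \times M$ factor needs about half a GiB.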
6. Empirical Performance and Benchmarking
Reported experiments indicate that SDSNet maintains or surpasses the clustering accuracy of both affinity-based and scalable state-of-the-art methods. On Fashion-MNIST (60,000 samples), SDSNet achieves an ACC of roughly 63.8% and an NMI of roughly 62.0% in under 10 minutes (Zhang et al., 2018). On YaleB, ORL, Coil100, UMIST, and Fashion-MNIST, (Mrabah et al., 24 Dec 2025) reports that SDSNet matches or exceeds alternatives (SSC-OMP, EnSC, SGL, S3C, LMVSC, SSSC, A-DSSC) by margins of 5–35% in ACC and NMI, with subspace-preserving error (SPE) comparable to dense affinity approaches.
Convergence is achieved rapidly, with the relative change in the low-rank factors dropping below the convergence tolerance in fewer than 10 outer iterations (Mrabah et al., 24 Dec 2025); affinity matrices display clear block-diagonal structure early in training.
7. Implementation Strategies and Hyperparameters
Robust performance requires careful tuning and implementation choices:
- Hyperparameters: The number of subspaces $K$ is set to the true number of clusters (e.g., $K = 10$ for MNIST); the embedding dimension is typically 20 and the subspace dimension lies in [7, 11]. Other settings include the anchor number, loss weightings, batch size, Adam learning rate, and the epoch counts for auto-encoder pre-training and joint training (Zhang et al., 2018).
- Subspace updates: Either a truncated SVD of the assigned embeddings, keeping as many leading singular vectors as the subspace dimension (after discarding 10% outliers), or Riemannian updates (Grassmann retraction).
- Landmark selection: $k$-means++ improves anchor robustness over random selection.
- Initialization and batch scheduling: Pre-training the auto-encoder facilitates stable convergence; initializing subspaces/anchors via $k$-means or shallow DSC improves accuracy and speed.
The SDSNet design circumvents the full affinity bottleneck of deep subspace clustering through architectural and algorithmic changes, yielding scalable and accurate clustering on large, high-dimensional datasets (Zhang et al., 2018, Mrabah et al., 24 Dec 2025).