Sparse Convex Biclustering (SpaCoBi)

Updated 12 January 2026
  • Sparse Convex Biclustering (SpaCoBi) is a convex optimization–based method that integrates row/column fusion with group-lasso sparsity to uncover biclusters in high-dimensional data.
  • It leverages the Sylvester equation and ADMM for efficient optimization, ensuring global optimality and robustness against noise.
  • Empirical results on simulated and transcriptomic datasets show high adjusted Rand Index scores and effective feature selection compared to traditional methods.

Sparse Convex Biclustering (SpaCoBi) is a convex optimization–based method for simultaneous clustering of the rows and columns of high-dimensional data matrices, with integrated feature selection via group-lasso sparsity. SpaCoBi addresses limitations in existing biclustering approaches by directly penalizing noise in features, maintaining global optimality, and employing a stability-based criterion for hyperparameter tuning. Its design yields accurate and robust bicluster recovery in high-dimensional and large-scale applications, as demonstrated on simulated and transcriptomic datasets (Jiang et al., 5 Jan 2026).

1. Mathematical Formulation

Let $X \in \mathbb{R}^{n \times p}$ be a data matrix, where $X_{i\cdot}$ denotes the $i$th row and $x_j$ the $j$th column. SpaCoBi fits a matrix $A \in \mathbb{R}^{n \times p}$, simultaneously biclustering rows and columns while enforcing column-wise sparsity. The method solves the convex program:

$$\min_{A\in\mathbb{R}^{n\times p}} \frac12\sum_{i=1}^n\|X_{i\cdot}-A_{i\cdot}\|_2^2 +\gamma_1\sum_{i<j}w_{ij}\|A_{i\cdot}-A_{j\cdot}\|_2 +\gamma_2\sum_{k<\ell}\tilde w_{k\ell}\|A_{\cdot k}-A_{\cdot\ell}\|_2 +\gamma_3\sum_{j=1}^p u_j\|A_{\cdot j}\|_2$$

where:

  • $\frac12\|X-A\|_F^2$ enforces data fidelity,
  • Row-fusion term: $\sum_{i<j}w_{ij}\|A_{i\cdot}-A_{j\cdot}\|_2$,
  • Column-fusion term: $\sum_{k<\ell}\tilde w_{k\ell}\|A_{\cdot k}-A_{\cdot\ell}\|_2$,
  • Group-lasso column sparsity term: $\sum_{j=1}^p u_j\|A_{\cdot j}\|_2$.

Introducing auxiliary variables $v_{ij}$ (row pairs), $z_{k\ell}$ (column pairs), and $g_j$ (column groups), SpaCoBi can be written in constrained form with convex objectives and linear constraints linking $A$ and the auxiliary variables.
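As a concrete illustration, the penalized objective above can be evaluated directly for a candidate $A$; a minimal NumPy sketch (function and argument names are illustrative, not from the paper's reference code):

```python
import numpy as np

def spacobi_objective(X, A, w, w_tilde, u, g1, g2, g3):
    """Evaluate the SpaCoBi objective at a candidate matrix A.

    w / w_tilde map row / column index pairs (i, j) with i < j to
    fusion weights; u holds the group-lasso weights. Illustrative only.
    """
    fit = 0.5 * np.sum((X - A) ** 2)                        # data fidelity
    row_fuse = sum(wij * np.linalg.norm(A[i] - A[j])        # row fusion
                   for (i, j), wij in w.items())
    col_fuse = sum(wkl * np.linalg.norm(A[:, k] - A[:, l])  # column fusion
                   for (k, l), wkl in w_tilde.items())
    sparsity = sum(uj * np.linalg.norm(A[:, j])             # group lasso
                   for j, uj in enumerate(u))
    return fit + g1 * row_fuse + g2 * col_fuse + g3 * sparsity
```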

2. Convexity and Optimization

Each objective term is convex: the quadratic loss is strictly convex in $A$; the fusion and group-lasso terms are convex norms. The global solution is unique.

Optimization proceeds via the Alternating Direction Method of Multipliers (ADMM) with the following major steps:

  1. $A$-update (Sylvester equation): The matrix $A$ is updated by solving the Sylvester equation $MA + AN = H$, with $M$ and $N$ constructed from row/column graph Laplacian structures and penalty parameters.
  2. Proximal updates for $v_{ij}$, $z_{k\ell}$, and $g_j$ enforce the respective fusions and sparsity via $\ell_2$-norm proximal operators.
  3. Dual variable updates for the Lagrange multipliers ensure convergence.
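The $\ell_2$-norm proximal operator in step 2 is block soft-thresholding; a small sketch (hedged: the threshold parameter and variable bookkeeping are simplified relative to the full ADMM):

```python
import numpy as np

def prox_l2(v, t):
    """Proximal operator of t * ||.||_2 (block soft-thresholding).

    Shrinks v toward zero by t in norm, zeroing it entirely when
    ||v|| <= t; applied in turn to each v_ij, z_kl, and g_j.
    """
    nrm = np.linalg.norm(v)
    if nrm <= t:
        return np.zeros_like(v)
    return (1.0 - t / nrm) * v
```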

Efficient solution of the Sylvester equation relies on Bartels–Stewart–type or modified Schur methods, exploiting the structure of $M$ and $N$ for computational gains. Multi-block ADMM convergence is guaranteed under mild conditions.
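The $A$-update can be carried out with a standard Bartels–Stewart solver; a hedged sketch in which $M$, $N$, and $H$ are stand-ins for the paper's Laplacian-based constructions (`scipy.linalg.solve_sylvester` solves $MA + AN = H$ directly):

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(0)
n, p = 6, 4
# Stand-ins for the ADMM A-update operands: in SpaCoBi, M and N would
# be built from identity plus scaled row/column graph Laplacians, and
# H from X plus the scaled auxiliary/dual terms (shapes assumed here).
M = np.eye(n) + np.diag(rng.uniform(0.5, 1.5, n))
N = np.diag(rng.uniform(0.0, 1.0, p))
H = rng.standard_normal((n, p))

A = solve_sylvester(M, N, H)  # Bartels-Stewart under the hood
residual = np.linalg.norm(M @ A + A @ N - H)
```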

3. Stability-Based Tuning

SpaCoBi hyperparameters $\gamma_1$ (row fusion), $\gamma_2$ (column fusion), and $\gamma_3$ (sparsity) control the tradeoff between fit, cluster granularity, and feature selection. To select these efficiently, SpaCoBi may collapse $(\gamma_1, \gamma_2)$ into a single $\gamma$ and perform a grid search over $(\gamma, \gamma_3)$.

Stability selection is employed: two bootstrap samples of rows yield biclustering solutions $(\psi_1, \psi_2)$ at a given $(\gamma, \gamma_3)$. The clustering distance $d_F$ is computed:

$$d_F(\psi_1, \psi_2) = \mathbb{E}_{x, y \sim F} \left| I\{\psi_1(x) = \psi_1(y)\} - I\{\psi_2(x) = \psi_2(y)\} \right|$$

Estimated from resampled pairs, the $(\gamma, \gamma_3)$ minimizing $d_F$ is selected for maximal stability.
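The expectation defining $d_F$ can be estimated by averaging co-clustering disagreements over point pairs; a plain-Python sketch, under the assumption that both solutions are given as label vectors over the same points:

```python
from itertools import combinations

def clustering_distance(labels1, labels2):
    """Empirical d_F: mean absolute difference between the
    co-cluster indicators of two clusterings over all point pairs."""
    n = len(labels1)
    pairs = list(combinations(range(n), 2))
    diff = sum(abs(int(labels1[i] == labels1[j]) -
                   int(labels2[i] == labels2[j]))
               for i, j in pairs)
    return diff / len(pairs)
```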

4. Computational Complexity and Scaling

The per-iteration cost is dominated by the Sylvester solve in the $A$-update. Naively, this requires $O(n^3 + p^3)$ time, but the cost is mitigated by:

  • The Laplacian structure of $M$ and $N$,
  • Fast generalized Schur/Bartels–Stewart algorithms.

Proximal updates for $v$, $z$, and $g$ scale with the sizes of the row and column edge sets. Practical implementations use $m$-nearest-neighbor graphs, yielding $|\mathcal{E}_1| = O(mn)$ and $|\mathcal{E}_2| = O(mp)$ for small $m$.

A warm-starting strategy, using solutions from nearby parameter grid points as initializations, accelerates tuning by $20\%$ to $100\%$ on large-scale problems.
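The warm-starting scheme can be organized as a sweep over the grid, threading each solution in as the next initializer; a sketch in which `fit_spacobi` is a hypothetical stand-in for the ADMM solver:

```python
def grid_search_warm_start(X, gammas, gamma3s, fit_spacobi):
    """Sweep (gamma, gamma3) pairs, warm-starting each solve with
    the solution from the previous grid point. `fit_spacobi` is a
    placeholder for the ADMM routine; it must accept A_init."""
    A_prev = None
    solutions = {}
    for g in gammas:
        for g3 in gamma3s:
            A_prev = fit_spacobi(X, g, g3, A_init=A_prev)
            solutions[(g, g3)] = A_prev
    return solutions
```

In practice, ordering the grid so that adjacent points have similar penalties keeps the warm starts close to the new optimum.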

5. Empirical Results and Benchmarking

Simulation studies using synthetic checkerboard biclusters and known informative columns ($p_\mathrm{true}$) demonstrate:

  • Adjusted Rand Index (ARI): SpaCoBi achieves mean ARI in $[0.75, 0.96]$ versus $[0.12, 0.80]$ for Bi-ADMM with the $L_2$ norm; COBRA approaches zero in high noise.
  • Feature selection: false negative rate $0$–$0.07$, false positive rate $0.04$–$0.27$, AUC $0.76$–$0.90$. Bi-ADMM (without sparsity) has FPR $= 1$.

In mouse olfactory bulb (MOB) single-cell RNA-seq data ($n=305$, $p=1250$) with known three-class structure:

  • SpaCoBi recovers the clusters perfectly (ARI $= 1.0$) and selects marker genes such as Pbxip1, Pdlim2, and Isg15.
  • Bi-ADMM with the $L_2$ norm yields ARI $= 0.12$ and cannot suppress noise features.

6. Weight Selection and Limitations

Selection of fusion weights and group-lasso factors is critical:

  • Row/column weights: $m$-nearest-neighbor Gaussian kernel,

$$w_{ij} = \mathbf{1}\{j \in \mathrm{NN}_m(i)\} \exp(-\phi \|X_{i\cdot} - X_{j\cdot}\|_2^2)$$

with $m=5$, $\phi=0.5$.

  • Group-lasso weights: adaptive $u_j = 1 / \|a_j^{(0)}\|_2$, where $a_j^{(0)}$ is the $j$th column of the solution with $\gamma_3=0$, to penalize uninformative features.
  • Rescaling: normalize $\{w\}$, $\{\tilde w\}$, and $\{u\}$ so that the sums of the parameters are $1/\sqrt{p}$, $1/\sqrt{n}$, and $1/\sqrt{n}$, respectively, ensuring comparable tuning parameter magnitudes.
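The row-weight construction above can be sketched as follows; the symmetrization convention (keeping an edge if either endpoint lists the other as a neighbor) is an assumption for illustration:

```python
import numpy as np

def knn_gaussian_weights(X, m=5, phi=0.5):
    """Fusion weights w_ij = 1{j in NN_m(i)} exp(-phi ||X_i - X_j||^2),
    returned as a dict over ordered pairs (i, j) with i < j."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    w = {}
    for i in range(n):
        for j in np.argsort(d2[i])[1:m + 1]:  # skip self at rank 0
            a, b = (i, int(j)) if i < j else (int(j), i)
            w[(a, b)] = np.exp(-phi * d2[a, b])
    return w
```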

Limitations arise when both $n$ and $p$ are very large, as the Sylvester solve becomes a bottleneck; approximate methods or block-ADMM provide possible remedies. Extensions to other sparsity norms (e.g., $\ell_1$) or overlapping-group penalties can be considered, provided convexity and ADMM compatibility are retained.

7. Context and Directions

SpaCoBi unifies row/column fusion and group-lasso feature selection within a convex optimization paradigm, ensuring global optimality and robust bicluster detection. Its empirical superiority over non-sparse convex biclustering is particularly pronounced in high-dimensional, noisy settings. Potential extensions include more general sparsity-inducing penalties and scalable iterative solvers, motivating ongoing research for truly massive omics applications (Jiang et al., 5 Jan 2026).
