Variable Basis Mapping (VBM)
- VBM is a penalized basis learning methodology that extends sparse multiclass LDA by incorporating per-variable ordinal weights to ensure order-concordant variable selection.
- It employs a two-step Kendall’s Tau procedure to construct ordinal weights, screening noise and enforcing monotonicity in group means for reliable variable selection.
- An efficient block-coordinate descent algorithm optimizes the VBM objective, yielding interpretable, sparse discriminative bases even when p greatly exceeds N.
Variable Basis Mapping (VBM) refers to a penalized basis learning methodology for high-dimensional ordinal classification problems. The VBM framework, as developed in Kim et al. (2022), extends sparse multiclass linear discriminant analysis (LDA) by introducing per-variable ordinal weights and a weighted group-lasso penalty, thereby enabling the selection of variables that exhibit both discriminative and order-concordant behavior with respect to the ordinal response. VBM is designed for regimes with high-dimensional feature spaces (p ≫ N), where interpretability and variable selection are critical.
1. Formulation of the Ordinal-Weighted Sparse Basis Learning Problem
Let x ∈ ℝ^p denote a feature vector and y ∈ {1, …, K} the ordinal class label. The VBM method assumes a common-covariance Gaussian model, x | y = k ~ N(μ_k, Σ). Standard multiclass LDA seeks a (K−1)-dimensional basis B ∈ ℝ^{p×(K−1)} that maximizes separation between groups. The unpenalized estimator is

B̂ = argmin_B { (1/2) tr(Bᵀ Σ_W B) − tr(Dᵀ B) },

where Σ_W is the (pooled) within-group covariance and D ∈ ℝ^{p×(K−1)} determines the between-group means.
To promote sparsity and preferentially select order-concordant variables, VBM introduces the following penalized objective with per-variable ordinal weights w_j:

B̂ = argmin_B { (1/2) tr(Bᵀ Σ_W B) − tr(Dᵀ B) + λ · Σ_{j=1}^p w_j ‖b_j‖₂ },

where b_j denotes the j-th row of B, λ > 0 sets overall sparsity, and w_j amplifies penalization for variables with weaker ordinal concordance (Kim et al., 2022).
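As a concrete sketch, the penalized objective can be evaluated directly, assuming a standard quadratic sparse-LDA loss; the names `Sw`, `D`, `lam`, and `w` are illustrative, not the paper's notation:

```python
import numpy as np

def vbm_objective(B, Sw, D, lam, w):
    """Evaluate a VBM-style penalized objective (a sketch).

    B   : (p, K-1) candidate basis
    Sw  : (p, p) pooled within-group covariance
    D   : (p, K-1) between-group mean matrix
    lam : overall sparsity level
    w   : (p,) per-variable ordinal weights
    """
    quad = 0.5 * np.trace(B.T @ Sw @ B) - np.trace(D.T @ B)
    penalty = lam * np.sum(w * np.linalg.norm(B, axis=1))  # weighted group lasso
    return quad + penalty

# toy check on random positive-definite data
rng = np.random.default_rng(0)
p, q = 5, 2
A = rng.standard_normal((p, p))
Sw = A @ A.T + np.eye(p)          # positive definite within-covariance
D = rng.standard_normal((p, q))
w = np.ones(p)
B0 = np.zeros((p, q))
print(vbm_objective(B0, Sw, D, lam=0.1, w=w))  # 0.0 at B = 0
```

At B = 0 both the quadratic term and the penalty vanish, so the objective is zero; any minimizer must achieve a value no larger than that.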
2. Construction of Ordinal Weights via Two-Step Kendall’s Tau Procedure
Variable selection is guided by ordinal weights w_j, constructed via a two-step process leveraging Kendall's tau statistics:
- Global Kendall's Tau (τ_j^g): Measures correlation between X_j and y across all samples.
- Group-Mean Kendall's Tau (τ_j^m): Assesses monotonicity of the K group means of X_j.
For thresholds t₁ and t₂, variable j is assigned a small weight w_j when both |τ_j^g| ≥ t₁ (Step 1) and |τ_j^m| ≥ t₂ (Step 2) hold, and a large weight otherwise.
Step 1 eliminates “noise” variables; Step 2 detects variables whose class means are strictly monotone. Theorems guarantee that this rule selects true order-concordant variables with high probability under mild assumptions (Kim et al., 2022).
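The two-step rule above can be sketched as follows; the threshold `t1` and the concrete weight values `w_small`/`w_big` are illustrative choices, not the paper's exact constants, and Step 2 here uses the strictest form (perfectly monotone group means):

```python
import numpy as np
from scipy.stats import kendalltau

def ordinal_weights(X, y, t1=0.2, w_small=1.0, w_big=10.0):
    """Two-step Kendall's-tau ordinal weights (illustrative constants)."""
    classes = np.unique(y)
    p = X.shape[1]
    w = np.full(p, w_big)                        # default: heavy penalty
    for j in range(p):
        tau_g, _ = kendalltau(X[:, j], y)        # Step 1: global tau vs. y
        if abs(tau_g) < t1:
            continue                             # noise variable: keep w_big
        means = np.array([X[y == k, j].mean() for k in classes])
        tau_m, _ = kendalltau(means, classes)    # Step 2: group-mean tau
        if np.isclose(abs(tau_m), 1.0):          # strictly monotone means
            w[j] = w_small
    return w

# toy data: feature 0 is monotone in y, feature 1 peaks at the middle class
rng = np.random.default_rng(1)
y = np.repeat([1, 2, 3], 30)
X = np.column_stack([y + 0.1 * rng.standard_normal(90),
                     (y == 2).astype(float) + 0.1 * rng.standard_normal(90)])
w_hat = ordinal_weights(X, y)
print(w_hat)  # [ 1. 10.]
```

The non-monotone feature is rejected either in Step 1 (near-zero global tau) or in Step 2 (its group means rise then fall, so |τ^m| < 1), matching the intended separation of order-concordant variables from the rest.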
3. Optimization via Block-Coordinate Descent
The minimization of the VBM objective is efficiently solved by block-coordinate descent, exploiting the group-lasso structure. For each row b_j of B, the algorithm updates as follows:
- Compute the partial-residual vector r_j = d_j − Σ_{l≠j} σ_{jl} b_l, where d_j is the j-th row of D and σ_{jl} denotes the entries of Σ_W.
- Set a_j = σ_{jj} and γ_j = λ w_j.
- Update row j by group soft-thresholding: b_j ← (1/a_j)(1 − γ_j/‖r_j‖₂)₊ r_j, so b_j = 0 whenever ‖r_j‖₂ ≤ γ_j.
The procedure converges to the global optimum due to the convexity of the objective and the block-separability of the penalty.
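A minimal sketch of block-coordinate descent with group soft-thresholding, assuming the quadratic loss (1/2)·tr(BᵀΣ_W B) − tr(DᵀB); all names are illustrative:

```python
import numpy as np

def vbm_bcd(Sw, D, lam, w, n_iter=200):
    """Block-coordinate descent for a weighted group-lasso LDA objective:
    0.5 tr(B' Sw B) - tr(D' B) + lam * sum_j w[j] * ||B[j]||_2  (a sketch)."""
    p, q = D.shape
    B = np.zeros((p, q))
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: remove variable j's own contribution
            r = D[j] - Sw[j] @ B + Sw[j, j] * B[j]
            norm_r = np.linalg.norm(r)
            thr = lam * w[j]
            if norm_r <= thr:
                B[j] = 0.0                                   # row zeroed out
            else:
                B[j] = (1.0 - thr / norm_r) * r / Sw[j, j]   # group soft-threshold
    return B

# toy problem: heavy ordinal weights push their rows toward zero
rng = np.random.default_rng(2)
p, q = 6, 2
A = rng.standard_normal((p, p))
Sw = A @ A.T + p * np.eye(p)                 # diagonally dominant, PD
D = rng.standard_normal((p, q))
w = np.array([1.0, 1.0, 1.0, 8.0, 8.0, 8.0])
B = vbm_bcd(Sw, D, lam=0.5, w=w)
print(np.linalg.norm(B, axis=1))             # heavily weighted rows shrink
```

With λ = 0 the sweep reduces to Gauss-Seidel on Σ_W B = D and recovers the unpenalized solution; with λ > 0, rows whose partial residual falls below the weighted threshold are set exactly to zero, which is the source of row-sparsity.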
4. Theoretical Guarantees in High-Dimensional Regimes
VBM exhibits non-asymptotic oracle properties under high-dimensional scaling. Key sets include:
- Discriminant variables: the set of indices j whose rows of the population basis are nonzero.
- Ordinal variables: the set of indices j whose group means μ_{1j}, …, μ_{Kj} are strictly monotone in the class index.
- Ordinal-discriminant variables: the intersection of the two sets above.
Selection consistency is achieved when tuning parameters are chosen appropriately:
- For moderate ordinal weighting, all discriminative variables are selected.
- For sufficiently large weights on non-concordant variables, only variables that are both discriminative and order-concordant are selected.
Estimation bounds (in the Frobenius norm) are provided: the error ‖B̂ − B*‖ is controlled in probability by a term scaling with the sparsity level and the tuning parameter λ, divided by a compatibility constant of Σ_W. High-dimensional consistency requires log p / N → 0, sparsity growing sufficiently slowly, and a compatibility (restricted-eigenvalue) condition on Σ_W (Kim et al., 2022).
5. Post-Screening and Data-Adaptive Refinement
In practical applications, data-adaptive thresholding is deployed. The two-step weights can be refined by:
- Initial screening using ANOVA F-tests to separate noise variables from those with nontrivial mean differences.
- Adaptive selection of the two thresholds based on the empirical distributions of the Kendall's tau statistics.
- The resulting adaptive procedure maintains strict separation between order-concordant and non-monotone variables.
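One simple data-adaptive choice of the global-tau screening threshold is a high quantile of a permutation-null distribution; this is an illustrative heuristic in the spirit of the refinement above, not the paper's exact rule:

```python
import numpy as np
from scipy.stats import kendalltau

def adaptive_threshold(X, y, n_perm=50, q=0.95, seed=0):
    """Pick the screening threshold as a high quantile of the
    permutation-null distribution of |tau| (illustrative heuristic)."""
    rng = np.random.default_rng(seed)
    null_taus = []
    for _ in range(n_perm):
        y_perm = rng.permutation(y)              # break any X-y association
        j = rng.integers(X.shape[1])             # random feature per permutation
        tau, _ = kendalltau(X[:, j], y_perm)
        null_taus.append(abs(tau))
    return float(np.quantile(null_taus, q))

# pure-noise features: the resulting threshold is small but positive
rng = np.random.default_rng(3)
y = np.repeat([1, 2, 3, 4], 25)
X = rng.standard_normal((100, 20))
t1 = adaptive_threshold(X, y)
print(0.0 < t1 < 0.5)  # permutation-null taus concentrate near zero
```

Any variable whose observed |τ^g| exceeds this threshold is then unlikely to be pure noise at the chosen quantile level.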
6. Interpretability and Sparsity of the Learned Representation
The group-lasso penalty in VBM leads to row-sparsity in the learned basis: only a small subset of variables contributes to the (K−1)-dimensional discriminant subspace. Variables whose class means are monotone in the ordinal response incur less regularization (smaller ordinal weights) and are preferentially retained as the penalty level grows, while non-monotone or noisy features are heavily penalized and typically excluded.
Each selected variable corresponds to a nonzero row of the learned basis and can be directly mapped to interpretable patterns of monotone group-mean shifts in the projected subspace. This facilitates intelligible variable selection, particularly useful in domains such as genomics, where the interpretability of the selected genes is paramount.
Practical results include:
- In low-dimensional synthetic settings, VBM recovers the true ordinal-discriminant variable set under suitable tuning.
- In large-scale gene expression datasets, VBM selects a highly sparse subset (typically 7–20 out of >10,000 genes) while maintaining competitive or superior classification error rates compared to nominal LDA or ordinal logistic regression (Kim et al., 2022).