BB-SSL: Bayesian Bootstrap Spike-and-Slab LASSO
- The paper demonstrates that BB-SSL combines Bayesian bootstrap techniques with spike-and-slab LASSO priors to enable scalable approximate posterior uncertainty quantification.
- It employs randomized MAP optimization with jittered priors and fast coordinate-descent to achieve theoretical contraction rates comparable to exact Bayesian methods.
- The methodology offers substantial computational efficiency, excels in parallel scalability, and outperforms traditional MCMC in high-dimensional sparse regression tasks.
The Bayesian Bootstrap Spike-and-Slab LASSO (BB-SSL) is an inferential methodology that combines Bayesian bootstrap techniques and spike-and-slab LASSO priors to enable scalable approximate posterior uncertainty quantification in high-dimensional sparse regression problems. By leveraging fast coordinate-descent optimization and random perturbations—both of the data and the prior—BB-SSL yields approximate posterior draws that achieve theoretical posterior contraction rates comparable to exact Bayesian inference while offering substantial computational benefits over traditional Markov chain Monte Carlo (MCMC) approaches (Nie et al., 2020).
1. Prior Construction: Spike-and-Slab LASSO and Jittered Priors
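BB-SSL is fundamentally built upon the spike-and-slab LASSO (SSL) prior for linear regression. The model takes the form $y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I_n)$, $\beta \in \mathbb{R}^p$. The SSL prior for each $\beta_j$ specifies a two-component Laplace mixture with mixing probability $\theta$: $\pi(\beta_j \mid \theta) = \theta\,\psi_1(\beta_j) + (1-\theta)\,\psi_0(\beta_j)$, with $\psi_k(\beta) = \frac{\lambda_k}{2} e^{-\lambda_k |\beta|}$ for $k \in \{0, 1\}$, $\lambda_0 \gg \lambda_1$, and a Beta prior on $\theta$.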
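BB-SSL introduces further flexibility through "jittered" priors, in which each coefficient is shrunk not towards zero but towards a random location $\mu_j$, sampled iid from the spike component:
$$\mu_j \overset{\text{iid}}{\sim} \psi_0(\mu) = \frac{\lambda_0}{2}\, e^{-\lambda_0 |\mu|}.$$
The resulting prior is
$$\tilde\pi(\beta \mid \mu, \theta) = \prod_{j=1}^{p} \left[\theta\,\tilde\psi_1(\beta_j; \mu_j) + (1-\theta)\,\tilde\psi_0(\beta_j; \mu_j)\right],$$
where $\tilde\psi_1(\beta; \mu) = \frac{\lambda_1}{2}\, e^{-\lambda_1 |\beta - \mu|}$ and $\tilde\psi_0(\beta; \mu) = \frac{\lambda_0}{2}\, e^{-\lambda_0 |\beta - \mu|}$. This construction attenuates the tendency of standard weighted Bayesian bootstrap (WBB) approaches to collapse small effects exactly to zero (Nie et al., 2020).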
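To make the construction concrete, the following minimal sketch evaluates the SSL mixture density and its jittered counterpart, and draws the random shrinkage targets from the spike. The function names and the parameter values used in any call are illustrative assumptions, not part of the paper.

```python
import numpy as np

def laplace_pdf(x, lam, loc=0.0):
    """Laplace density with rate lam (scale 1/lam), centered at loc."""
    x = np.asarray(x, dtype=float)
    return 0.5 * lam * np.exp(-lam * np.abs(x - loc))

def ssl_prior(beta, theta, lam0, lam1):
    """SSL mixture: theta * slab (rate lam1) + (1 - theta) * spike (rate lam0)."""
    return theta * laplace_pdf(beta, lam1) + (1.0 - theta) * laplace_pdf(beta, lam0)

def jittered_ssl_prior(beta, mu, theta, lam0, lam1):
    """Same mixture, with both components recentered at the location mu."""
    return (theta * laplace_pdf(beta, lam1, mu)
            + (1.0 - theta) * laplace_pdf(beta, lam0, mu))

def sample_jitter(p, lam0, rng):
    """Draw mu_1, ..., mu_p iid from the spike (Laplace with rate lam0)."""
    return rng.laplace(loc=0.0, scale=1.0 / lam0, size=p)
```

Because both components are shifted by the same amount, the jittered density at $\beta$ equals the ordinary SSL density at $\beta - \mu$; a large $\lambda_0$ keeps the jitter tightly concentrated around zero.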
2. Approximate Posterior Sampling via Reweighted MAP Optimization
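BB-SSL employs randomized maximum a posteriori (MAP) optimization to generate approximate posterior draws. Each iteration consists of sampling data weights $w$ from a Dirichlet distribution and random jitter $\mu$ from the Laplace spike, then solving a penalized weighted regression problem:
- Sample $w = (w_1, \dots, w_n) \sim n \cdot \mathrm{Dir}(\alpha, \dots, \alpha)$ (total mass $\sum_i w_i = n$).
- Sample $\mu_j \sim \psi_0$ for each $j = 1, \dots, p$.
- Form the pseudo-likelihood:
$$\tilde L^{w}(\beta) \propto \exp\Big\{-\frac{1}{2\sigma^2} \sum_{i=1}^{n} w_i\,(y_i - x_i^\top \beta)^2\Big\}$$
and the jittered prior $\tilde\pi(\beta \mid \mu)$.
- Compute the MAP estimate $\tilde\beta$ by maximizing
$$\log \tilde L^{w}(\beta) + \log \tilde\pi(\beta \mid \mu).$$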
This reduces to the SSL coordinate-descent algorithm applied to reweighted data, with rows $\sqrt{w_i}\,x_i$ and responses $\sqrt{w_i}\,(y_i - x_i^\top \mu)$, followed by shifting the solution by $\mu$.
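The reduction can be checked with a change of variables. The derivation below is a sketch in assumed notation: $b = \beta - \mu$ is introduced here for illustration, $\pi$ denotes the unshifted SSL prior, and the jittered prior is written as $\pi(\beta - \mu)$.

```latex
\begin{aligned}
\tilde\beta
&= \arg\max_{\beta}\Big\{-\tfrac{1}{2\sigma^2}\sum_{i=1}^{n} w_i\,(y_i - x_i^\top\beta)^2
   + \log\pi(\beta - \mu)\Big\} \\
&= \mu + \arg\max_{b}\Big\{-\tfrac{1}{2\sigma^2}\sum_{i=1}^{n}
   \big(\sqrt{w_i}\,(y_i - x_i^\top\mu) - \sqrt{w_i}\,x_i^\top b\big)^2
   + \log\pi(b)\Big\}.
\end{aligned}
```

The inner problem is an ordinary SSL fit on rescaled rows and jitter-adjusted responses, so an existing SSL solver can be reused without modification.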
Each replicate is independent, enabling straightforward parallelization. Optionally, the mixing weight $\theta$ can be updated via its Beta full conditional.
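The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: $\theta$ is held fixed, the inner solver is a simplified coordinate descent that soft-thresholds with the local slope of the SSL penalty (the refined thresholding used by the actual SSL algorithm is omitted), and all function names and default values are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def ssl_penalty_slope(b, theta, lam0, lam1):
    """Slope of the SSL penalty in |b|: lam1 * p_slab + lam0 * (1 - p_slab)."""
    odds = (1.0 - theta) / theta * (lam0 / lam1) * np.exp(-abs(b) * (lam0 - lam1))
    p_slab = 1.0 / (1.0 + odds)
    return lam1 * p_slab + lam0 * (1.0 - p_slab)

def weighted_ssl_map(X, r, w, theta, lam0, lam1, sigma2=1.0, n_sweeps=50):
    """Simplified coordinate descent (assumption: stand-in for the SSL solver) for
    min_b 0.5 * sum_i w_i (r_i - x_i'b)^2 + sigma2 * SSL_penalty(b)."""
    n, p = X.shape
    b = np.zeros(p)
    resid = np.array(r, dtype=float)               # resid = r - X @ b
    col_norm = (w[:, None] * X**2).sum(axis=0)     # weighted squared column norms
    for _ in range(n_sweeps):
        for j in range(p):
            # weighted inner product of column j with the partial residual
            z = np.dot(w * X[:, j], resid) + col_norm[j] * b[j]
            t = sigma2 * ssl_penalty_slope(b[j], theta, lam0, lam1)
            new = soft_threshold(z, t) / col_norm[j]
            resid -= X[:, j] * (new - b[j])
            b[j] = new
    return b

def bb_ssl(X, y, n_draws, alpha=1.0, theta=0.5, lam0=10.0, lam1=0.1,
           sigma2=1.0, seed=0):
    """Approximate posterior draws via perturbed MAP fits (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    draws = np.empty((n_draws, p))
    for t in range(n_draws):
        w = n * rng.dirichlet(np.full(n, alpha))     # data weights, sum to n
        mu = rng.laplace(0.0, 1.0 / lam0, size=p)    # jitter from the spike
        b = weighted_ssl_map(X, y - X @ mu, w, theta, lam0, lam1, sigma2)
        draws[t] = b + mu                            # shift the solution back
    return draws
```

Each row of the returned array is one approximate posterior draw; since the loop iterations share no state beyond the random generator, they can be distributed across workers.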
3. Induced Pseudo-Posterior and Theoretical Contraction Rates
The distribution of BB-SSL draws can be characterized as the pushforward of the joint law of $(w, \mu)$ through the weighted MAP operator $(w, \mu) \mapsto \tilde\beta(w, \mu)$. Under regularity conditions, these draws approximate the actual posterior $\pi(\beta \mid y)$.
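For sparse normal means ($y_i = \beta_{0,i} + \varepsilon_i$ with Gaussian noise $\varepsilon_i$, $i = 1, \dots, n$, and a $q$-sparse truth $\beta_0$) and weights $w$ satisfying appropriate moment and tail conditions, BB-SSL achieves the minimax contraction rate, of order $q \log(n/q)$ in squared $\ell_2$ error, for the posterior mean squared error (Nie et al., 2020). Analogous results hold in high-dimensional regression ($p > n$), with a near-optimal contraction rate under restricted eigenvalue and sparsity assumptions.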
For multivariate regression, Bayesian bootstrap variants applied to the multivariate SSL (mSSL) yield contraction for the Frobenius and prediction errors at rates governed by the sparsities of the coefficient and precision matrices (Shen et al., 2022).
4. Computational Complexity and Scalability
The main computational cost for BB-SSL is the $T$ replicates of coordinate-descent MAP optimization, each costing one SSL fit. Since each replicate is independent, BB-SSL is "embarrassingly parallel": the total cost scales linearly in $T$ and divides across the available workers. For comparison, standard Gibbs samplers for SSL and other MCMC-based approaches pay their per-iteration cost sequentially and produce serially correlated draws. Fast Gibbs routines for the horseshoe prior reduce the per-iteration cost, but still lack the parallelism and fail to match the computational efficiency of BB-SSL in the full regime (Nie et al., 2020).
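The independence of replicates can be exploited with any task pool. The sketch below is a hedged illustration: `one_replicate` and `run_replicates` are hypothetical names, weighted least squares stands in for the SSL MAP solve to keep the example short, and a thread pool is used only for portability (a process pool or a cluster scheduler would follow the same pattern).

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def one_replicate(X, y, alpha, seed_seq):
    """One perturbed fit. Assumption: weighted least squares replaces the SSL solver."""
    rng = np.random.default_rng(seed_seq)
    n = X.shape[0]
    w = n * rng.dirichlet(np.full(n, alpha))        # Dirichlet weights, sum to n
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return coef

def run_replicates(X, y, n_draws, alpha=1.0, seed=0, max_workers=4):
    # One independent child seed per replicate, so results do not depend
    # on how the work is scheduled across workers.
    children = np.random.SeedSequence(seed).spawn(n_draws)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        draws = list(pool.map(lambda s: one_replicate(X, y, alpha, s), children))
    return np.vstack(draws)
```

Spawning child seeds from a single `SeedSequence` keeps the replicates statistically independent and makes the output reproducible regardless of the number of workers.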
5. Empirical Performance in Simulation and Real Data
BB-SSL closely matches the gold-standard stochastic search variable selection (SSVS) on posterior density estimation and marginal inclusion probabilities in both low- and high-dimensional scenarios. In low-dimensional regression (e.g., with predictors in correlated blocks), BB-SSL tracks SSVS even for multimodal posteriors, in contrast to weighted Bayesian bootstrap (WBB) methods, which assign zero to many coefficients, and Skinny Gibbs, which underestimates posterior variance. On model selection, BB-SSL typically recovers about 99% of the posterior mass, exceeding WBB and Skinny Gibbs.
In high-dimensional settings, BB-SSL maintains robust posterior density estimates and credible intervals. Metrics such as Kullback-Leibler divergence, Jaccard distance, bias in posterior means, and Hamming distance favor BB-SSL or place it on par with Skinny Gibbs, with BB-SSL consistently outperforming WBB. In terms of effective sample size per wall-clock time, BB-SSL dominates, followed by Skinny Gibbs, fast MCMC, and WBB.
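Effective sample size is what makes independent draws attractive relative to a correlated chain. As a rough illustration (one common estimator, not necessarily the one used in the paper), the sketch below computes $N / (1 + 2\sum_k \rho_k)$, summing the estimated autocorrelations $\rho_k$ up to the first non-positive lag; the function name is hypothetical.

```python
import numpy as np

def effective_sample_size(x):
    """ESS = N / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    # Autocovariance via FFT, zero-padded to avoid circular wrap-around
    f = np.fft.rfft(xc, n=2 * n)
    acov = np.fft.irfft(f * np.conj(f), n=2 * n)[:n] / n
    rho = acov / acov[0]
    # Truncate the sum at the first non-positive autocorrelation
    nonpos = np.where(rho[1:] <= 0)[0]
    cutoff = nonpos[0] + 1 if nonpos.size else n
    tau = 1.0 + 2.0 * rho[1:cutoff].sum()
    return n / tau
```

For independent draws the estimate stays close to the number of draws, while a strongly autocorrelated chain of the same length yields a much smaller value; dividing by wall-clock time gives the efficiency metric referred to above.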
Real data analyses demonstrate that BB-SSL produces independent posterior samples at rates orders of magnitude faster than MCMC-based SSVS, with nearly identical marginal posterior densities and inclusion probabilities. For instance, on the Life-Cycle Savings dataset, BB-SSL achieves a larger effective sample size than SSVS (Nie et al., 2020).
6. Practical Guidelines, Limitations, and Extensions
Recommended settings for BB-SSL include enough perturbations to give stable credible intervals, with a Dirichlet concentration parameter $\alpha$ satisfying the theoretical lower bound. A practical default, calibrated to the noise level, is also given. The regularization parameter should be selected to promote sparsity, with precomputed regularization paths reusable across all replicates.
BB-SSL assumes known noise variance $\sigma^2$, which must be specified or estimated via empirical Bayes. While BB-SSL provides posterior contraction rates and competitive empirical uncertainty quantification, it does not deliver exact frequentist coverage. Extensions to generalized linear models (GLMs) involve customized optimization but retain the same weighted-MAP framework. Open questions include the accuracy of high-dimensional posterior approximation (Bernstein–von Mises refinements) and integration with generative bootstrap schemes for further efficiency gains (Nie et al., 2020).
7. Comparison with Related Bayesian Bootstrap and Debiasing Methods
The Bayesian bootstrap overlay for the multivariate SSL (mSSL) operates via randomized MAP solvers with Gamma-distributed weights and optional random recentering, yielding interval estimates from empirical quantiles of resulting replicates. Simulation studies show that these Bayesian bootstrap intervals are substantially shorter yet achieve frequentist coverage close to nominal values when compared to asymptotic de-biasing intervals, which, though valid, are often 3–10 times longer (Shen et al., 2022). The empirical efficiency and scalability of BB-SSL and its multivariate versions suggest a strong practical advantage for high-dimensional sparse inference with uncertainty quantification in contemporary statistical workflows.
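Interval estimates from empirical quantiles are straightforward to compute once the replicates are stored as an array with one row per draw. The helper below is a minimal sketch with a hypothetical name; it returns equal-tailed intervals and draw means for each coefficient.

```python
import numpy as np

def summarize_draws(draws, level=0.95):
    """Equal-tailed intervals and means from an (n_draws, p) array of replicates."""
    draws = np.asarray(draws, dtype=float)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail], axis=0)
    return {"mean": draws.mean(axis=0), "lower": lower, "upper": upper}
```

Interval length and empirical coverage over repeated simulations can then be compared directly against de-biasing intervals for the same coefficients.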