
Bayesian Spike-and-Slab Framework

Updated 1 February 2026
  • Bayesian Spike-and-Slab Framework is a hierarchical model that combines a sharp 'spike' at zero with a diffuse 'slab' to represent coefficients for sparse estimation.
  • It enables effective variable selection and uncertainty quantification in high-dimensional regression by inducing a multimodal posterior over sparse supports.
  • The framework leverages restricted isometry properties and efficient rejection sampling techniques to ensure accurate, scalable posterior sampling with formal computational guarantees.

A Bayesian Spike-and-Slab Framework is a hierarchical probabilistic modeling approach for sparse estimation, variable selection, and uncertainty quantification in high-dimensional inference problems. The framework combines a "spike" component (typically a point mass at zero or a sharply peaked continuous density) with a "slab" component (a diffuse or heavy-tailed density) as a prior distribution over model coefficients, supports, or structural parameters. This yields a multimodal posterior that encodes combinatorial uncertainty over sparsity patterns and continuous uncertainty over effect sizes, with formal guarantees and efficient computational algorithms now available for exact posterior sampling in regimes previously inaccessible to traditional methods (Kumar et al., 4 Mar 2025).

1. Model Specification and Prior Construction

Bayesian spike-and-slab regression is defined via observations $y = X\theta + \epsilon$, with $\epsilon \sim N(0, \sigma^2 I_n)$, $X \in \mathbb{R}^{n \times d}$, and an unknown sparse $\theta \in \mathbb{R}^d$. The canonical spike-and-slab prior is

$$\pi(\theta) = \bigotimes_{i=1}^{d} \Big[(1-\alpha_i)\,\delta_0 + \alpha_i\,\mu\Big]$$

where each coordinate is zero (the "spike") with probability $1-\alpha_i$ or drawn i.i.d. from the slab density $\mu$ (e.g., $N(0,1)$ or Laplace) with probability $\alpha_i$ (Kumar et al., 4 Mar 2025). This induces a prior over supports $S = \mathrm{supp}(\theta)$:

$$P(S) = \prod_{i \in S}\alpha_i \prod_{i \notin S}(1-\alpha_i), \qquad \theta_S \mid S \sim \mu^{\otimes |S|}, \quad \theta_{S^c} = 0.$$

The prior is thus both discrete (over possible supports) and continuous (over effect sizes in the slab).
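As a concrete illustration, drawing from this product prior is straightforward; the sketch below (function name and interface are my own, with a $N(0,1)$ or $\mathrm{Lap}(0,1)$ slab assumed) samples $\theta$ coordinatewise:

```python
import numpy as np

def sample_spike_and_slab_prior(d, alpha, rng, slab="gaussian"):
    """Draw theta ~ prod_i [(1 - alpha_i) delta_0 + alpha_i mu]."""
    alpha = np.broadcast_to(alpha, (d,))
    active = rng.random(d) < alpha            # support indicators
    if slab == "gaussian":
        effects = rng.standard_normal(d)      # mu = N(0, 1)
    else:
        effects = rng.laplace(0.0, 1.0, d)    # mu = Lap(0, 1)
    return np.where(active, effects, 0.0)     # spike: exact zeros

rng = np.random.default_rng(0)
theta = sample_spike_and_slab_prior(d=20, alpha=0.1, rng=rng)
```

With `alpha=0.1`, roughly 10% of coordinates are nonzero on average, reproducing the discrete-plus-continuous character of the prior.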

2. Posterior Characterization and Analytic Structure

Under Gaussian noise and a Gaussian slab, the posterior takes the form

$$\pi(\theta \mid X, y) \propto \exp\left(-\frac{\|y - X\theta\|^2}{2\sigma^2}\right)\pi(\theta).$$

In the Gaussian-slab case, one obtains an explicit mixture representation

$$\pi(\theta \mid y) = \sum_{S \subseteq [d]} w(S)\,\mathcal{N}(\mu_S, \Sigma_S)\cdot 1_{\mathrm{supp}(\theta) \subseteq S}$$

with explicit formulas for the mixture means, covariances, and weights (Kumar et al., 4 Mar 2025):

$$\Sigma_S = \left(X_S^\top X_S/\sigma^2 + I_S\right)^{-1}, \qquad \mu_S = \Sigma_S\, X_S^\top y/\sigma^2$$

$$w(S) \propto \prod_{i \in S}\frac{\alpha_i}{1-\alpha_i}\cdot\det(\Sigma_S)^{1/2}\cdot\exp\left(\frac{1}{2}\mu_S^\top \Sigma_S^{-1}\mu_S\right)$$
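For intuition, the mixture representation can be evaluated exactly by brute-force enumeration of supports when $d$ is tiny; a minimal sketch (my own illustration with a $N(0,1)$ slab, not the paper's algorithm, and only feasible for small $d$):

```python
import itertools
import numpy as np

def posterior_support_weights(X, y, alpha, sigma):
    """Exact posterior over supports S for a Gaussian slab N(0, 1),
    computed by enumerating all 2^d supports (small d only)."""
    n, d = X.shape
    log_w = {}
    for r in range(d + 1):
        for S in itertools.combinations(range(d), r):
            S = list(S)
            XS = X[:, S]
            prec = XS.T @ XS / sigma**2 + np.eye(len(S))   # Sigma_S^{-1}
            Sigma = np.linalg.inv(prec)
            mu = Sigma @ (XS.T @ y / sigma**2)             # mu_S
            # log w(S) = sum log(alpha/(1-alpha)) + (1/2) log det Sigma_S
            #            + (1/2) mu_S^T Sigma_S^{-1} mu_S
            lw = (np.sum(np.log(alpha[S]) - np.log(1 - alpha[S]))
                  + 0.5 * np.linalg.slogdet(Sigma)[1]
                  + 0.5 * mu @ prec @ mu)
            log_w[tuple(S)] = lw
    lws = np.array(list(log_w.values()))
    w = np.exp(lws - lws.max())   # stabilize before normalizing
    w /= w.sum()
    return dict(zip(log_w.keys(), w))
```

Enumerating all $2^d$ supports is exactly the combinatorial explosion the sampling algorithms of Sections 3–4 are designed to avoid.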

For a Laplace slab, the closed-form Gaussian integrals are lost, but the same underlying mixture structure persists, subject to numerical integration (Kumar et al., 4 Mar 2025).

3. Sampling Complexity, Restricted Isometry, and Statistical Regimes

A key insight is that posterior sampling, to be accurate and tractable in high dimensions, demands that $X$ satisfy a restricted isometry property (RIP) up to sparsity $k^\star = O(k + \log(1/\delta))$, where $k$ is the expected sparsity and $\delta$ is the total-variation error tolerance (Kumar et al., 4 Mar 2025). For Gaussian $X$ with i.i.d. $N(0, 1/n)$ entries,

  • Polynomial-time, high-accuracy sampler: $n \geq C k^3 \cdot \mathrm{polylog}(d)$ achieves TV error $\leq \delta$ in $O\big(n^2 d^{1.5}\,\mathrm{polylog}(d/(\delta \min\{1,\sigma\}))\big)$ operations.
  • Near-linear-time sampler: $n \geq C k^5 \cdot \mathrm{polylog}(d)$ achieves the same accuracy in $\widetilde{O}\big(nd \log(1/\min\{1,\sigma\})\big)$ operations (Kumar et al., 4 Mar 2025).

The RIP assumption is satisfied by standard random matrix ensembles (sub-Gaussian, subsampled Fourier, etc.) with a number of samples scaling sublinearly in the dimension $d$. These bounds move past the barriers of strictly linear sample regimes or strong-SNR assumptions common in previous literature.
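Exact RIP verification is computationally hard, but one can probe a design matrix empirically by sampling $k$-column submatrices and checking how far their squared singular values deviate from 1; a hypothetical helper (my own illustration, giving only a Monte Carlo lower bound on the RIP constant):

```python
import numpy as np

def rip_constant_lower_bound(X, k, trials=200, rng=None):
    """Monte Carlo lower bound on the order-k RIP constant of X:
    max over sampled k-subsets S of |s^2 - 1|, where s ranges over the
    singular values of X[:, S]. (Exact RIP certification is NP-hard;
    this only samples random subsets.)"""
    rng = rng or np.random.default_rng()
    n, d = X.shape
    delta = 0.0
    for _ in range(trials):
        S = rng.choice(d, size=k, replace=False)
        s = np.linalg.svd(X[:, S], compute_uv=False)
        delta = max(delta, abs(s.max()**2 - 1.0), abs(s.min()**2 - 1.0))
    return delta
```

For a well-conditioned design such as `X = rng.standard_normal((n, d)) / np.sqrt(n)` with $n \gg k \log d$, the estimated constant stays well below 1, consistent with the $N(0, 1/n)$ ensemble discussed above.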

4. Algorithmic Advances: Hint-Vector Estimation and Product Proposals

The sampling framework proceeds in two stages:

  1. Hint-vector estimation: A fast sparse-recovery routine ($\ell_\infty$- or $\ell_2$-based) yields an initial estimate $\hat{\theta}$ with support $T \approx \mathrm{supp}(\theta^\star)$. With high posterior probability, $T \subseteq S$ for a posterior-drawn support $S$, and $\|\hat{\theta} - \theta\|$ is small (Proposition 2.8).
  2. Coordinatewise product proposal + rejection sampling: Using "recentering" (Lemma 3.3), weights over supports $S \supseteq T$ become simple coordinatewise products. A conditional Poisson product distribution over supports can be sampled in $O(d k^\star)$ time and is provably close (within constant ratios, Lemma 3.7) to the true posterior support mass, enabling TV-accurate rejection sampling of the posterior support $S$ (Kumar et al., 4 Mar 2025).

Once the support $S$ is drawn, sampling $\theta_S \sim N(\mu_S, \Sigma_S)$ is exact (Lemma 3.11).
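A highly simplified sketch of the pipeline shape (not the paper's algorithm: the hint stage below uses plain correlation screening in place of the $\ell_\infty/\ell_2$ recovery routine, and the conditional-Poisson rejection stage is omitted entirely), ending with the exact conditional draw $\theta_S \sim N(\mu_S, \Sigma_S)$:

```python
import numpy as np

def hint_support(X, y, k):
    """Stage 1 (simplified stand-in): score coordinates by |X_i^T y|
    and keep the top k as the hint support T. The paper instead uses a
    fast sparse-recovery routine with formal guarantees."""
    scores = np.abs(X.T @ y)
    return np.sort(np.argsort(scores)[-k:])

def sample_theta_given_support(X, y, S, sigma, rng):
    """Final stage (exact; cf. Lemma 3.11): draw theta_S ~ N(mu_S, Sigma_S)
    and set theta_{S^c} = 0, for a Gaussian slab N(0, 1)."""
    d = X.shape[1]
    S = list(S)
    XS = X[:, S]
    prec = XS.T @ XS / sigma**2 + np.eye(len(S))   # Sigma_S^{-1}
    Sigma = np.linalg.inv(prec)
    mu = Sigma @ (XS.T @ y / sigma**2)             # mu_S
    theta = np.zeros(d)
    theta[S] = rng.multivariate_normal(mu, Sigma)
    return theta
```

The key structural point survives even in this toy version: once a support is fixed, the continuous part of the posterior is an ordinary Gaussian and can be sampled exactly.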

5. Provable Posterior Guarantees, Sparsity, and Estimation-to-Sampling Results

The main theorem (Kumar et al., 4 Mar 2025): under the stipulated RIP and sample-size conditions,

$$\big\|\mathrm{law}(\tilde{\theta}) - \pi(\cdot \mid X, y)\big\|_{TV} \leq \delta.$$

The computational cost is as stated above, and rejection sampling mixes efficiently over the posterior support (mixing cost $O(\mathrm{poly}(C)\log(1/\delta))$). The near-linear-time sampler requires $n \geq C k^5\,\mathrm{polylog}(d)$ samples and achieves comparable accuracy.

Structural lemmas include:

  • Support sparsity (Corollary 2.10): For any product prior, $\pi\big[\,|\mathrm{supp}(\theta)| \leq 6k + O(\log(1/\delta))\,\big] \geq 1-\delta$.
  • Posterior-to-estimation (Proposition 2.8): Any estimator $\hat{\theta}$ that is accurate in a metric $m(\cdot,\cdot)$ with high probability induces a sampling procedure that, with probability $\geq 1-2\delta$, draws $\pi$-samples within twice the estimation error in $m$.

6. Extension to Laplace Slabs

For the slab $\mu(x) \propto \exp(-|x|)$, the Gaussian mixture structure lacks closed-form integrals. The algorithm adapts by:

  • Using Monte Carlo or annealing-based normalizer estimation for each mixture component (Prop 4.1, Cor 4.6) to accuracy $(1 \pm \Delta)$ in $O\big(k^4/\Delta^2 \cdot \mathrm{polylog}(kR/\Delta)\big)$ time.
  • Restricting to $\sigma = O(1/k)$ to control Laplace tail errors (Lemma 4.2).

Theorem 1.3 (Cor 4.14): For $\mu = \mathrm{Lap}(0,1)$, $\sigma = O(1/(k+\log(1/\delta)))$, and $n \geq C k^3\,\mathrm{polylog}(d/\delta)$, one can sample within TV error $\leq \delta$ in $O\big(n^2 d^{1.5}\,\mathrm{polylog}(d/(\sigma\delta)) + k^4/\delta^2 \cdot \mathrm{polylog}(d/(\sigma\delta))\big)$ time. Near-linear-time algorithms analogous to the Gaussian case hold for $n \geq k^5\,\mathrm{polylog}(d)$ (Kumar et al., 4 Mar 2025).
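To make the normalizer-estimation step concrete: the per-support component mass under a Laplace slab has no closed form, but it can be estimated by simple importance sampling from the corresponding Gaussian-slab posterior (a stand-in of my own for the paper's annealing-based estimator of Prop 4.1; all names are illustrative):

```python
import numpy as np

def laplace_normalizer_is(X_S, y, sigma, num_samples=20000, rng=None):
    """Importance-sampling estimate of the Laplace-slab component mass
      Z_S = int exp(-||y - X_S t||^2 / (2 sigma^2)) * prod_i (1/2) e^{-|t_i|} dt,
    using the Gaussian-slab posterior N(mu_S, Sigma_S) as the proposal."""
    rng = rng or np.random.default_rng()
    k = X_S.shape[1]
    prec = X_S.T @ X_S / sigma**2 + np.eye(k)      # Gaussian-slab Sigma_S^{-1}
    Sigma = np.linalg.inv(prec)
    mu = Sigma @ (X_S.T @ y / sigma**2)
    T = rng.multivariate_normal(mu, Sigma, size=num_samples)   # proposals
    # log target: Gaussian likelihood + Lap(0, 1) slab density
    resid = y[None, :] - T @ X_S.T
    log_f = (-0.5 * np.sum(resid**2, axis=1) / sigma**2
             - np.sum(np.abs(T), axis=1) - k * np.log(2.0))
    # log proposal density N(mu, Sigma)
    diff = T - mu
    _, logdet = np.linalg.slogdet(Sigma)
    log_q = (-0.5 * np.einsum('ij,jk,ik->i', diff, prec, diff)
             - 0.5 * (k * np.log(2 * np.pi) + logdet))
    return np.exp(log_f - log_q).mean()            # E_q[f/q] = Z_S
```

Because the Gaussian-slab proposal matches the likelihood term exactly, the importance weights vary only through the mild factor $e^{\|t\|^2/2 - \|t\|_1}$, so the estimator is well behaved when the posterior is concentrated; the paper's annealing scheme additionally yields formal $(1 \pm \Delta)$ accuracy guarantees.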

7. Technical and Applied Impact

This framework supplies the first polynomial-time, sublinear-measurement, provably exact samplers for spike-and-slab posteriors in high-dimensional sparse linear regression, valid for all SNRs, with a flexible extension to Laplace slabs (Kumar et al., 4 Mar 2025). The approach unifies fast sparse-recovery algorithms, precise RIP-based concentration arguments, and rejection sampling with tractable conditional-Poisson support proposals to obtain rigorous total-variation accuracy and running-time guarantees.

The framework establishes spike-and-slab posterior sampling—with certifiable accuracy and scalable computation—as the theoretical and practical gold standard for Bayesian sparse regression in high dimensions.

