Dirac Spike-and-Slab Prior

Updated 17 January 2026

The Dirac spike-and-slab prior is a Bayesian mixture prior that combines a Dirac delta spike at zero with a continuous slab, promoting exact sparsity in model parameters.
Empirical Bayes techniques calibrate the mixing weight via marginal likelihood, with the choice of slab density (Laplace vs Cauchy) critically affecting posterior contraction rates.
For a fixed mixing weight, the full posterior factorizes across coefficients, which supports model selection and adaptive regularization; hierarchical models can offer improved uncertainty quantification.

The Dirac spike-and-slab prior is a foundational Bayesian mixture prior designed to induce exact sparsity in parameter estimation, especially for high-dimensional models involving variable or function selection. It takes the form of a product mixture where the "spike" is a Dirac delta mass at zero, ensuring that some parameters are identically zero with positive probability, while the "slab" is a diffuse continuous density—typically Gaussian, Laplace, or Cauchy—capturing the nonzero coefficients. The empirical Bayes approach calibrates the key mixing weight via marginal maximum likelihood, and selection of slab density is central for achieving optimal posterior contraction rates. The full posterior under this prior is a product of discrete-continuous mixtures across model coefficients, with each coordinate governed by its inclusion probability and corresponding slab.

1. Mathematical Formulation and Model Structure

The canonical sparse normal-means model observes $X_i = \theta_i + \epsilon_i$, with independent $\epsilon_i \sim N(0,1)$ for $i = 1, \dots, n$, and assumes that the true mean vector satisfies $\theta_0 \in \ell_0[s_n]$, i.e., it has at most $s_n$ nonzero entries. The Dirac spike-and-slab prior is specified as

$\Pi_\alpha = \bigotimes_{i=1}^n \left[ (1-\alpha)\, \delta_0 + \alpha\, G \right],$

where each $\theta_i$ is drawn independently from

$\pi(\theta_i | \alpha) = (1-\alpha)\, \delta_0(\theta_i) + \alpha\, g(\theta_i).$

Here, $\delta_0$ is the Dirac delta at $0$ (the spike), $g$ is a continuous slab density (the density of the distribution $G$), and $\alpha \in [0,1]$ is the mixing proportion controlling expected sparsity. Under the Gaussian likelihood, the posterior remains in product form with closed expressions for inclusion probabilities and conditional slab densities:

$\Pi_\alpha(d\theta|X) = \bigotimes_{i=1}^n \left[ (1 - a_\alpha(X_i))\,\delta_0 + a_\alpha(X_i)\, G_{X_i} \right],$

where

$a_\alpha(x) = \frac{\alpha\, g_X(x)}{(1-\alpha)\phi(x) + \alpha\, g_X(x)},$

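with $\phi$ the standard normal density, $g_X = \phi * g$ the convolution of $\phi$ with $g$ (the marginal density of an observation whose mean is drawn from the slab), and $G_x$ the slab posterior given $X_i = x$, whose density is proportional to $\phi(x - \theta)\, g(\theta)$.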
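The inclusion probability above can be evaluated directly once $g_X$ is available. The following is a minimal sketch for the unit Laplace slab, for which completing the square in the convolution integral gives a closed form for $g_X$. The function names are illustrative and not taken from the source, and the value $\alpha = 0.05$ in the demo is an arbitrary assumption.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def laplace_marginal(x):
    """g_X(x) = (phi * g)(x) for the unit Laplace slab g(t) = exp(-|t|) / 2.

    The expression is symmetric in x, so it is evaluated at |x|.
    Note: math.exp overflows for |x| above roughly 700.
    """
    ax = abs(x)
    return 0.5 * math.exp(0.5) * (
        math.exp(-ax) * Phi(ax - 1.0) + math.exp(ax) * Phi(-ax - 1.0)
    )


def inclusion_probability(x, alpha):
    """a_alpha(x) = alpha g_X(x) / ((1 - alpha) phi(x) + alpha g_X(x))."""
    gx = laplace_marginal(x)
    return alpha * gx / ((1.0 - alpha) * phi(x) + alpha * gx)


if __name__ == "__main__":
    # alpha = 0.05 is an arbitrary illustrative weight, not a recommended value.
    for x in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(f"x = {x:.1f}  a_alpha(x) = {inclusion_probability(x, 0.05):.4f}")
```

Because $g_X/\phi$ increases with $|x|$, the inclusion probability rises from a small value near $x = 0$ towards one for large observations, which is the per-coordinate mechanism behind the exact zeros.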

2. Empirical Bayes Calibration

Empirical Bayes estimation uses marginal maximum likelihood to select the sparsity parameter $\alpha$. The marginal log-likelihood for $X$ is

$\ell_n(\alpha; X) = \sum_{i=1}^n \log\left( (1-\alpha)\phi(X_i) + \alpha\, g_X(X_i) \right).$

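A threshold-based lower bound $\alpha_n$ restricts the maximization domain; for example, $\alpha_n$ can be defined through $t(\alpha_n) = \sqrt{2\log n}$, where $t(\alpha)$ denotes the thresholding level associated with the weight $\alpha$. The mixing parameter is then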

$\hat{\alpha} = \arg\max_{\alpha \in [\alpha_n, 1]} \ell_n(\alpha; X).$

Plug-in posteriors $\Pi_{\hat{\alpha}}$ yield sparse estimates that adapt to the empirical complexity of the problem.
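The maximization can be sketched as follows for the Laplace slab. Since $\ell_n$ is a sum of logarithms of functions that are affine in $\alpha$, it is concave, so the constrained maximizer is either an endpoint or the unique root of the score, which bisection finds. This is only an illustration under stated assumptions: the lower bound `alpha_min` is a hypothetical stand-in for $\alpha_n$ (the threshold function $t(\cdot)$ is not implemented here), and the simulated data and function names are not from the source.

```python
import math
import random


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def laplace_marginal(x):
    """Closed form of g_X = phi * g for the unit Laplace slab."""
    ax = abs(x)
    return 0.5 * math.exp(0.5) * (
        math.exp(-ax) * Phi(ax - 1.0) + math.exp(ax) * Phi(-ax - 1.0)
    )


def score(alpha, betas):
    """Derivative of ell_n in alpha, with beta_i = g_X(X_i) / phi(X_i) - 1."""
    return sum(b / (1.0 + alpha * b) for b in betas)


def marginal_mle(xs, alpha_min, tol=1e-10):
    """Maximise ell_n(alpha; X) over [alpha_min, 1] by bisection on the score.

    Assumes |x| is moderate (below about 38) so that phi(x) does not underflow.
    """
    betas = [laplace_marginal(x) / phi(x) - 1.0 for x in xs]
    lo, hi = alpha_min, 1.0
    if score(lo, betas) <= 0.0:   # log-likelihood already decreasing at alpha_min
        return lo
    if score(hi, betas) >= 0.0:   # log-likelihood still increasing at alpha = 1
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, betas) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    n, s = 500, 25                      # illustrative sizes
    theta = [5.0] * s + [0.0] * (n - s)  # s signals of size 5, the rest zero
    xs = [t + rng.gauss(0.0, 1.0) for t in theta]
    alpha_hat = marginal_mle(xs, alpha_min=1.0 / n)  # 1/n is a placeholder bound
    print(f"alpha_hat = {alpha_hat:.4f}  (true fraction of signals = {s / n:.4f})")
```

The plug-in posterior is then obtained by substituting the returned value for $\alpha$ in the inclusion probabilities $a_\alpha(X_i)$.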

3. Convergence Rates and Slab Selection

Let $r_n = 2 s_n \log(n/s_n)$ denote the minimax rate for the squared $\ell_2$ risk over $\ell_0[s_n]$. Under empirical Bayes calibration, the choice of slab density critically affects concentration of the full posterior:

Laplace slab ($g(\theta) = \frac{1}{2} e^{-|\theta|}$) leads to a suboptimal full posterior risk in the worst case over $\theta_0 \in \ell_0[s_n]$: $E_{\theta_0}\int \|\theta - \theta_0\|^2\, d\Pi_{\hat\alpha}(\theta | X) \gtrsim s_n \exp( \sqrt{\log(n/s_n)} ),$ which exceeds $r_n$ by a factor that grows without bound as $n/s_n \to \infty$.
Cauchy slab ($g(\theta) = \frac{1}{\pi}(1+\theta^2)^{-1}$) delivers optimal concentration: $\sup_{\theta_0 \in \ell_0[s_n]} E_{\theta_0} \int \|\theta - \theta_0\|^2\, d\Pi_{\hat\alpha}(\theta|X) \le C r_n.$

A plausible implication is that heavy-tailed slabs enable minimax contraction of the full posterior under empirical Bayes, whereas Laplace-type slabs may only suffice for posterior mean or median estimates.
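To get a feel for how the two rate expressions compare, the short calculation below evaluates $r_n$ and $s_n \exp(\sqrt{\log(n/s_n)})$ for a few values of $n/s_n$. It is a numerical illustration of the formulas as written, with the unspecified constants in $\gtrsim$ and $C$ set to one (an assumption), so only the growth of the ratio is meaningful, not its absolute size.

```python
import math


def minimax_rate(n, s):
    """r_n = 2 s log(n / s)."""
    return 2.0 * s * math.log(n / s)


def laplace_lower_rate(n, s):
    """s exp(sqrt(log(n / s))), with the constant in the lower bound set to 1."""
    return s * math.exp(math.sqrt(math.log(n / s)))


def rate_ratio(n, s):
    """Ratio of the Laplace lower-bound rate to the minimax rate."""
    return laplace_lower_rate(n, s) / minimax_rate(n, s)


if __name__ == "__main__":
    s = 10
    for ratio in (1e3, 1e6, 1e12, 1e24):
        n = s * ratio
        print(f"n/s = {ratio:.0e}  r_n = {minimax_rate(n, s):10.1f}  "
              f"Laplace bound = {laplace_lower_rate(n, s):10.1f}  "
              f"ratio = {rate_ratio(n, s):.2f}")
```

With constants ignored, the two expressions are of comparable size at moderate $n/s_n$ and separate only slowly, since $\exp(\sqrt{L})/(2L)$ grows in $L = \log(n/s_n)$ once $L > 4$. The suboptimality is therefore an asymptotic statement.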

4. Structure of the Full Posterior Distribution

The posterior induced by a Dirac spike-and-slab prior, under normal likelihood and for a given mixing weight, factors across coordinates. Each coordinate exhibits an exact-zero event with positive probability due to the spike, and, conditional on inclusion, exhibits posterior shrinkage around the observed data modulated by the slab density. This mixture structure supports model selection and facilitates interpretation: inactive coefficients are exactly zero, while active ones are adaptively regularized by the slab.

For empirical Bayes, the posterior mean and median may achieve minimax rates under Laplace slab, but the credible set diameter and second moment can be substantially inflated, underscoring the necessity of considering full posterior properties rather than summary statistics alone.

5. Slab Density Choice: Comparative Analysis

Slab Density	Slab Function	Full Posterior Second-Moment Rate (Empirical Bayes)
Laplace	$g(\theta) = \frac{1}{2} e^{-|\theta|}$	Suboptimal: lower bound of order $s_n \exp(\sqrt{\log(n/s_n)})$
Cauchy	$g(\theta) = \frac{1}{\pi}(1+\theta^2)^{-1}$	Optimal: upper bound $C r_n$ (minimax rate)

These results suggest that, under empirical Bayes calibration, heavy-tailed slabs are needed for minimax contraction of the full posterior and for credible sets of appropriate size. The choice of slab density thus determines whether the plug-in procedure achieves optimal uncertainty quantification.
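The difference between the two slabs shows up in the marginal density $g_X = \phi * g$, which inherits the tail of $g$. The sketch below approximates the convolution with a composite trapezoidal rule so that it works for any slab density, and compares Laplace and Cauchy. The quadrature settings (`half_width`, `steps`), the function names, and the weight $\alpha = 0.05$ are illustrative assumptions, not part of the source.

```python
import math


def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def laplace_slab(t):
    """Unit Laplace density g(t) = exp(-|t|) / 2."""
    return 0.5 * math.exp(-abs(t))


def cauchy_slab(t):
    """Standard Cauchy density g(t) = 1 / (pi (1 + t^2))."""
    return 1.0 / (math.pi * (1.0 + t * t))


def slab_marginal(x, slab, half_width=10.0, steps=4000):
    """Approximate g_X(x) = integral of phi(u) slab(x + u) du over |u| <= half_width.

    Composite trapezoidal rule; phi is negligible beyond |u| = 10.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        u = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * phi(u) * slab(x + u)
    return total * h


def inclusion_probability(x, alpha, slab):
    """a_alpha(x) for a generic slab, using the numerical marginal."""
    gx = slab_marginal(x, slab)
    return alpha * gx / ((1.0 - alpha) * phi(x) + alpha * gx)


if __name__ == "__main__":
    alpha = 0.05  # arbitrary illustrative weight
    for x in (0.0, 2.0, 4.0, 6.0, 8.0):
        gl = slab_marginal(x, laplace_slab)
        gc = slab_marginal(x, cauchy_slab)
        print(f"x = {x:.0f}  g_X Laplace = {gl:.2e}  g_X Cauchy = {gc:.2e}  "
              f"a Laplace = {inclusion_probability(x, alpha, laplace_slab):.4f}  "
              f"a Cauchy = {inclusion_probability(x, alpha, cauchy_slab):.4f}")
```

For large $|x|$ the Cauchy marginal decays polynomially while the Laplace marginal decays exponentially, so the Cauchy slab assigns far more marginal density to large observations.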

6. Hierarchical versus Plug-in Bayes and Complexity Penalization

While empirical Bayes using marginal MLE calibration of $\alpha$ can result in undersmoothing (especially for Laplace slabs) and suboptimal full posterior contraction, fully hierarchical Bayes (placing, e.g., a Beta prior on $\alpha$) recovers minimax posterior concentration even with a Laplace slab. This reflects the role of the hierarchical complexity penalty in controlling over-inclusion, i.e. the undersmoothing just described. In practical terms, empirical Bayes credible balls for Laplace slabs cover well but are too large; hierarchical approaches temper this inflation by integrating over sparsity levels.
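A hierarchical version can be sketched by putting a Beta prior on $\alpha$ and computing its posterior on a grid. Conditionally on $\alpha$ the coordinates are independent, so the marginal inclusion probability of coordinate $i$ is the posterior average of $a_\alpha(X_i)$ over $\alpha$. The Beta$(1, n+1)$ hyperparameters, the grid size, the simulated data, and the function names below are illustrative assumptions rather than choices stated in the source.

```python
import math
import random


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def laplace_marginal(x):
    """Closed form of g_X = phi * g for the unit Laplace slab."""
    ax = abs(x)
    return 0.5 * math.exp(0.5) * (
        math.exp(-ax) * Phi(ax - 1.0) + math.exp(ax) * Phi(-ax - 1.0)
    )


def alpha_posterior_grid(xs, a, b, grid_size=1000):
    """Posterior weights of alpha on a midpoint grid under a Beta(a, b) prior.

    The likelihood is prod_i ((1 - alpha) phi(X_i) + alpha g_X(X_i)); the factor
    prod_i phi(X_i) does not depend on alpha and is dropped.
    """
    ratios = [laplace_marginal(x) / phi(x) for x in xs]
    alphas = [(k + 0.5) / grid_size for k in range(grid_size)]
    log_post = []
    for al in alphas:
        lp = (a - 1.0) * math.log(al) + (b - 1.0) * math.log(1.0 - al)
        lp += sum(math.log((1.0 - al) + al * r) for r in ratios)
        log_post.append(lp)
    m = max(log_post)                       # subtract the max for stability
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return alphas, [w / z for w in weights]


def hierarchical_inclusion(x, alphas, weights):
    """Posterior average of a_alpha(x) over the grid posterior of alpha."""
    r = laplace_marginal(x) / phi(x)
    return sum(w * al * r / ((1.0 - al) + al * r) for al, w in zip(alphas, weights))


if __name__ == "__main__":
    rng = random.Random(1)
    n, s = 200, 10                                        # illustrative sizes
    xs = [rng.gauss(0.0, 1.0) for _ in range(n - s)]
    xs += [6.0 + rng.gauss(0.0, 1.0) for _ in range(s)]   # last s are signals
    alphas, weights = alpha_posterior_grid(xs, a=1.0, b=n + 1.0)
    post_mean = sum(al * w for al, w in zip(alphas, weights))
    print(f"posterior mean of alpha = {post_mean:.4f}")
    print(f"inclusion at smallest |x| = "
          f"{hierarchical_inclusion(min(xs, key=abs), alphas, weights):.4f}")
    print(f"inclusion at largest |x|  = "
          f"{hierarchical_inclusion(max(xs, key=abs), alphas, weights):.4f}")
```

Compared with the plug-in approach, nothing is maximized here: the prior on $\alpha$ penalizes dense configurations, and uncertainty about the sparsity level is carried through to the inclusion probabilities.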

7. Practical and Theoretical Implications

Empirical analysis and theoretical results indicate that the Dirac spike-and-slab prior creates exact zeros in the posterior, supports model selection, and, when paired with heavy-tailed slabs or hierarchical calibration, yields minimax contraction rates for the entire posterior. The resulting posterior separates included from excluded variables through the per-coordinate inclusion probabilities. However, practitioners should avoid naive plug-in approaches with Laplace slabs for uncertainty quantification tasks; hierarchical formulations or heavy-tailed slabs are required for fully optimal inference.

References:

Castillo & Mismer (2018), "Empirical Bayes analysis of spike and slab posterior distributions".
