Spike-and-Slab Changepoint Selection

Updated 18 April 2026
  • The paper demonstrates that spike-and-slab priors enable accurate detection and localization of changepoints with consistency guarantees under varying noise conditions.
  • Methodologies such as joint Gibbs sampling and marginal solo approaches balance computational tractability with robust statistical inference in both univariate and dynamic regression settings.
  • Empirical results show that the spike-and-slab framework matches or exceeds frequentist methods in accuracy, especially under heavy-tailed or contaminated noise, and that its marginal variant scales far better than MCMC-based Bayesian sampling.

Spike-and-slab priors provide a principled Bayesian framework for selecting changepoints in time series and regression models by distinguishing between signal and noise via a binary latent indicator structure. These methodologies enable consistent estimation of both the number and locations of changepoints and are notable for balancing computational tractability, model flexibility, and robust statistical guarantees under a broad range of noise conditions.

1. Model Foundations and Spike-and-Slab Priors

The canonical setting for Bayesian changepoint analysis with spike-and-slab priors assumes a univariate time series $Y_t = f_t + \epsilon_t$, $t = 1,\dots,T$, where $\epsilon_t \overset{\rm iid}{\sim} N(0, \sigma^2)$. The signal $f_t$ is modeled as piecewise constant, with $K$ unknown changepoints $\mathcal{C}^* = \{\eta_1,\dots,\eta_K\}$. Segment means $\{\mu_k\}$ define $f_t$, and changepoints correspond to jumps in the increments $\Delta f_t = f_t - f_{t-1}$. An auxiliary binary sequence $\{Z_t\}$ encodes changepoint structure: $Z_t = \begin{cases} 1, & \Delta f_t \ne 0 \\ 0, & \Delta f_t \approx 0 \end{cases}$ Spike-and-slab priors are placed on the increments $\Delta f_t$ via these indicators:

  • Each indicator $Z_t$ receives a Bernoulli prior whose inclusion probability is a hyperparameter.
  • Conditional on $Z_t$, the increment $\Delta f_t$ is drawn from one of two components: a narrow spike concentrated near zero when $Z_t = 0$, or a diffuse slab when $Z_t = 1$.

The spike and slab scales are hyperparameters of the prior.

The spike imposes strong shrinkage to zero (favoring continuity), while the slab allows for large jumps (detecting changepoints) (Cappello et al., 2021).
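
To make the data model concrete, here is a minimal Python sketch that simulates a piecewise-constant signal with Gaussian noise and recovers the increments and exact-jump indicators from the noiseless signal. The function name `simulate_piecewise_constant`, the 1-based changepoint convention, and the choice to set the first increment to zero are illustrative assumptions, not details taken from Cappello et al. (2021).

```python
import random

def simulate_piecewise_constant(T, changepoints, means, sigma, seed=0):
    """Draw Y_t = f_t + eps_t with piecewise-constant f_t (illustrative helper).

    changepoints: sorted 1-based indices in 2..T at which f_t jumps.
    means: segment means, one more than the number of changepoints.
    """
    assert len(means) == len(changepoints) + 1
    rng = random.Random(seed)
    f, seg = [], 0
    for t in range(1, T + 1):
        # Move to the next segment when t reaches a changepoint.
        if seg < len(changepoints) and t == changepoints[seg]:
            seg += 1
        f.append(means[seg])
    y = [ft + rng.gauss(0.0, sigma) for ft in f]
    # Increments Delta f_t = f_t - f_{t-1}; the first one is set to 0 (assumption).
    delta = [0.0] + [f[i] - f[i - 1] for i in range(1, T)]
    # Indicators of exact jumps in the noiseless signal.
    z = [1 if d != 0.0 else 0 for d in delta]
    return y, f, delta, z

y, f, delta, z = simulate_piecewise_constant(
    T=12, changepoints=[5, 9], means=[0.0, 2.0, -1.0], sigma=0.5)
jump_times = [i + 1 for i, zi in enumerate(z) if zi]  # 1-based jump locations
```

In practice only `y` is observed; the inference task is to recover `jump_times` from it.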

For regression settings, a dynamic spike-and-slab prior can be placed hierarchically on time-varying regression coefficients, with a binary indicator for each coefficient at each time controlling the switch between spike and slab regimes (Uribe et al., 2020).

2. Posterior Inference Algorithms

Two primary algorithmic frameworks have been proposed:

A. Joint Gibbs Sampling (basad.cp)

The MCMC strategy jointly samples from the posterior:

  • At each iteration, the increments $\Delta f_t$ and indicators $Z_t$ are updated for all $t$.
  • Provides high accuracy but suffers from slow mixing and rising computational cost as $T$ grows.
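
The flavor of such a sampler can be sketched in plain Python. This is not the basad.cp implementation: it is a simplified single-site Gibbs sampler that assumes Gaussian spike and slab components $N(0, \sigma^2\tau_0^2)$ and $N(0, \sigma^2\tau_1^2)$, a Bernoulli($q$) prior on each indicator, and a known noise level $\sigma$. The function name and all default values are hypothetical.

```python
import math
import random

def gibbs_inclusion_probs(y, sigma=1.0, tau0=0.05, tau1=5.0, q=0.05,
                          n_iter=300, burn=100, seed=0):
    """Estimate P(Z_t = 1 | y) with a single-site Gibbs sampler (sketch)."""
    rng = random.Random(seed)
    T, s2 = len(y), sigma ** 2
    beta = [0.0] * T           # beta[0]: initial level; beta[t]: increment, t >= 1
    z = [1] + [0] * (T - 1)    # the level always uses the slab (assumption)
    counts = [0] * T
    for it in range(n_iter):
        # Residuals y_t - f_t, where f_t is the running sum of beta.
        level, resid = 0.0, []
        for t in range(T):
            level += beta[t]
            resid.append(y[t] - level)
        S = 0.0                # sum of current residuals over indices >= t
        for t in range(T - 1, -1, -1):
            S += resid[t]
            n = T - t          # number of observations affected by beta[t]
            if t > 0:
                # Update Z_t given beta[t] from the two-component prior odds.
                b2 = beta[t] ** 2 / (2.0 * s2)
                log_slab = math.log(q) - math.log(tau1) - b2 / tau1 ** 2
                log_spike = math.log(1.0 - q) - math.log(tau0) - b2 / tau0 ** 2
                p1 = 1.0 / (1.0 + math.exp(log_spike - log_slab))
                z[t] = 1 if rng.random() < p1 else 0
            # Update beta[t] given Z_t and all other increments (conjugate Gaussian).
            tau = tau1 if z[t] else tau0
            prec = n + 1.0 / tau ** 2      # posterior precision times sigma^2
            mean = (S + n * beta[t]) / prec
            new = rng.gauss(mean, math.sqrt(s2 / prec))
            S -= n * (new - beta[t])       # keep S consistent with the new value
            beta[t] = new
        if it >= burn:
            for t in range(T):
                counts[t] += z[t]
    return [c / (n_iter - burn) for c in counts]

data_rng = random.Random(1)
y_demo = [data_rng.gauss(0.0 if t < 20 else 4.0, 1.0) for t in range(40)]
probs_demo = gibbs_inclusion_probs(y_demo)
```

Because neighboring increments are strongly correlated under this parameterization, one-at-a-time updates move slowly between adjacent candidate locations, which illustrates the mixing problem noted above.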

B. Marginal Solo Spike-and-Slab (solo.cp)

This approach considers one candidate changepoint location $t$ at a time:

  • A spike-and-slab prior is placed on the increment $\Delta f_t$ at the candidate; all other increments receive conjugate Gaussian priors.
  • Marginalization over nuisance parameters leads to a closed-form two-component Gaussian mixture for the marginal posterior of $\Delta f_t$.
  • The posterior inclusion probability $P(Z_t = 1 \mid Y_{1:T})$ is directly computed using mixture weights and prior probabilities.
  • The solo.cp algorithm entails a forward recursion shared by all candidates $t$ and a separate backward pass per candidate, with no reliance on MCMC.
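
The way mixture weights and prior probabilities combine into an inclusion probability can be illustrated in a stripped-down scalar setting. Assume a single noisy summary $d \sim N(\Delta, s^2)$ of one increment $\Delta$, with prior $\Delta \sim (1-q)\,N(0, v_0) + q\,N(0, v_1)$. Marginally, $d$ is then a two-component Gaussian mixture with variances $v_0 + s^2$ and $v_1 + s^2$, and the inclusion probability follows from Bayes' rule. This is a textbook simplification with hypothetical names; it does not reproduce the forward and backward recursions of solo.cp.

```python
import math

def inclusion_probability(d, s2, q, v0, v1):
    """P(Z = 1 | d) for d ~ N(delta, s2), delta ~ (1-q) N(0, v0) + q N(0, v1)."""
    # Log of each weighted marginal density; the shared 2*pi constant cancels.
    log_slab = math.log(q) - 0.5 * math.log(v1 + s2) - d * d / (2.0 * (v1 + s2))
    log_spike = math.log(1.0 - q) - 0.5 * math.log(v0 + s2) - d * d / (2.0 * (v0 + s2))
    # Computed via the log-odds for numerical stability (assumes v1 >= v0).
    return 1.0 / (1.0 + math.exp(log_spike - log_slab))

p_small = inclusion_probability(0.0, s2=1.0, q=0.1, v0=0.01, v1=25.0)
p_large = inclusion_probability(10.0, s2=1.0, q=0.1, v0=0.01, v1=25.0)
```

A summary near zero is attributed to the spike (`p_small` is close to 0), while a large one is attributed to the slab (`p_large` is close to 1).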

For dynamic linear models, a state-space structure is employed, and posterior inference is performed using a Gibbs sampler that alternates between Forward-Filtering Backward-Sampling (FFBS) for coefficients and block updates of spike/slab indicators via schemes such as the Gerlach–Carter–Kohn algorithm for efficiency (Uribe et al., 2020).
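
As a rough illustration of the FFBS step, the sketch below handles a single time-varying coefficient in the model $y_t = x_t \beta_t + v_t$, $\beta_t = \beta_{t-1} + w_t$, with $v_t \sim N(0, V)$ and $w_t \sim N(0, W_t)$. The state variance $W_t$ is taken as given, for example small in a spike regime and large in a slab regime. The scalar setting, the function name `ffbs`, and the prior defaults are assumptions made for brevity; the indicator updates of the Gerlach–Carter–Kohn scheme are not shown.

```python
import math
import random

def ffbs(y, x, W, V, m0=0.0, C0=10.0, seed=0):
    """Draw one path beta_1..beta_T from its joint posterior (scalar FFBS sketch)."""
    rng = random.Random(seed)
    T = len(y)
    m, C, R = [0.0] * T, [0.0] * T, [0.0] * T
    m_prev, C_prev = m0, C0
    # Forward filtering: Kalman recursions for the filtered means and variances.
    for t in range(T):
        R[t] = C_prev + W[t]                   # prior variance of beta_t
        Q = x[t] * x[t] * R[t] + V             # one-step forecast variance of y_t
        A = R[t] * x[t] / Q                    # Kalman gain
        m[t] = m_prev + A * (y[t] - x[t] * m_prev)
        C[t] = R[t] - A * A * Q
        m_prev, C_prev = m[t], C[t]
    # Backward sampling: draw beta_T, then beta_t given beta_{t+1}.
    beta = [0.0] * T
    beta[T - 1] = rng.gauss(m[T - 1], math.sqrt(max(C[T - 1], 0.0)))
    for t in range(T - 2, -1, -1):
        B = C[t] / R[t + 1]
        h = m[t] + B * (beta[t + 1] - m[t])    # predicted mean of beta_{t+1} is m[t]
        H = C[t] - B * B * R[t + 1]
        beta[t] = rng.gauss(h, math.sqrt(max(H, 0.0)))
    return beta

y_obs = [0.0, 0.0, 0.0, 3.0, 3.0, 3.0]
path = ffbs(y_obs, x=[1.0] * 6, W=[1.0] * 6, V=1e-8)
```

With a near-zero observation variance and unit regressors, the sampled path tracks the observations almost exactly, which is a quick sanity check on the recursions.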

3. Estimation of Changepoints and Model Selection

The spike-and-slab formalism enables summary statistics for changepoint selection:

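  • Posterior inclusion probabilities $P(Z_t = 1 \mid Y_{1:T})$ are computed for each $t$.
  • A median-probability model selects raw changepoint candidates as the indices whose inclusion probability exceeds $1/2$.
  • To mitigate spurious detection of consecutive changepoints, close indices (within a user-specified window) are clustered, retaining only the member with the highest inclusion probability in each cluster.
  • The estimated changepoint set $\hat{\mathcal{C}}$ consists of the retained indices, and the estimated number of changepoints is $\hat{K} = |\hat{\mathcal{C}}|$.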
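
The selection steps above can be sketched as follows. The clustering rule used here (chain together candidates whose gap to the previous candidate is at most `window`) is one plausible reading of "close indices" and is an assumption, as are the function name and the example probabilities.

```python
def select_changepoints(probs, window=2, threshold=0.5):
    """Median-probability selection followed by clustering of nearby candidates.

    probs[i] is the posterior inclusion probability at time index i.
    """
    # Step 1: median-probability model keeps indices with probability > 1/2.
    candidates = [i for i, p in enumerate(probs) if p > threshold]
    # Step 2: group candidates whose gap to the previous one is <= window.
    clusters, current = [], []
    for i in candidates:
        if current and i - current[-1] > window:
            clusters.append(current)
            current = []
        current.append(i)
    if current:
        clusters.append(current)
    # Step 3: keep the most probable index in each cluster.
    selected = [max(c, key=lambda i: probs[i]) for c in clusters]
    return selected, len(selected)

probs = [0.01, 0.02, 0.60, 0.90, 0.55, 0.03, 0.01, 0.70, 0.02, 0.04]
cps, k_hat = select_changepoints(probs, window=2)
# cps == [3, 7] and k_hat == 2: indices 2, 3, 4 collapse to the single peak at 3.
```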

For regression models, the posterior probability of a regime change summarizes, for each coefficient and time index, how likely it is that the coefficient switched shrinkage regimes (Uribe et al., 2020).

4. Theoretical Guarantees

Rigorous model selection properties are established under specific asymptotic regimes:

  • If the true minimal jump size and the minimal spacing between changepoints are sufficiently large, and the prior hyperparameters are scaled appropriately, then the MAP estimator is consistent for the number of changepoints and attains a near-optimal localization rate (Cappello et al., 2021).

Single-changepoint regimes require a weaker signal-to-noise condition to attain near-optimal location accuracy.

5. Computational Complexity and Scalability

The solo.cp algorithm achieves significant gains in scalability over traditional Bayesian MCMC:

  • It is free of sampling and mixing concerns; implementations with $T$ in the low thousands complete in seconds to minutes on a single CPU.
  • The basad.cp variant requires full Gibbs sampling runs and may take multiple hours (Cappello et al., 2021).

For dynamic models with multiple regressors, each Gibbs sweep comprises one FFBS pass (costed under a diagonal variance assumption) and updates of the remaining parameter blocks, which can be parallelized or vectorized (Uribe et al., 2020).

6. Empirical Assessment and Robustness

Empirical benchmarks validate the spike-and-slab changepoint selection framework:

  • On simulated signals (BLOCKS and TEETH) and under various noise models (including Gaussian, Laplace, and mixture-Gaussian), the solo.cp method matches or exceeds the accuracy of state-of-the-art frequentist approaches (wbs, smuce, pelt, r-fpop), especially under heavy-tailed or contaminated noise.
  • Frequentist methods typically overestimate the number of changepoints $K$ in the presence of outliers, while spike-and-slab approaches enforce parsimony.
  • On real data applications such as aCGH microarray and ion-channel recordings, Bayesian spike-and-slab methods yield plausible, interpretable segmentation while being less sensitive to hyperparameter tuning than frequentist benchmarks (Cappello et al., 2021).

7. Extension to Dynamic Regression and Markov Switching

In dynamic linear regression, spike-and-slab priors capture time-varying sparsity of regression coefficients:

  • Binary indicators, one per coefficient and time point and generated from two-state Markov chains, allow regime switching in the shrinkage variance of coefficients, enabling block-structured or persistent sparsity patterns.
  • Full conditional updates and forward-backward sampling algorithms (e.g., Gerlach–Carter–Kohn) permit efficient posterior computation and flexible changepoint analysis across multiple predictors and time (Uribe et al., 2020).
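
To illustrate how a two-state Markov chain produces persistent sparsity patterns, the following sketch simulates one indicator sequence and lists the times at which the regime switches. The transition probabilities `p01` (spike to slab) and `p10` (slab to spike) and both function names are hypothetical; the sketch covers only the prior on the indicators, not the posterior sampler of Uribe et al. (2020).

```python
import random

def simulate_markov_indicators(T, p01, p10, start=0, seed=0):
    """Simulate a two-state chain: 0 = spike regime, 1 = slab regime."""
    rng = random.Random(seed)
    gamma = [start]
    for _ in range(T - 1):
        prev = gamma[-1]
        flip = p01 if prev == 0 else p10   # probability of leaving the current state
        gamma.append(1 - prev if rng.random() < flip else prev)
    return gamma

def regime_changes(gamma):
    """Time indices at which the indicator differs from its predecessor."""
    return [t for t in range(1, len(gamma)) if gamma[t] != gamma[t - 1]]

gamma = simulate_markov_indicators(T=200, p01=0.02, p10=0.05)
switch_times = regime_changes(gamma)
```

Small switching probabilities yield long runs in the same regime, so a coefficient stays shrunk or active over extended stretches rather than flickering at every time step.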

This formalism extends changepoint concepts to high-dimensional, temporally structured variable selection, systematically connecting classical changepoint analysis, modern Bayesian regression, and dynamic sparsity modeling.


For comprehensive methodological exposition, theoretical guarantees, and empirical details, see (Cappello et al., 2021) and (Uribe et al., 2020).
