Markov-Switching Mixed-Effects Hurdle Models
- Markov-switching mixed-effects hurdle models are a unified framework that separates latent disease-on/off regimes from structural zeros in overdispersed count data.
- The model integrates logistic regression for state transitions with autoregressive and endemic seasonal components to account for spatial and temporal heterogeneity.
- Bayesian MCMC and EM algorithm implementations are both available, and the Markov-switching variants improve out-of-sample prediction over non-switching hurdle and zero-inflated benchmarks.
Markov-switching mixed-effects hurdle models constitute a unified statistical framework designed to jointly model state-switching dynamics, overdispersed spatio-temporal count data, and the structural zero-generating process. These models are especially suited to epidemiological contexts, where disease presence and absence, subject to spatial and temporal covariates and overdispersion, must be distinctly characterized. The Markov-switching negative binomial hurdle model with mixed effects (“ZS-MSNBH model”) provides an explicit representation of (i) latent disease-on/off regimes, (ii) a perfect-detection hurdle at zero, (iii) areal random effects, and (iv) autoregressive and endemic (seasonal and spatial) structure (Xu et al., 2023).
1. Model Structure and Key Components
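The model posits a latent two-state Markov chain $S_{it} \in \{0, 1\}$ for each spatial unit (or time series) $i$, with transitions governed by covariate-dependent transition probabilities. At each time point $t$, the observed count $y_{it}$ is:
$$ y_{it} \mid S_{it} \sim \begin{cases} \delta_0, & S_{it} = 0 \\ \mathrm{ZTNB}(\mu_{it}, r_{it}), & S_{it} = 1 \end{cases} $$
where $\delta_0$ is a point mass at zero and $\mathrm{ZTNB}$ denotes the zero-truncated negative binomial distribution with mean parameter $\mu_{it}$ and overdispersion parameter $r_{it}$. Thus, zeros arise strictly from the latent absence state ($S_{it} = 0$), and positive counts (with all zeros strictly excluded) arise from the presence state ($S_{it} = 1$), enforcing a “hurdle” at zero.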
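The hurdle observation model can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the negative binomial is parameterized by a mean `mu` and a size (overdispersion) `r`, and the function names are hypothetical.

```python
import math

def nb_logpmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and size r (assumed parameterization)."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def hurdle_logpmf(y, state, mu, r):
    """Log probability of count y given the latent state (0 = absence, 1 = presence)."""
    if state == 0:
        # Absence state: point mass at zero.
        return 0.0 if y == 0 else -math.inf
    if y == 0:
        # Presence state: zeros are excluded by the truncation.
        return -math.inf
    log_p0 = r * math.log(r / (r + mu))  # log P(Y = 0) under the untruncated NB
    return nb_logpmf(y, mu, r) - math.log1p(-math.exp(log_p0))

# Sanity check: the zero-truncated pmf sums to one over y = 1, 2, ...
total = sum(math.exp(hurdle_logpmf(y, 1, mu=2.5, r=1.5)) for y in range(1, 2000))
```

Dividing by one minus the untruncated zero probability is what renormalizes the presence-state distribution onto the positive integers.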
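Transition probabilities are specified through logistic regressions incorporating both covariates and spatial lag terms:
$$ \operatorname{logit}\big(p^{01}_{it}\big) = \mathbf{z}_{it}^{\top} \boldsymbol{\zeta}^{01} + \phi^{01} \ell_{it}, \qquad \operatorname{logit}\big(p^{11}_{it}\big) = \mathbf{z}_{it}^{\top} \boldsymbol{\zeta}^{11} + \phi^{11} \ell_{it} $$
with $p^{01}_{it}$ and $p^{11}_{it}$ encoding the probability of reemergence (absence to presence) and persistence (presence to presence), respectively, $\mathbf{z}_{it}$ a covariate vector, and $\ell_{it}$ a spatial lag term built from neighboring units. Covariates may include population statistics, environmental measures, or socioeconomic indicators.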
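As a rough illustration of the transition step, the sketch below maps linear predictors to reemergence and persistence probabilities through the inverse logit. The coefficient values, the argument names, and the choice of spatial lag (here, the share of neighboring areas with disease present at the previous time point) are all assumptions made for the example, not values from the source.

```python
import math

def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))

def transition_probs(z, coef_reemerge, coef_persist, lag, lag_reemerge, lag_persist):
    """Return (p01, p11): reemergence and persistence probabilities.

    z             -- covariate values, including a leading 1.0 for the intercept
    coef_*        -- regression coefficients, same length as z
    lag           -- spatial lag summary of neighboring units (hypothetical definition)
    lag_*         -- coefficients on the spatial lag
    """
    eta01 = sum(c * x for c, x in zip(coef_reemerge, z)) + lag_reemerge * lag
    eta11 = sum(c * x for c, x in zip(coef_persist, z)) + lag_persist * lag
    return inv_logit(eta01), inv_logit(eta11)

z = [1.0, 0.5]  # intercept and one standardized covariate
isolated = transition_probs(z, [-2.0, 0.4], [1.0, 0.2], 0.0, 1.5, 0.8)
surrounded = transition_probs(z, [-2.0, 0.4], [1.0, 0.2], 1.0, 1.5, 0.8)
```

With positive lag coefficients, an area surrounded by affected neighbors has a higher reemergence and persistence probability than an isolated one.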
2. Hierarchical Regression: Autoregressive, Endemic, and Random Effects
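The presence-state ($S_{it} = 1$) count distribution relies on a structured mean $\mu_{it}$, the sum of an autoregressive term and an endemic term, reflecting both short-term memory and long-term (seasonal/spatial) patterns:
- Autoregressive term: $\lambda^{AR}_{it}\, y_{i,t-1}$, with $\log \lambda^{AR}_{it}$ linear in covariates and area-level random effects and $y_{i,t-1}$ the previous count, enables area-specific short-term dynamics and covariate effects.
- Endemic term: $\lambda^{EN}_{it}$, with $\log \lambda^{EN}_{it}$ combining a population-linked baseline and cyclical seasonal terms, accounts for baseline differences (linked to population) and cyclical seasonality.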
Random effects encode unobserved heterogeneity across spatial units.
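A small numerical sketch may help fix ideas about the additive mean. It assumes a log link for both components, a log-population offset in the endemic part, a single sine/cosine harmonic for seasonality, and area-level random intercepts; these specifics and all parameter names are illustrative assumptions rather than the published specification.

```python
import math

def presence_mean(y_prev, population, t, ar, en, b_ar=0.0, b_en=0.0, period=12):
    """Mean of the presence-state count: autoregressive part plus endemic part.

    ar, en     -- dicts of fixed-effect coefficients (hypothetical names)
    b_ar, b_en -- area-level random intercepts
    period     -- seasonal period in time steps (12 for monthly data)
    """
    # Autoregressive component: scales the previous count.
    lam_ar = math.exp(ar["intercept"] + b_ar)
    # Endemic component: population offset plus one seasonal harmonic.
    angle = 2.0 * math.pi * t / period
    lam_en = math.exp(math.log(population) + en["intercept"] + b_en
                      + en["sin"] * math.sin(angle) + en["cos"] * math.cos(angle))
    return lam_ar * y_prev + lam_en

ar = {"intercept": -0.7}
en = {"intercept": -9.0, "sin": 0.3, "cos": -0.2}
mu_now = presence_mean(y_prev=4, population=50_000, t=3, ar=ar, en=en, b_ar=0.1)
mu_endemic_only = presence_mean(y_prev=0, population=50_000, t=3, ar=ar, en=en)
```

When the previous count is zero the autoregressive part vanishes, so the endemic part alone keeps the mean strictly positive.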
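Furthermore, the overdispersion parameter $r_{it}$ may itself be modeled as a log-linear function of covariates and lagged counts, $\log r_{it} = \mathbf{w}_{it}^{\top} \boldsymbol{\delta}$, where $\mathbf{w}_{it}$ collects those predictors.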
This flexibility enables robust modeling of heteroskedasticity in the data.
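To see how a log-linear overdispersion model changes the spread of the counts, consider the sketch below. It assumes the size parameterization in which the untruncated negative binomial variance is `mu + mu**2 / r`, and it uses `log(1 + previous count)` as the lagged-count predictor; both choices, and the coefficient values, are assumptions for illustration.

```python
import math

def dispersion(w, delta):
    """Overdispersion r = exp(w . delta): the log link keeps r positive."""
    return math.exp(sum(wi * di for wi, di in zip(w, delta)))

def nb_variance(mu, r):
    """Variance of the untruncated negative binomial with mean mu and size r."""
    return mu + mu * mu / r

# Predictors: intercept, one standardized covariate, log(1 + previous count).
w = [1.0, 0.3, math.log(1 + 4)]
delta = [0.5, -0.2, 0.4]  # hypothetical coefficients
r = dispersion(w, delta)
var = nb_variance(3.0, r)
```

Smaller values of `r` inflate the variance relative to the mean, while very large values recover Poisson-like behavior.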
3. Likelihood, Bayesian Specification, and Priors
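The joint model likelihood combines (a) the Markov-switching latent process and (b) the hurdle count component. Conditional on the random effects, and writing $S_{it}$ for the latent state and $y_{it}$ for the count of unit $i$ at time $t$, the complete-data likelihood factorizes as:
$$ p(\mathbf{y}, \mathbf{S} \mid \boldsymbol{\theta}) = \prod_{i} \Big[ P(S_{i1}) \prod_{t=2}^{T} P(S_{it} \mid S_{i,t-1}) \prod_{t=1}^{T} f(y_{it} \mid S_{it}) \Big] $$
- $P(S_{it} \mid S_{i,t-1})$ is given by the logistic transition probabilities, as above.
- $f(y_{it} \mid S_{it})$ is degenerate at zero for $S_{it} = 0$, and zero-truncated negative binomial for $S_{it} = 1$.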
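The marginal likelihood of one area's series can be evaluated with a scaled forward recursion over the two latent states. The sketch below is a simplified illustration: it holds the transition probabilities, mean, and overdispersion constant over time (the model described here lets them vary), treats `None` as a missing count, and uses hypothetical function names.

```python
import math

def nb_logpmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and size r."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def emission_prob(y, state, mu, r):
    """P(y | state) under the hurdle model; a missing count (None) contributes 1."""
    if y is None:
        return 1.0
    if state == 0:
        return 1.0 if y == 0 else 0.0
    if y == 0:
        return 0.0
    p0 = (r / (r + mu)) ** r
    return math.exp(nb_logpmf(y, mu, r)) / (1.0 - p0)

def forward_loglik(ys, p01, p11, mu, r, init=(0.5, 0.5)):
    """Log-likelihood of a count series, summing over the latent state path."""
    P = [[1.0 - p01, p01], [1.0 - p11, p11]]  # rows: previous state
    alpha, loglik = None, 0.0
    for t, y in enumerate(ys):
        if t == 0:
            alpha = [init[s] * emission_prob(y, s, mu, r) for s in (0, 1)]
        else:
            alpha = [sum(alpha[k] * P[k][s] for k in (0, 1))
                     * emission_prob(y, s, mu, r) for s in (0, 1)]
        scale = sum(alpha)            # rescale to avoid numerical underflow
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

ll_observed = forward_loglik([0, 0, 3, 1, 0], p01=0.2, p11=0.7, mu=2.5, r=1.5)
ll_with_gap = forward_loglik([0, None, 2], p01=0.2, p11=0.7, mu=2.5, r=1.5)
```

Because the hurdle makes each observed count identify its state (zero means absence, positive means presence), the sum over state paths only has more than one nonzero term where counts are missing.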
Priors are typically weakly informative Gaussian or uniform distributions.
An uninformative prior is placed on the initial latent state.
4. Estimation: Bayesian MCMC and Maximum Likelihood
The model admits both Bayesian and frequentist implementations.
- Bayesian inference: Posterior samples are obtained via Gibbs sampling, with Metropolis–Hastings updates for the logistic and overdispersion parameters, and sequential updates for (1) hidden states (forward–backward algorithms), (2) random effects, and (3) regression parameters. Convergence is assessed using the Gelman–Rubin statistic ($\hat{R}$) and effective sample sizes.
- Maximum likelihood (EM algorithm): The latent switching sequence is treated as missing data. The E-step uses forward–backward recursion to compute the posterior state probabilities given the observed counts, and the M-step maximizes the expected complete-data log-likelihood. The negative binomial GLM routines and logistic transition regression updates are executed iteratively (Xu et al., 2023).
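The forward–backward recursion used for the hidden-state updates can be sketched as follows. This is an illustrative simplification with time-constant transition probabilities and count parameters, `None` marking a missing count, and hypothetical function names; it returns the smoothed probability of the presence state at each time point.

```python
import math

def emission_prob(y, state, mu, r):
    """P(y | state) under the hurdle model; a missing count (None) contributes 1."""
    if y is None:
        return 1.0
    if state == 0:
        return 1.0 if y == 0 else 0.0
    if y == 0:
        return 0.0
    log_nb = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
              + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_nb) / (1.0 - (r / (r + mu)) ** r)

def smooth_presence(ys, p01, p11, mu, r, init=(0.5, 0.5)):
    """Smoothed P(state = presence | all counts) for each time point."""
    P = [[1.0 - p01, p01], [1.0 - p11, p11]]
    T = len(ys)
    e = [[emission_prob(y, s, mu, r) for s in (0, 1)] for y in ys]
    # Forward pass (normalized at each step).
    fwd = []
    for t in range(T):
        if t == 0:
            a = [init[s] * e[0][s] for s in (0, 1)]
        else:
            a = [sum(fwd[-1][k] * P[k][s] for k in (0, 1)) * e[t][s] for s in (0, 1)]
        total = sum(a)
        fwd.append([x / total for x in a])
    # Backward pass (normalized at each step).
    bwd = [[1.0, 1.0] for _ in range(T)]
    for t in range(T - 2, -1, -1):
        b = [sum(P[s][k] * e[t + 1][k] * bwd[t + 1][k] for k in (0, 1)) for s in (0, 1)]
        total = sum(b)
        bwd[t] = [x / total for x in b]
    # Combine: posterior is proportional to forward times backward.
    out = []
    for t in range(T):
        w = [fwd[t][s] * bwd[t][s] for s in (0, 1)]
        out.append(w[1] / sum(w))
    return out

probs = smooth_presence([0, None, 2], p01=0.2, p11=0.7, mu=2.5, r=1.5)
```

Under the perfect-detection hurdle, the smoothed probabilities collapse to 0 or 1 wherever a count is observed; the recursion carries real uncertainty only for missing time points, such as the middle entry in this example.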
Implementations are straightforward in probabilistic programming frameworks such as NIMBLE or JAGS.
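Whatever framework generates the posterior draws, the Gelman–Rubin diagnostic mentioned above can be computed directly from the chains. The sketch below implements the basic (non-split) version of the statistic for a single scalar parameter, using simulated draws in place of real MCMC output; it is an illustration rather than a replacement for the diagnostics shipped with MCMC software.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])   # W
    between = n * statistics.variance(chain_means)                        # B
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(0)
# Four simulated chains that all sample the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Same chains, but the last one is stuck in a different region.
stuck = mixed[:3] + [[x + 5.0 for x in mixed[3]]]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values close to 1 indicate that the between-chain and within-chain variability agree, whereas a chain stuck in a different region pushes the statistic well above 1.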
5. Practical Application and Implementation Guidelines
Key recommendations for application include:
- Centering and scaling continuous covariates to improve MCMC mixing.
- Inclusion/testing of spatial lag terms to accommodate spatial autocorrelation.
- Monitoring persistence probabilities (posterior mean of $p^{11}_{it}$) to guard against excessive regime switching.
- Evaluating predictive adequacy via out-of-sample prediction, WAIC, or ranked probability scoring.
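Of the evaluation tools listed, the ranked probability score is simple to compute by hand. The sketch below assumes the predictive distribution is supplied as a probability vector over counts `0..K`, with `K` large enough that the truncated tail mass is negligible and the observed count does not exceed `K`; the function name is hypothetical.

```python
def ranked_probability_score(pmf, y):
    """Ranked probability score of a predictive pmf over counts 0..K for observed count y.

    Sums the squared difference between the predictive CDF and the step
    function that jumps from 0 to 1 at the observed count. Lower is better.
    """
    score, cdf = 0.0, 0.0
    for k, p in enumerate(pmf):
        cdf += p
        indicator = 1.0 if y <= k else 0.0
        score += (cdf - indicator) ** 2
    return score

sharp = [0.0, 0.0, 1.0, 0.0]        # all mass on the observed count
diffuse = [0.25, 0.25, 0.25, 0.25]  # mass spread evenly over 0..3
rps_sharp = ranked_probability_score(sharp, y=2)
rps_diffuse = ranked_probability_score(diffuse, y=2)
```

Averaging this score over held-out area-time pairs gives one out-of-sample comparison between candidate models; a forecast concentrated on the observed count scores zero.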
A hurdle at zero (in contrast to zero-inflated models) is appropriate when perfect detection can be assumed, as in diseases with high reporting rates or analyses focusing on true extinction/persistence.
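The practical difference between the two assumptions shows up in how an observed zero is interpreted. The following sketch compares, for a single observation, the probability that the disease is present given a zero count under a hurdle and under a zero-inflated negative binomial; the mixing weight and count parameters are arbitrary illustrative values.

```python
def nb_zero_prob(mu, r):
    """P(Y = 0) for an untruncated negative binomial with mean mu and size r."""
    return (r / (r + mu)) ** r

def prob_present_given_zero(p_absent, mu, r, model):
    """P(presence state | observed count is zero) under each zero model."""
    if model == "hurdle":
        # Presence never produces a zero, so a zero implies absence.
        return 0.0
    if model == "zero_inflated":
        # Presence can still produce a zero (e.g. undetected cases).
        zero_from_presence = (1.0 - p_absent) * nb_zero_prob(mu, r)
        return zero_from_presence / (p_absent + zero_from_presence)
    raise ValueError(f"unknown model: {model}")

hurdle = prob_present_given_zero(0.6, mu=2.5, r=1.5, model="hurdle")
zero_inflated = prob_present_given_zero(0.6, mu=2.5, r=1.5, model="zero_inflated")
```

If a nonzero chance of presence behind an observed zero is plausible for the disease in question, the perfect-detection assumption is doubtful and a zero-inflated specification may be the better fit.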
6. Comparison and Theoretical Motivation
The Markov-switching negative binomial hurdle model distinctly separates the mechanisms generating observed zeros (structural/no-disease) from those governing positive counts, further differentiating between persistence and reemergence transitions—each modulated by its own set of covariates and lag structure. This structure allows for (i) explicit modeling of “true absence” periods versus “disease-on” periods, and (ii) the accommodation of spatial, temporal, and areal heterogeneity in both disease presence and case counts.
Compared to zero-inflated alternatives, hurdle models enforce perfect detection in the “on” state, while zero-inflated specifications allow for under-reporting. Analysis of fit on real-world data indicates that both classes of Markov-switching models provide substantial improvements in out-of-sample prediction over non-switching hurdle and zero-inflated benchmarks; zero-inflated Markov-switching models may exhibit superior predictive performance in some cases (Xu et al., 2023).
7. Synthesis and Domain Relevance
Markov-switching mixed-effects hurdle models offer a comprehensive stochastic framework unifying regime-switching, perfect-detection structural zeros, multi-level random effects, autoregressive memory, and endemic-seasonal patterns for count data. Their utility in infectious disease epidemiology derives from their capacity to (a) model extinction, (b) capture heterogeneous reemergence/persistence dynamics, and (c) flexibly incorporate spatial and temporal covariate effects. This approach is particularly advantageous for high-resolution spatio-temporal surveillance and intervention evaluation in settings with strongly patterned incidence and extinction dynamics (Xu et al., 2023).