Pigeonhole Stochastic Gradient Langevin Dynamics for Large Crossed Mixed Effects Models
Abstract: Large crossed mixed effects models with imbalanced structures and missing data pose major computational challenges for standard Bayesian posterior sampling algorithms, as the computational complexity is usually superlinear in the number of observations. We propose two efficient subset-based stochastic gradient MCMC algorithms for such crossed mixed effects models, which facilitate scalable inference on both the variance components and the regression coefficients. The first algorithm is developed for balanced designs without missing observations, where we leverage the closed-form expression of the precision matrix for the full data matrix. The second algorithm, which we call pigeonhole stochastic gradient Langevin dynamics (PSGLD), is developed for both balanced and unbalanced designs with a potentially large proportion of missing observations. At each MCMC iteration, PSGLD imputes the latent crossed random effects by running short Markov chains and then samples the variance components and regression coefficients. We provide theoretical guarantees by showing that the output distribution of the proposed algorithms converges to the target non-log-concave posterior distribution. A variety of numerical experiments on both synthetic and real data demonstrate that the proposed algorithms significantly reduce the computational cost of standard MCMC algorithms while better balancing approximation accuracy and computational efficiency.
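The abstract describes the PSGLD recipe only at a high level: short Markov chains impute the latent crossed random effects, and a subsampled Langevin step then updates the variance components and regression coefficients. The sketch below illustrates that general structure on a toy Gaussian two-factor model; the model specification, flat priors, synthetic data, minibatch scheme, and all tuning constants are illustrative assumptions, not the paper's implementation.

```python
# Minimal illustrative sketch of a PSGLD-style sampler, assuming a Gaussian
# two-factor crossed random effects model
#     y_ij = x_ij' beta + a_i + b_j + e_ij,
# with a_i ~ N(0, s2_a), b_j ~ N(0, s2_b), e_ij ~ N(0, s2_e).
# Synthetic data, priors, step size, and batch size are hypothetical choices.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic balanced crossed design (no missing cells), for illustration only.
I, J, p = 50, 40, 3
X = rng.normal(size=(I, J, p))
beta_true = np.array([1.0, -0.5, 0.25])
a_true = rng.normal(0.0, 0.7, size=I)
b_true = rng.normal(0.0, 0.4, size=J)
y = X @ beta_true + a_true[:, None] + b_true[None, :] \
    + rng.normal(0.0, 0.5, size=(I, J))

def gibbs_refresh(a, b, beta, s2_a, s2_b, s2_e, n_sweeps=2):
    """Short Gibbs chain imputing the crossed random effects given parameters."""
    mean = X @ beta
    for _ in range(n_sweeps):
        prec_a = J / s2_e + 1.0 / s2_a                       # row effects a_i | rest
        m_a = (y - mean - b[None, :]).sum(axis=1) / s2_e
        a = m_a / prec_a + rng.normal(size=I) / np.sqrt(prec_a)
        prec_b = I / s2_e + 1.0 / s2_b                       # column effects b_j | rest
        m_b = (y - mean - a[:, None]).sum(axis=0) / s2_e
        b = m_b / prec_b + rng.normal(size=J) / np.sqrt(prec_b)
    return a, b

beta = np.zeros(p)
log_s = np.zeros(3)        # log variances (s2_a, s2_b, s2_e); flat priors assumed
a, b = np.zeros(I), np.zeros(J)
N, batch, step = I * J, 400, 1e-4

for t in range(2000):
    s2_a, s2_b, s2_e = np.exp(log_s)
    a, b = gibbs_refresh(a, b, beta, s2_a, s2_b, s2_e)

    # Sample a minibatch of cells and form the residuals there.
    rows = rng.integers(0, I, size=batch)
    cols = rng.integers(0, J, size=batch)
    r = y[rows, cols] - X[rows, cols] @ beta - a[rows] - b[cols]

    # Unbiased stochastic gradients of the conditional log posterior.
    g_beta = (N / batch) * (X[rows, cols].T @ r) / s2_e
    g_la = -0.5 * I + a @ a / (2.0 * s2_a)                   # wrt log s2_a (all I effects; cheap)
    g_lb = -0.5 * J + b @ b / (2.0 * s2_b)                   # wrt log s2_b
    g_le = (N / batch) * (r @ r) / (2.0 * s2_e) - 0.5 * N    # wrt log s2_e (subsampled)

    # SGLD updates: half-step along the gradient plus injected Gaussian noise.
    beta += 0.5 * step * g_beta + np.sqrt(step) * rng.normal(size=p)
    log_s += 0.5 * step * np.array([g_la, g_lb, g_le]) + np.sqrt(step) * rng.normal(size=3)

print("posterior draw:", beta, np.exp(log_s))
```

The per-iteration cost here is driven by the minibatch size and the cheap conditional updates of the row and column effects, rather than by the full number of cells, which is the scaling behavior the abstract claims for the proposed algorithms.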