Pochhammer Priors for Sparse Count Models
Abstract: Bayesian hierarchical models are commonly used for inference on count data because they account for multiple levels of variation by placing prior distributions on parameters at each level. Examples include the Beta-Binomial, Negative-Binomial (NB), and Dirichlet-Multinomial (DM) distributions. In this paper, we address two challenges that arise across Bayesian count models: inference for the concentration parameter, which enters the likelihood through a ratio of Gamma functions, and the inability of these models to handle excess zeros and small nonzero counts. We propose a novel class of prior distributions that facilitates conjugate updating of the concentration parameter in Gamma ratios, enabling full Bayesian inference for the count distributions above. We use DM models as our running examples. Our methodology leverages fast residue computation and admits closed-form posterior moments. Additionally, we recommend a default horseshoe-type prior, which has a heavy tail and substantial mass around zero. This prior provides continuous shrinkage, making the posterior highly adaptable to sparsity or quasi-sparsity in the data. Furthermore, we offer insights and potential generalizations to other count models facing the same two challenges. We demonstrate the usefulness of our approach on both simulated examples and real-world applications. Finally, we conclude with directions for future research.
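To make the "ratio of Gamma functions" concrete, the sketch below evaluates the standard symmetric Dirichlet-Multinomial log-pmf, in which the concentration parameter appears only through rising factorials (Pochhammer symbols) of the form Gamma(a + n) / Gamma(a). This is textbook material rather than the paper's proposed prior or inference scheme; the function names `log_pochhammer` and `dm_log_pmf` and the restriction to a symmetric concentration are assumptions made for illustration.

```python
import math


def log_pochhammer(a, n):
    """Log of the rising factorial (a)_n = Gamma(a + n) / Gamma(a)."""
    return math.lgamma(a + n) - math.lgamma(a)


def dm_log_pmf(counts, alpha):
    """Log-pmf of a symmetric Dirichlet-Multinomial.

    counts: non-negative integer counts, one per category.
    alpha:  common concentration parameter per category (illustrative
            assumption; the general DM allows one alpha per category).
    """
    n = sum(counts)
    k = len(counts)
    # Multinomial coefficient n! / prod(x_j!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # The concentration parameter enters only through Gamma ratios:
    # prod_j (alpha)_{x_j} / (k * alpha)_n
    return (
        log_coef
        + sum(log_pochhammer(alpha, x) for x in counts)
        - log_pochhammer(k * alpha, n)
    )
```

Evaluating `dm_log_pmf` on a fixed count vector over a grid of `alpha` values traces out the likelihood of the concentration parameter. Under this likelihood, smaller values of `alpha` put more mass on count vectors with many zeros.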