
Logistic-beta processes for dependent random probabilities with beta marginals (2402.07048v2)

Published 10 Feb 2024 in stat.ME and stat.ML

Abstract: The beta distribution serves as a canonical tool for modelling probabilities in statistics and machine learning. However, there is limited work on flexible and computationally convenient stochastic process extensions for modelling dependent random probabilities. We propose a novel stochastic process called the logistic-beta process, whose logistic transformation yields a stochastic process with common beta marginals. Logistic-beta processes can model dependence on both discrete and continuous domains, such as space or time, and have a flexible dependence structure through correlation kernels. Moreover, its normal variance-mean mixture representation leads to effective posterior inference algorithms. We illustrate the benefits through nonparametric binary regression and conditional density estimation examples, both in simulation studies and in a pregnancy outcome application.
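The construction described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes the normal variance-mean mixture takes the form eta | lambda ~ N(0.5 * lambda * (a - b) * 1, lambda * R) with a single mixing variable lambda shared across locations, and that lambda follows a Polya(a, b) law written as the infinite series sum over k >= 0 of 2 * eps_k / ((k + a)(k + b)) with eps_k iid Exp(1). The abstract does not spell out either detail, so treat both as assumptions. The function names `sample_polya` and `sample_logistic_beta_vector`, the truncation level `n_terms`, and the 2 x 2 correlation matrix are hypothetical choices made for illustration.

```python
import numpy as np


def sample_polya(a, b, size, rng, n_terms=500):
    """Approximate Polya(a, b) draws by truncating the assumed series
    sum_k 2 * eps_k / ((k + a) * (k + b)), eps_k ~ Exp(1), at n_terms."""
    k = np.arange(n_terms)
    weights = 2.0 / ((k + a) * (k + b))
    eps = rng.exponential(1.0, size=(size, n_terms))
    return eps @ weights


def sample_logistic_beta_vector(a, b, R, size, rng):
    """Draw vectors eta with eta | lam ~ N(0.5 * lam * (a - b), lam * R),
    using one shared mixing variable lam per draw (assumed construction)."""
    n = R.shape[0]
    lam = sample_polya(a, b, size, rng)
    chol = np.linalg.cholesky(R)
    z = rng.standard_normal((size, n)) @ chol.T  # rows are N(0, R)
    return 0.5 * lam[:, None] * (a - b) + np.sqrt(lam)[:, None] * z


rng = np.random.default_rng(0)
a, b = 2.0, 3.0
R = np.array([[1.0, 0.8], [0.8, 1.0]])  # illustrative correlation matrix

eta = sample_logistic_beta_vector(a, b, R, size=5000, rng=rng)
p = 1.0 / (1.0 + np.exp(-eta))  # logistic transformation

# Each column should look approximately Beta(a, b); columns are dependent.
print("sample means:", p.mean(axis=0), "target:", a / (a + b))
print("sample variances:", p.var(axis=0),
      "target:", a * b / ((a + b) ** 2 * (a + b + 1)))
print("correlation:", np.corrcoef(p.T)[0, 1])
```

Under these assumptions, each coordinate of `p` should have an approximately Beta(2, 3) marginal, while the off-diagonal entry of `R` controls how strongly the two probabilities move together. Truncating the series slightly understates lambda, so the match to the beta moments is approximate.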

