
A review of Monte Carlo-based versions of the EM algorithm (2401.00945v1)

Published 1 Jan 2024 in stat.CO and stat.ME

Abstract: The EM algorithm is a powerful tool for maximum likelihood estimation with missing data. In practice, the calculations required for the EM algorithm are often intractable. We review numerous methods to circumvent this intractability, all of which are based on Monte Carlo simulation. We focus our attention on the Monte Carlo EM (MCEM) algorithm and its various implementations. We also discuss related methods such as stochastic approximation and Monte Carlo maximum likelihood. Generating the Monte Carlo samples necessary for these methods is, in general, a hard problem, so we review several simulation strategies that can be used to address this challenge. Given the wide range of methods available for approximating the EM algorithm, it can be challenging to select which one to use. We review numerous comparisons between these methods from a wide range of sources, and offer guidance on synthesizing the findings. Finally, we give some directions for future research to fill important gaps in the existing literature on the MCEM algorithm and related methods.
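The core idea the abstract surveys — replacing an intractable E-step expectation with an average over simulated draws of the missing data — can be illustrated on a toy right-censored Gaussian model. This sketch is illustrative only (the function name, model, and rejection-sampling scheme are our assumptions, not taken from the paper):

```python
import numpy as np

def mcem_censored_normal(y_obs, c, n_cens, n_iter=50, m=2000, seed=0):
    """Toy MCEM: estimate mu for y ~ N(mu, 1) when n_cens observations
    are right-censored at c (for them we only know y > c).

    The E-step quantity E[y | y > c, mu] is approximated by a Monte
    Carlo average over draws from the truncated normal, obtained here
    by simple rejection sampling; the M-step update for the Gaussian
    mean is then available in closed form.
    """
    rng = np.random.default_rng(seed)
    mu = np.mean(y_obs)  # crude start from the fully observed values
    n = len(y_obs) + n_cens
    for _ in range(n_iter):
        # Monte Carlo E-step: draw from N(mu, 1) truncated to (c, inf)
        draws = rng.normal(mu, 1.0, size=20 * m)
        draws = draws[draws > c][:m]
        e_cens = draws.mean()  # MC estimate of E[y | y > c, mu]
        # M-step: average of observed values and imputed expectations
        mu = (y_obs.sum() + n_cens * e_cens) / n
    return mu
```

In a practical implementation one would grow the Monte Carlo sample size m across iterations (or monitor the ascent condition), since a fixed m leaves residual simulation noise in the final estimate — precisely the design choices the review compares across MCEM variants.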
