
Learned harmonic mean estimation of the marginal likelihood with normalizing flows (2307.00048v3)

Published 30 Jun 2023 in stat.ME, astro-ph.IM, and stat.ML

Abstract: Computing the marginal likelihood (also called the Bayesian model evidence) is an important task in Bayesian model selection, providing a principled quantitative way to compare models. The learned harmonic mean estimator solves the exploding variance problem of the original harmonic mean estimation of the marginal likelihood. The learned harmonic mean estimator learns an importance sampling target distribution that approximates the optimal distribution. While the approximation need not be highly accurate, it is critical that the probability mass of the learned distribution is contained within the posterior in order to avoid the exploding variance problem. In previous work a bespoke optimization problem is introduced when training models in order to ensure this property is satisfied. In the current article we introduce the use of normalizing flows to represent the importance sampling target distribution. A flow-based model is trained on samples from the posterior by maximum likelihood estimation. Then, the probability density of the flow is concentrated by lowering the variance of the base distribution, i.e. by lowering its "temperature", ensuring its probability mass is contained within the posterior. This approach avoids the need for a bespoke optimization problem and careful fine-tuning of parameters, resulting in a more robust method. Moreover, the use of normalizing flows has the potential to scale to high-dimensional settings. We present preliminary experiments demonstrating the effectiveness of the use of flows for the learned harmonic mean estimator. The harmonic code implementing the learned harmonic mean, which is publicly available, has been updated to now support normalizing flows.
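
The estimator described in the abstract can be illustrated with a minimal NumPy sketch. It is not the harmonic package's API. As an assumption made for brevity, a diagonal Gaussian fitted to posterior samples stands in for the trained normalizing flow, and multiplying its variance by a hypothetical `temperature` parameter below 1 plays the role of lowering the variance of the base distribution. The toy model (unnormalized Gaussian likelihood, Gaussian prior) is also an assumption, chosen because its evidence is known in closed form. The function and variable names are illustrative only.

```python
import numpy as np

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian, evaluated row-wise on x."""
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var), axis=-1)

def learned_harmonic_mean_log_evidence(samples, log_like, log_prior, temperature=0.8):
    """Estimate log z from posterior samples via 1/z ~ mean(phi / (L * pi)).

    A Gaussian fit stands in for the flow (assumption); its variance is
    scaled by `temperature` < 1 so that the target phi is narrower than
    the posterior.
    """
    half = len(samples) // 2
    train, held_out = samples[:half], samples[half:]

    # "Train" the target on one half of the samples, then concentrate it.
    mean = train.mean(axis=0)
    var = train.var(axis=0) * temperature

    # Evaluate log(phi / (L * pi)) on the other half of the samples.
    log_ratio = log_gaussian(held_out, mean, var) - log_like(held_out) - log_prior(held_out)

    # Numerically stable log of the sample mean of exp(log_ratio).
    shift = log_ratio.max()
    log_inv_z = shift + np.log(np.mean(np.exp(log_ratio - shift)))
    return -log_inv_z

# Toy problem (assumption): L(theta) = exp(-|theta|^2 / 2), prior N(0, s^2 I).
# Then z = (1 + s^2)^(-d/2) and the posterior is N(0, s^2 / (1 + s^2) I).
dim, prior_var = 2, 4.0
rng = np.random.default_rng(0)
post_var = prior_var / (1.0 + prior_var)
samples = rng.normal(0.0, np.sqrt(post_var), size=(20000, dim))

log_like = lambda x: -0.5 * np.sum(x ** 2, axis=-1)
log_prior = lambda x: log_gaussian(x, 0.0, prior_var)

log_z_est = learned_harmonic_mean_log_evidence(samples, log_like, log_prior)
log_z_true = -0.5 * dim * np.log(1.0 + prior_var)
print(f"estimated log z = {log_z_est:.4f}, analytic log z = {log_z_true:.4f}")
```

Because the concentrated target is narrower than the posterior in this toy setting, the ratio phi / (L * pi) stays bounded and the estimate has finite variance. Setting `temperature` well above 1 gives the target heavier tails than the posterior, which is the regime in which the original harmonic mean estimator's variance explodes.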
