EigenVI: score-based variational inference with orthogonal function expansions (2410.24054v1)

Published 31 Oct 2024 in stat.ML, cs.LG, and stat.CO

Abstract: We develop EigenVI, an eigenvalue-based approach for black-box variational inference (BBVI). EigenVI constructs its variational approximations from orthogonal function expansions. For distributions over $\mathbb{R}^D$, the lowest order term in these expansions provides a Gaussian variational approximation, while higher-order terms provide a systematic way to model non-Gaussianity. These approximations are flexible enough to model complex distributions (multimodal, asymmetric), but they are simple enough that one can calculate their low-order moments and draw samples from them. EigenVI can also model other types of random variables (e.g., nonnegative, bounded) by constructing variational approximations from different families of orthogonal functions. Within these families, EigenVI computes the variational approximation that best matches the score function of the target distribution by minimizing a stochastic estimate of the Fisher divergence. Notably, this optimization reduces to solving a minimum eigenvalue problem, so that EigenVI effectively sidesteps the iterative gradient-based optimizations that are required for many other BBVI algorithms. (Gradient-based methods can be sensitive to learning rates, termination criteria, and other tunable hyperparameters.) We use EigenVI to approximate a variety of target distributions, including a benchmark suite of Bayesian models from posteriordb. On these distributions, we find that EigenVI is more accurate than existing methods for Gaussian BBVI.


Summary

  • The paper introduces EigenVI, a BBVI method that replaces iterative gradient-based optimization with the solution of a minimum eigenvalue problem derived from score matching.
  • It approximates non-Gaussian, multimodal distributions systematically by building variational families from orthogonal function expansions, such as weighted Hermite polynomials.
  • Experiments on synthetic targets and posteriordb benchmarks show that EigenVI is more accurate than standard Gaussian BBVI methods while avoiding learning-rate and termination tuning.

EigenVI: Score-based Variational Inference with Orthogonal Function Expansions

The paper presents EigenVI, a new approach to black-box variational inference (BBVI) that combines eigenvalue-based score matching with orthogonal function expansions. The approach is tailored for approximating target distributions that exhibit substantial non-Gaussian behavior: by building variational families from orthogonal function bases, EigenVI provides a systematic framework for modeling complex distributions beyond simple Gaussians.

Background and Motivation

Variational inference (VI) has been pivotal for scalable Bayesian inference, recasting posterior approximation as an optimization over a tractable family of distributions. Despite its efficacy, traditional gradient-based VI faces challenges in tuning hyperparameters such as learning rates and termination criteria, which can complicate the optimization process. EigenVI addresses these issues by combining orthogonal function expansions with Fisher divergence minimization, avoiding iterative gradient descent altogether.

Methodology

EigenVI builds its variational family from orthogonal basis functions: the lowest-order term in the expansion recovers a Gaussian approximation, and higher-order terms systematically add non-Gaussian features. Different families of orthogonal functions, such as weighted Hermite polynomials for distributions on the real line, accommodate different types of support and distributional characteristics. A sketch of this construction appears below.
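
To make the construction concrete, the following is a minimal one-dimensional sketch in Python, assuming the squared-expansion parameterization described in the abstract: the density is the square of a linear combination of orthonormal (weighted Hermite) basis functions, so it is nonnegative and integrates to one whenever the coefficient vector has unit norm. The function names here are illustrative, not taken from the paper's code.

    import numpy as np
    from scipy.special import eval_hermite, factorial

    def hermite_fn(k, x):
        # Orthonormal Hermite function: psi_k(x) = H_k(x) exp(-x^2/2) / sqrt(2^k k! sqrt(pi)).
        # These are orthonormal on the real line: integral of psi_j * psi_k dx = delta_jk.
        c = np.sqrt(2.0**k * factorial(k) * np.sqrt(np.pi))
        return eval_hermite(k, x) * np.exp(-x**2 / 2.0) / c

    def q_density(alpha, x):
        # q(x) = (sum_k alpha_k psi_k(x))^2 integrates to ||alpha||^2 by orthonormality,
        # so q is a normalized density whenever ||alpha|| = 1.
        alpha = np.asarray(alpha)
        basis = np.stack([hermite_fn(k, x) for k in range(len(alpha))])
        return (alpha @ basis) ** 2

With a single coefficient this recovers the standard Gaussian base density; adding higher-order terms lets the family express skewness and multimodality while low-order moments and sampling remain tractable, as the abstract notes.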

Rather than running stochastic gradient descent, EigenVI minimizes a score-based Fisher divergence between the variational approximation and the target. Because the divergence is estimated via importance sampling and is quadratic in the expansion coefficients, the optimal coefficients are given by a minimum eigenvalue problem. This is a notable shift: it sidesteps the learning-rate sensitivity and convergence monitoring typical of gradient-based methods, and the eigenproblem can be solved with standard numerical routines, as sketched below.
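
Continuing the one-dimensional sketch above (and reusing hermite_fn), the fit below is a schematic reconstruction of this step under stated assumptions: samples come from a standard-normal proposal, the Fisher divergence estimate is assembled as a quadratic form in the coefficients, and the minimizing coefficients are read off from a symmetric eigendecomposition. The exact proposal and weighting choices in the paper may differ.

    import numpy as np
    from scipy.linalg import eigh
    from scipy.stats import norm

    def hermite_fn_deriv(k, x):
        # Standard recurrence for Hermite functions:
        # psi_k'(x) = sqrt(k/2) psi_{k-1}(x) - sqrt((k+1)/2) psi_{k+1}(x).
        lower = np.sqrt(k / 2.0) * hermite_fn(k - 1, x) if k > 0 else 0.0
        return lower - np.sqrt((k + 1) / 2.0) * hermite_fn(k + 1, x)

    def fit_eigenvi_1d(score_fn, K, n_samples=5000, seed=0):
        # Draw importance samples from a standard-normal proposal pi.
        rng = np.random.default_rng(seed)
        x = rng.standard_normal(n_samples)
        w = 1.0 / norm.pdf(x)                  # importance weights 1/pi(x_n)
        s = score_fn(x)                        # target score: d/dx log p(x)
        # For q = (sum_k alpha_k psi_k)^2, the Fisher divergence equals
        # alpha^T M alpha with M_jk = integral of u_j(x) u_k(x) dx,
        # where u_k(x) = 2 psi_k'(x) - s(x) psi_k(x).
        U = np.stack([2.0 * hermite_fn_deriv(k, x) - s * hermite_fn(k, x)
                      for k in range(K)])      # shape (K, n_samples)
        M = (U * w) @ U.T / n_samples          # Monte Carlo estimate of M
        eigvals, eigvecs = eigh(M)             # eigenvalues in ascending order
        return eigvecs[:, 0]                   # coefficients with smallest eigenvalue

For example, fitting a unit-variance Gaussian target centered at 2 amounts to alpha = fit_eigenvi_1d(lambda x: -(x - 2.0), K=8), after which q_density(alpha, x) evaluates the fitted approximation. As with any importance-sampling scheme, the quality of the estimate depends on how well the proposal covers the target.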

Results and Computational Experiments

EigenVI demonstrates superior performance over standard Gaussian BBVI methods across several examples, including multimodal, asymmetric, and heavy-tailed targets. The experiments cover synthetic distributions and benchmarks from the posteriordb suite of Bayesian models, where EigenVI delivers more accurate posterior approximations than leading Gaussian BBVI baselines.

The numerical results also highlight EigenVI's robustness in approximating skewed and heavy-tailed targets, where it significantly outperforms traditional Gaussian methods. Careful comparisons against ground truth and competing inference methods on these complex distributions underscore its practical value.

Implications and Future Directions

EigenVI positions itself as a valuable alternative in the BBVI toolkit. Because its optimization requires no learning rates or termination criteria, it is easier to use in practice than many gradient-based methods. Future research may investigate its behavior in higher-dimensional spaces and the development of adaptive importance-sampling schemes within the EigenVI framework.

Furthermore, there is potential to explore different families of orthogonal expansions and to extend the approach to other types of models and data structures. Analyzing the theoretical properties of the eigenvalue minimization under different conditions, and decomposing the large matrices involved for efficient parallel computation, are further avenues for development.

In summary, EigenVI's novel use of orthogonal function expansions and score-based optimization via eigenvalue problems yields a compelling approach for approximating intricate distributions in variational inference, marking it as a promising advancement in the field.
