Manifold Gaussian Variational Bayes on the Precision Matrix

Published 26 Oct 2022 in stat.ML and cs.LG (arXiv:2210.14598v5)

Abstract: We propose an optimization algorithm for Variational Inference (VI) in complex models. Our approach relies on natural gradient updates where the variational space is a Riemannian manifold. We develop an efficient algorithm for Gaussian Variational Inference whose updates satisfy the positive-definite constraint on the variational covariance matrix. Our Manifold Gaussian Variational Bayes on the Precision matrix (MGVBP) solution provides simple update rules, is straightforward to implement, and its precision-matrix parametrization offers a significant computational advantage. Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models. Over five datasets, we empirically validate our approach on different statistical and econometric models, discussing its performance with respect to baseline methods.
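The flavor of the approach — updating the variational precision matrix with natural-gradient steps that preserve positive definiteness — can be illustrated with a toy sketch. Here the target posterior is deliberately chosen Gaussian, N(b, A⁻¹), so the ELBO gradients are available in closed form; the target, step size, and iteration count are illustrative assumptions, and these are not the exact MGVBP update rules from the paper.

```python
import numpy as np

# Toy natural-gradient Gaussian VI, parametrized by the precision P = Sigma^{-1}.
# For a Gaussian target N(b, A^{-1}), the closed-form ELBO gradients are
#   grad_mu    L = -A (mu - b)
#   grad_Sigma L = (P - A) / 2
# The precision step P <- (1 - rho) P + rho A is a convex combination of
# positive-definite matrices, so P stays positive definite for rho in (0, 1].
# NOTE: the target, rho, and iteration count are illustrative choices only.

A = np.array([[2.0, 0.5], [0.5, 1.0]])   # target precision (assumed known here)
b = np.array([1.0, -1.0])                # target mean

mu = np.zeros(2)                         # variational mean
P = np.eye(2)                            # variational precision

rho = 0.2                                # step size
for _ in range(300):
    P = (1.0 - rho) * P + rho * A                     # precision update, stays PD
    mu = mu + rho * np.linalg.solve(P, A @ (b - mu))  # natural-gradient mean step

print(np.round(mu, 4))   # approaches b
print(np.round(P, 4))    # approaches A
```

In this conjugate toy case the iterates converge to the exact posterior (mu → b, P → A); the point of the parametrization is that the update acts directly on the precision and never needs a projection back onto the positive-definite cone.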
