Batch and match: black-box variational inference with a score-based divergence (2402.14758v2)

Published 22 Feb 2024 in stat.ML, cs.AI, cs.LG, and stat.CO

Abstract: Most leading implementations of black-box variational inference (BBVI) are based on optimizing a stochastic evidence lower bound (ELBO). But such approaches to BBVI often converge slowly due to the high variance of their gradient estimates and their sensitivity to hyperparameters. In this work, we propose batch and match (BaM), an alternative approach to BBVI based on a score-based divergence. Notably, this score-based divergence can be optimized by a closed-form proximal update for Gaussian variational families with full covariance matrices. We analyze the convergence of BaM when the target distribution is Gaussian, and we prove that in the limit of infinite batch size the variational parameter updates converge exponentially quickly to the target mean and covariance. We also evaluate the performance of BaM on Gaussian and non-Gaussian target distributions that arise from posterior inference in hierarchical and deep generative models. In these experiments, we find that BaM typically converges in fewer (and sometimes significantly fewer) gradient evaluations than leading implementations of BBVI based on ELBO maximization.

Authors (7)
  1. Diana Cai
  2. Chirag Modi
  3. Loucas Pillaud-Vivien
  4. Charles C. Margossian
  5. Robert M. Gower
  6. David M. Blei
  7. Lawrence K. Saul

Summary

  • The paper introduces the Batch and Match (BaM) framework, which optimizes a score-based divergence and admits closed-form updates for Gaussian variational families with full covariance matrices.
  • For Gaussian targets, it proves that the variational parameters converge exponentially quickly to the target mean and covariance in the limit of infinite batch size.
  • Empirical evaluations on Gaussian and non-Gaussian targets show that BaM typically converges in fewer gradient evaluations than leading ELBO-based implementations of BBVI.

Exploring Score-based Divergence in Variational Inference: The Batch and Match Approach

Introduction to Score-based Variational Inference

Variational inference (VI) is a widely used method for approximating complex probabilistic models, especially for posterior inference in Bayesian frameworks. However, traditional VI approaches, particularly those that optimize the Kullback-Leibler (KL) divergence via a stochastic evidence lower bound (ELBO), often converge slowly because of the high variance of their gradient estimates and their sensitivity to hyperparameters. To address these limitations, the paper introduces "Batch and Match" (BaM), a framework for black-box variational inference (BBVI) built on a score-based divergence. Unlike conventional methods, BaM admits closed-form proximal updates for Gaussian variational families with full covariance matrices, and the paper supports its faster, more accurate convergence both theoretically and empirically.
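
For contrast with what follows, the sketch below shows the kind of Monte Carlo ELBO estimate that standard BBVI optimizes. It is a minimal illustration for a diagonal-Gaussian family (the log_joint callable and all names here are placeholders, not the paper's code); the noisy gradients of exactly this kind of objective are what motivate the score-based alternative.

```python
import numpy as np

def elbo_estimate(log_joint, mu, log_sigma, num_samples=32, rng=None):
    """Monte Carlo ELBO estimate for a diagonal-Gaussian variational family.

    ELBO = E_q[log p(x, z) - log q(z)], estimated with reparameterized
    samples z = mu + sigma * eps, eps ~ N(0, I).  `log_joint` maps a
    (num_samples, d) array of latents to a length-num_samples array.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((num_samples, mu.shape[0]))
    z = mu + sigma * eps                      # reparameterized samples from q
    log_q = -0.5 * np.sum(((z - mu) / sigma) ** 2 + 2 * log_sigma + np.log(2 * np.pi), axis=1)
    return np.mean(log_joint(z) - log_q)
```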

Theoretical Foundations: BaM Algorithm

BaM departs from traditional VI objectives by leveraging a score-based divergence, which measures the disagreement between the gradients of the log densities (the scores) of the target and variational distributions. This divergence has three properties that make it well suited to variational inference: it is non-negative, it vanishes only when the two distributions coincide, and it is invariant under affine transformations of the input.
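
As a rough illustration of the quantity being minimized, the snippet below estimates a plain (unweighted) score-based divergence from a batch of samples. The paper's divergence additionally weights the score differences so that the affine-invariance property above holds, so treat this as a simplified stand-in with hypothetical helper callables.

```python
import numpy as np

def score_divergence_estimate(score_p, score_q, sample_q, batch_size=64, rng=None):
    """Monte Carlo estimate of an (unweighted) score-based divergence:

        D(q; p) ~= (1/B) * sum_b || grad log q(x_b) - grad log p(x_b) ||^2,

    with x_1, ..., x_B drawn from q.  The paper's divergence includes a
    covariance weighting (omitted here) that makes it affine invariant.
    """
    rng = np.random.default_rng() if rng is None else rng
    xs = sample_q(batch_size, rng)            # batch of samples from q
    diffs = score_q(xs) - score_p(xs)         # per-sample score mismatch
    return np.mean(np.sum(diffs ** 2, axis=1))
```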

Central to the BaM algorithm are two alternating steps: a "batch" step that draws a batch of samples from the current variational approximation and evaluates the target scores at those points, and a "match" step that updates the variational approximation so that its scores match the target scores at these samples, regularized toward the previous iterate. Iterating these steps drives the approximation toward a distribution that minimizes the score-based divergence from the target; a schematic sketch follows.
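
The sketch below makes the alternation concrete for a full-covariance Gaussian family. It is not the paper's closed-form proximal update: the "match" step here is a plain least-squares fit of a Gaussian score to the target scores on the batch, with no proximal regularization, and score_p is an assumed callable returning the target score at each sample.

```python
import numpy as np

def batch_and_match_sketch(score_p, mu, Sigma, num_iters=50, batch_size=100, rng=None):
    """Schematic batch/match alternation (NOT the paper's proximal update).

    The "match" step fits the Gaussian whose score -P (x - mu), with
    P = Sigma^{-1}, best matches the target scores on the batch, via
    ordinary least squares; the paper instead solves a regularized
    proximal problem in closed form.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = mu.shape[0]
    for _ in range(num_iters):
        # Batch step: draw samples from the current Gaussian approximation.
        xs = rng.multivariate_normal(mu, Sigma, size=batch_size)
        s = score_p(xs)                              # target scores at the batch
        # Match step (simplified): regress scores on x, s(x) ~ W x + c,
        # then read off Sigma^{-1} = -W and mu = Sigma c.
        X = np.hstack([xs, np.ones((batch_size, 1))])
        coef, *_ = np.linalg.lstsq(X, s, rcond=None)
        W, c = coef[:d].T, coef[d]
        precision = -0.5 * (W + W.T)                 # symmetrize for stability
        # For a well-behaved (log-concave) target this is positive definite;
        # a practical implementation would add regularization here.
        Sigma = np.linalg.inv(precision)
        mu = Sigma @ c
    return mu, Sigma
```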

Convergence Analysis for Gaussian Targets

When the target distribution is Gaussian, BaM's iterates converge exponentially quickly to the target mean and covariance in the limit of infinite batch size. This guarantee holds for any fixed level of regularization and any initialization. Such strong convergence results, even in the simplified setting of Gaussian targets, provide a solid foundation for the method's empirical performance.
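
As a small supporting calculation (a restatement of standard facts, not the paper's rate proof), Gaussian scores are affine functions of x,

\nabla_x \log p(x) = -\Sigma_p^{-1}(x - \mu_p), \qquad \nabla_x \log q(x) = -\Sigma_q^{-1}(x - \mu_q),

so the score-based divergence between two Gaussians vanishes precisely when \mu_q = \mu_p and \Sigma_q = \Sigma_p. The Gaussian family therefore contains an exact minimizer, and the paper's analysis quantifies how quickly the BaM iterates approach this fixed point.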

Empirical Evaluation

Empirically, BaM is compared against leading BBVI methods on Gaussian and non-Gaussian targets, including posteriors arising from hierarchical and deep generative models. Whereas ELBO-based approaches often struggle in high dimensions and are sensitive to the learning rate, BaM converges faster and reaches higher accuracy. These benefits are most pronounced at larger batch sizes, and BaM is comparatively robust to the choice of initialization and regularization.

Future Directions

Despite its advantages, the application of BaM to non-Gaussian variational families and the analysis of its convergence in the finite-batch scenario remain open for exploration. Furthermore, extending the scope of score-based divergence beyond VI to other domains like goodness-of-fit testing could yield interesting insights due to its affine-invariance property.

Conclusion

The "Batch and Match" approach presents a significant step forward in the field of variational inference, addressing many of the shortcomings of existing methods. By centering on a score-based divergence and enabling efficient, closed-form updates, BaM not only speeds up convergence but also broadens the applicability of VI to more complex distributions. Its theoretical underpinning and empirical success lay the groundwork for further advancements in score-based methods for probabilistic modeling, with the potential to enhance a wide range of applications in statistics and machine learning.