
NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport (1903.03704v1)

Published 9 Mar 2019 in stat.CO and stat.ML

Abstract: Hamiltonian Monte Carlo is a powerful algorithm for sampling from difficult-to-normalize posterior distributions. However, when the geometry of the posterior is unfavorable, it may take many expensive evaluations of the target distribution and its gradient to converge and mix. We propose neural transport (NeuTra) HMC, a technique for learning to correct this sort of unfavorable geometry using inverse autoregressive flows (IAF), a powerful neural variational inference technique. The IAF is trained to minimize the KL divergence from an isotropic Gaussian to the warped posterior, and then HMC sampling is performed in the warped space. We evaluate NeuTra HMC on a variety of synthetic and real problems, and find that it significantly outperforms vanilla HMC both in time to reach the stationary distribution and asymptotic effective-sample-size rates.

Citations (99)

Summary

  • The paper presents NeuTra HMC, a method that integrates inverse autoregressive flows to transform challenging posterior geometries for enhanced sampling efficiency.
  • It shows that NeuTra HMC can achieve up to an order of magnitude faster mixing rates and higher effective sample sizes than standard HMC.
  • The approach bridges variational inference and Hamiltonian dynamics by automating parameter tuning through pre-trained neural transport maps, simplifying applications in complex Bayesian models.

NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport

The paper "NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport" proposes enhancing Hamiltonian Monte Carlo (HMC) with neural transport maps, specifically inverse autoregressive flows (IAF), to correct unfavorable posterior geometries that hinder efficient mixing and convergence.

Methodological Overview

Hamiltonian Monte Carlo is favored for sampling from high-dimensional continuous distributions because it exploits gradient information. However, on geometrically complex target distributions, conventional HMC can mix slowly, requiring many expensive gradient evaluations. NeuTra HMC addresses this by using an IAF to learn a transformation of the latent space that renders the posterior approximately isotropic and Gaussian, a geometry on which HMC mixes well.
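The core mechanic can be sketched compactly: given a target density p(x) and an invertible map x = f(z), HMC is run on the pullback density log p(f(z)) + log|det J_f(z)|, and samples are mapped back through f. The sketch below uses a fixed affine map on a hypothetical ill-conditioned 2-D Gaussian purely for brevity; the paper instead learns f as an IAF, and the target, step sizes, and trajectory lengths here are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ill-conditioned Gaussian target: precision diag(1, 100),
# i.e. standard deviations 1 and 0.1 along the two axes.
P = np.diag([1.0, 100.0])

def target_logp(x):
    return -0.5 * x @ P @ x

# Transport map x = f(z) = L z with L = P^{-1/2}.  The paper learns f as an
# IAF; a fixed affine map keeps this sketch short.  For this choice the
# warped density is exactly a standard normal.
L = np.diag([1.0, 0.1])
log_det_L = np.log(np.abs(np.diag(L))).sum()  # constant Jacobian term

def warped_logp(z):
    # Pullback density: log p(f(z)) + log |det J_f(z)|
    return target_logp(L @ z) + log_det_L

def warped_grad(z):
    # Chain rule through x = L z; the Jacobian term is constant here.
    return L.T @ (-(P @ (L @ z)))

def hmc_step(z, step=0.5, n_leap=10):
    """One HMC iteration (leapfrog + Metropolis accept) in warped space."""
    p = rng.standard_normal(z.shape)
    z_new = z.copy()
    p_new = p + 0.5 * step * warped_grad(z_new)       # half kick
    for _ in range(n_leap):
        z_new = z_new + step * p_new                  # drift
        p_new = p_new + step * warped_grad(z_new)     # kick
    p_new = p_new - 0.5 * step * warped_grad(z_new)   # undo extra half kick
    log_accept = (warped_logp(z_new) - 0.5 * p_new @ p_new
                  - warped_logp(z) + 0.5 * p @ p)
    if np.log(rng.uniform()) < log_accept:
        return z_new, True
    return z, False

z = np.zeros(2)
samples, accepts = [], 0
for _ in range(2000):
    z, ok = hmc_step(z)
    accepts += ok
    samples.append(L @ z)   # map back to the original space
samples = np.asarray(samples)
```

Because the warped density is isotropic, a single step size works well in every direction; running the same sampler directly on the original target would need a step size small enough for the stiffest direction.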

Inverse autoregressive flows, a powerful class of normalizing flows, transform a simple base distribution into a complex target distribution. The flow is trained to minimize the KL divergence from an isotropic Gaussian to the warped posterior, after which HMC sampling proceeds in the transformed space, where the geometry is closer to isotropic.
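The training objective is the standard reparameterized variational one: draw z from the base Gaussian, push it through the flow, and minimize E[-log p(f(z))] minus the flow's log-Jacobian (the entropy term). The toy below replaces the IAF with a one-parameter affine map f(z) = a·z + b and a hypothetical 1-D Gaussian target N(3, 2²), so the optimum is known (a → 2, b → 3); none of these numbers come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D target N(3, 2^2): -log p(x) = (x - 3)^2 / 8 + const.
# An affine "flow" f(z) = a*z + b stands in for the IAF to keep this tiny.
a, b = 1.0, 0.0
lr = 0.05
for _ in range(500):
    z = rng.standard_normal(256)          # base samples z ~ N(0, 1)
    x = a * z + b                         # push through the flow
    # Reparameterized gradient of the KL objective (up to a constant):
    #   loss = E[-log p(f(z))] - log|a|   (second term: negative entropy of q)
    resid = (x - 3.0) / 4.0               # d(-log p)/dx at the samples
    grad_a = (resid * z).mean() - 1.0 / a
    grad_b = resid.mean()
    a -= lr * grad_a
    b -= lr * grad_b
```

After training, the flow's pushforward q matches the target (a ≈ 2, b ≈ 3), which is exactly the regime in which HMC on the warped density is easy: the pullback of the target through the fitted flow is close to a standard normal.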

Empirical Evaluation

The effectiveness of NeuTra HMC is assessed on both synthetic distributions and real-world models such as sparse logistic regression. Empirical results show that NeuTra HMC significantly outperforms standard HMC in both time to reach the stationary distribution and asymptotic effective-sample-size rates, often achieving an order-of-magnitude improvement in mixing speed, which is especially valuable in high-dimensional sampling problems.

The experiments also underline NeuTra HMC's practical applicability in training latent variable models, such as Variational Autoencoders (VAEs), showcasing improved performance over established baselines. Additionally, while HMC traditionally requires careful tuning of its step size and momentum distributions to handle complex posterior geometries, NeuTra HMC automates this process by leveraging IAF transformations pre-trained on the posterior.

Theoretical Implications and Future Directions

In theoretical terms, NeuTra HMC effectively bridges variational inference and Hamiltonian dynamics by using learned transformations that mimic RMHMC's local geometry adaptations, but without the need for non-trivial metric computations or adjustments per problem instance. This suggests that integrating learned transport maps like IAF within HMC workflows can offer an algorithmic foothold for applying HMC to broader classes of problems with complex posteriors.
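The correspondence with RMHMC can be made slightly more concrete. Writing J_f(z) for the Jacobian of the learned map, NeuTra HMC runs standard HMC with an identity mass matrix on the pullback density, which behaves roughly like RMHMC on the original target with a position-dependent metric induced by the flow. A brief sketch (a heuristic reading, not a formal derivation from the paper):

```latex
% Pullback target that NeuTra HMC samples with plain HMC:
\pi_Z(z) \;=\; p\bigl(f(z)\bigr)\,\bigl|\det J_f(z)\bigr|
% HMC on \pi_Z with identity mass matrix acts, near z, like HMC on p(x)
% with an effective position-dependent metric
G(x) \;\approx\; \bigl(J_f(z)\,J_f(z)^{\top}\bigr)^{-1}, \qquad x = f(z),
% i.e. the flow supplies the local preconditioning that RMHMC would
% otherwise obtain from explicit metric computations.
```

For an affine flow this correspondence is exact and reduces to classical preconditioning; for a nonlinear IAF the metric varies with position, which is what lets the method handle geometries such as funnels that no single linear preconditioner can fix.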

This work opens up various avenues for further research: enhancing IAF architectures to improve the expressivity of learned transports, investigating the potential integration of NeuTra HMC with other advanced MCMC kernels, and exploring adaptive schemes for training maps in an online fashion, which would make NeuTra HMC robust to dynamic posterior geometrical changes during iterative Bayesian updating.

Conclusion

NeuTra HMC offers a significant improvement for sampling in the Bayesian context, particularly where complex geometries pose substantial challenges. By jointly leveraging the robustness of Hamiltonian dynamics and the transformative power of deep generative flows, the methodology paves the way for more effective inference in the high-dimensional, complex probability landscapes frequently encountered in both theoretical models and applied machine learning contexts. Future research focusing on the scalability and adaptability of the approach may further solidify its position as a mainstay in sampling methodologies.
