- The paper leverages autoregressive neural networks to compute variational free energy and estimate macroscopic quantities using a policy gradient approach.
- The method outperforms traditional mean-field and Bethe approximations on the Ising, Hopfield, and Sherrington–Kirkpatrick (SK) models, with notably higher accuracy near critical points.
- The approach enables efficient uncorrelated sampling for complex systems, paving the way for scalable applications in physics, Bayesian inference, and optimization.
Solving Statistical Mechanics Using Variational Autoregressive Networks
The paper "Solving Statistical Mechanics Using Variational Autoregressive Networks" authored by Dian Wu, Lei Wang, and Pan Zhang presents a novel variational framework leveraging autoregressive neural networks (VANs) to solve statistical mechanics problems. By capitalizing on the representational power of autoregressive networks, the framework provides an efficient method for computing variational free energy, estimating macroscopic physical quantities, and generating uncorrelated samples, relevant for systems with finite size.
Core Concepts and Methodology
The proposed approach extends traditional variational mean-field methods by employing autoregressive neural networks capable of direct sampling and exact calculation of normalized probabilities. The method models the joint distribution as a product of conditional probabilities, $q_\theta(\mathbf{s}) = \prod_{i=1}^{N} q_\theta(s_i \mid s_1, \ldots, s_{i-1})$, with the conditionals implemented by neural networks. Such networks can represent complex statistical mechanics systems while preserving computational tractability.
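To make the factorization concrete, here is a minimal sketch over $N$ binary spins, assuming a single masked linear layer; this is not the authors' code (the paper uses deeper architectures such as MADE- and PixelCNN-style networks), but it exhibits the two key properties: direct sampling and exact normalized log-probabilities.

```python
# Minimal autoregressive model over n binary spins s_i in {-1, +1}.
# A strictly lower-triangular mask enforces that the conditional for s_i
# depends only on s_1, ..., s_{i-1}.
import torch

class AutoregressiveSpins(torch.nn.Module):
    def __init__(self, n):
        super().__init__()
        self.n = n
        self.w = torch.nn.Parameter(1e-3 * torch.randn(n, n))
        self.b = torch.nn.Parameter(torch.zeros(n))
        self.register_buffer("mask", torch.tril(torch.ones(n, n), diagonal=-1))

    def conditionals(self, s):
        # p[:, i] = q_theta(s_i = +1 | s_1, ..., s_{i-1}), computed in parallel.
        return torch.sigmoid(s @ (self.w * self.mask).t() + self.b)

    def log_prob(self, s):
        # Exact normalized log q_theta(s) = sum_i log q_theta(s_i | s_<i).
        p = self.conditionals(s)
        return torch.where(s > 0, p, 1 - p).log().sum(dim=-1)

    def sample(self, batch):
        # Direct, uncorrelated sampling: draw one spin at a time.
        s = torch.zeros(batch, self.n)
        for i in range(self.n):
            p_i = self.conditionals(s)[:, i]
            s[:, i] = torch.where(torch.rand(batch) < p_i, 1.0, -1.0)
        return s
```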
Training involves minimizing the Kullback–Leibler (KL) divergence between the autoregressive model and the target Boltzmann distribution, which is equivalent to minimizing the variational free energy $F_q = \frac{1}{\beta} \sum_{\mathbf{s}} q_\theta(\mathbf{s}) \left[ \beta E(\mathbf{s}) + \ln q_\theta(\mathbf{s}) \right]$, an upper bound on the true free energy. Because the model provides exact normalized probabilities for its own samples, the gradient of $F_q$ with respect to the variational parameters can be estimated without bias via the policy gradient method familiar from reinforcement learning.
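The sketch below shows one such training step under the illustrative model above: samples are drawn, the per-sample free energy plays the role of the "reward," and a baseline reduces the variance of the REINFORCE gradient. The 2D Ising energy function and all hyperparameters here are assumptions for illustration, not the paper's exact setup.

```python
# One policy-gradient (REINFORCE) step minimizing the variational free energy,
# reusing the AutoregressiveSpins model sketched above.
import torch

def ising_energy(s, L):
    # E(s) = -sum over nearest-neighbor pairs of s_i * s_j on an L x L
    # square lattice with periodic boundary conditions.
    grid = s.view(-1, L, L)
    return -(grid * grid.roll(1, dims=1) + grid * grid.roll(1, dims=2)).sum(dim=(1, 2))

def train_step(model, optimizer, beta, L, batch=1024):
    with torch.no_grad():
        s = model.sample(batch)            # independent samples from q_theta
    log_q = model.log_prob(s)
    with torch.no_grad():
        # Per-sample "reward": beta * F = log q(s) + beta * E(s).
        f = log_q + beta * ising_energy(s, L)
        baseline = f.mean()                # control variate for variance reduction
    # Unbiased estimator: grad F_q = E[(f - baseline) * grad log q(s)] / beta.
    loss = ((f - baseline) * log_q).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return baseline.item() / (beta * L * L)   # estimate of free energy per site
```

A full run would loop `train_step` over many iterations, typically annealing $\beta$ upward from small values (high temperature toward the target temperature), which helps the model avoid collapsing onto a single mode.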
Numerical Experiments
The authors apply their method to classic models in statistical physics, including the 2D Ising model, the Hopfield model, and the Sherrington–Kirkpatrick (SK) spin glass. Across these systems, VAN accurately estimates free energies, entropies, magnetizations, and correlations.
For instance, VAN significantly outperformed traditional methods such as naïve mean-field (NMF) and Bethe approximations, particularly near critical temperatures. In the Hopfield and SK models, VAN captured multiple modes of the energy landscape and provided accurate estimates without collapsing onto a single mode.
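Once trained, such macroscopic quantities follow directly from independent samples. A minimal sketch, reusing the illustrative `AutoregressiveSpins` model and `ising_energy` function from above:

```python
# Estimate free energy, entropy, and magnetization per site from i.i.d. samples.
import torch

@torch.no_grad()
def estimate_observables(model, beta, L, batch=10_000):
    s = model.sample(batch)                                  # uncorrelated configurations
    log_q = model.log_prob(s)
    f = (log_q / beta + ising_energy(s, L)).mean() / L**2    # free energy per site
    entropy = -log_q.mean() / L**2                           # entropy per site
    m = s.mean(dim=1).abs().mean()                           # mean |magnetization|
    return f.item(), entropy.item(), m.item()
```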
Implications and Future Directions
This approach opens the door for applying deep generative models to a broader spectrum of problems within statistical physics and beyond, such as Bayesian inference, combinatorial optimization, and constraint satisfaction. Because samples are drawn independently rather than from correlated Markov chains, the approach is well suited to parallel implementation in large-scale computations.
Future research could further optimize the network architecture, integrating ideas from physics and machine learning (e.g., the renormalization group, graph convolutional networks) to enhance scalability. The computational cost of sequential sampling could be reduced by adopting faster schemes such as inverse autoregressive flows.
Conclusion
Variational autoregressive networks introduce a powerful, general framework for tackling statistical mechanics problems, offering advantages over existing methods in terms of accuracy, sampling efficiency, and computational feasibility. This innovative application of autoregressive networks highlights a promising direction for integrating AI within theoretical physics, fostering enhanced analytical capability and opening new research vistas.