Iterated Denoising Energy Matching for Sampling from Boltzmann Densities (2402.06121v2)
Abstract: Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions, as the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode-mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant $n$-body particle systems. We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2$-$5\times$ faster, which allows it to be the first method to train using energy on the challenging $55$-particle Lennard-Jones system.
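As a concrete reading of this alternating loop, the sketch below implements both phases in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the toy double-well `energy`, the variance-exploding diffusion with a geometric noise schedule, the MLP `score_net`, and all hyperparameters ($K$, batch sizes, step counts) are placeholders, and the Monte Carlo target $\nabla_x \log \frac{1}{K}\sum_k \exp(-\mathcal{E}(x + \sigma(t)\,\epsilon_k))$ is one natural instantiation of "a stochastic score matching objective leveraging solely the energy function and its gradient."

```python
# Minimal iDEM-style sketch (illustrative assumptions throughout): a
# variance-exploding diffusion with geometric noise schedule, a toy 2D
# double-well energy standing in for a real many-body potential, and a
# small MLP score network. Hyperparameters are not the paper's.
import torch
import torch.nn as nn

DIM, K, SIG_MIN, SIG_MAX = 2, 128, 1e-2, 3.0

def sigma(t):                     # geometric noise schedule sigma(t)
    return SIG_MIN ** (1 - t) * SIG_MAX ** t

def g2(t):                        # g(t)^2 = d sigma(t)^2 / dt for the VE SDE
    return 2 * sigma(t) ** 2 * torch.log(torch.tensor(SIG_MAX / SIG_MIN))

def energy(x):                    # stand-in energy E(x): a 2D double well
    return ((x ** 2).sum(-1) - 1.0) ** 2

def mc_score(xt, t):
    """K-sample Monte Carlo estimate of the noised score, using only the
    energy and its gradient: grad_x log (1/K) sum_k exp(-E(x + sigma(t) e_k))."""
    x0 = (xt.unsqueeze(0) + sigma(t) * torch.randn(K, *xt.shape)).requires_grad_(True)
    E = energy(x0)                                    # (K, B)
    (gE,) = torch.autograd.grad(E.sum(), x0)          # grad E at each perturbed point
    w = torch.softmax(-E, dim=0).unsqueeze(-1)        # self-normalized importance weights
    return -(w * gE).sum(0)                           # score target S_K(x_t, t)

score_net = nn.Sequential(nn.Linear(DIM + 1, 128), nn.SiLU(), nn.Linear(128, DIM))
opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)
buffer = []

for outer in range(200):
    # (I) Simulate the reverse SDE under the current model to collect
    #     samples from regions of high model density.
    with torch.no_grad():
        x, n = SIG_MAX * torch.randn(64, DIM), 100
        for i in range(n):
            t = torch.full((64, 1), 1.0 - i / n)
            drift = g2(t) * score_net(torch.cat([x, t], -1)) / n
            x = x + drift + (g2(t) / n).sqrt() * torch.randn_like(x)
    buffer = (buffer + [x])[-50:]                     # bounded replay buffer

    # (II) Simulation-free inner objective: regress the network score onto
    #      the Monte Carlo target at randomly drawn noise levels.
    for inner in range(10):
        data = torch.cat(buffer)
        x0 = data[torch.randint(0, data.shape[0], (64,))]
        t = torch.rand(64, 1)
        xt = x0 + sigma(t) * torch.randn_like(x0)
        target = mc_score(xt, t).detach()
        loss = (score_net(torch.cat([xt, t], -1)) - target).pow(2).sum(-1).mean()
        opt.zero_grad(); loss.backward(); opt.step()
```

Note that the inner phase touches only the buffered samples, the energy, and its gradient: the score target is a self-normalized importance estimate over the $K$ perturbed points, so no forward or reverse SDE is simulated during training, which is what makes the matching objective simulation-free.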