Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization (2310.02679v3)
Abstract: We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that arises frequently in machine learning and statistics. We extend recent sampling-based approaches that leverage controlled stochastic processes to model approximate samples from these target densities. The main drawback of these approaches is that the training objective requires full trajectories to compute, which leads to sluggish credit assignment because the learning signal is present only at the terminal time. In this work, we present Diffusion Generative Flow Samplers (DGFS), a sampling-based framework in which the learning process can be tractably broken down into short partial trajectory segments by parameterizing an additional "flow function". Our method takes inspiration from the theory developed for generative flow networks (GFlowNets), allowing us to make use of intermediate learning signals. Through a range of challenging experiments, we demonstrate that DGFS achieves more accurate estimates of the normalization constant than closely related prior methods.
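
To make the partial-trajectory idea concrete, below is a minimal PyTorch sketch of a subtrajectory-balance-style objective for a discretized diffusion sampler, in the spirit of the GFlowNet theory the paper builds on. Everything here is an illustrative assumption rather than the paper's exact parameterization: the network names (`MLP`, `drift`, `flow`), the Gaussian Euler-Maruyama forward transitions, and the fixed Brownian-bridge backward kernel pinned at the origin. The point it illustrates is that pinning the terminal flow to the unnormalized target density lets any segment x_m, ..., x_n of a trajectory carry a learning signal, rather than only the terminal state.

```python
import math
import torch
import torch.nn as nn
from torch.distributions import Normal


class MLP(nn.Module):
    """Small time-conditioned MLP, used here for both the drift and the flow."""

    def __init__(self, dim, out_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.GELU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x, t):
        t_col = torch.full((x.shape[0], 1), float(t))  # broadcast time as an input
        return self.net(torch.cat([x, t_col], dim=-1))


def subtrajectory_loss(xs, drift, flow, log_reward, dt, sigma, m, n):
    """Squared log-ratio over the partial trajectory x_m .. x_n.

    Enforces a subtrajectory-balance-style condition,
        log F_m(x_m) + sum_t log p_F(x_{t+1} | x_t)
          = log F_n(x_n) + sum_t log p_B(x_t | x_{t+1}),
    with the terminal flow pinned to the unnormalized target,
    log F_T(x) = log mu(x), so that every segment receives a signal.
    """
    T = len(xs) - 1
    lhs = flow(xs[m], m * dt).squeeze(-1)
    rhs = log_reward(xs[n]) if n == T else flow(xs[n], n * dt).squeeze(-1)
    for t in range(m, n):
        x, x_next = xs[t], xs[t + 1]
        # Forward (controlled) Euler-Maruyama transition density.
        mean_f = x + drift(x, t * dt) * dt
        lhs = lhs + Normal(mean_f, sigma * math.sqrt(dt)).log_prob(x_next).sum(-1)
        # Fixed backward kernel: Brownian bridge pinned at x_0 = 0
        # (a modeling assumption of this sketch, not the paper's exact choice).
        mean_b = x_next * t / (t + 1)
        std_b = sigma * math.sqrt(dt * t / (t + 1)) + 1e-6
        rhs = rhs + Normal(mean_b, std_b).log_prob(x).sum(-1)
    return ((lhs - rhs) ** 2).mean()
```

A trajectory can then be rolled out with the current drift and trained on a randomly chosen segment, which is where the intermediate learning signal comes from:

```python
# Toy usage with hypothetical settings: 16 chains, 10 steps, 2-D target.
dim, T, dt, sigma = 2, 10, 0.01, 1.0
drift, flow = MLP(dim, dim), MLP(dim, 1)
log_reward = lambda x: -0.5 * (x ** 2).sum(-1)  # toy target: standard normal

xs = [torch.zeros(16, dim)]  # all chains start at the pinned point x_0 = 0
for t in range(T):
    mean = xs[-1] + drift(xs[-1], t * dt) * dt
    # Detach samples: the loss treats trajectories as data (off-policy style).
    xs.append((mean + sigma * math.sqrt(dt) * torch.randn_like(mean)).detach())

m = torch.randint(0, T, (1,)).item()
n = torch.randint(m + 1, T + 1, (1,)).item()
loss = subtrajectory_loss(xs, drift, flow, log_reward, dt, sigma, m, n)
loss.backward()
```

Under this balance condition, the flow evaluated at the fixed initial state must account for all of the target's mass, so `flow(torch.zeros(1, dim), 0.0)` approaches log Z at convergence; that is how a sampler of this kind can read off an estimate of the normalization constant, though the exact DGFS estimator may differ in detail.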