Better Training of GFlowNets with Local Credit and Incomplete Trajectories (arXiv:2302.01687v2)
Abstract: Generative Flow Networks (GFlowNets) are related to Markov chain Monte Carlo (MCMC) methods (as they sample from a distribution specified by an energy function), reinforcement learning (as they learn a policy to sample composed objects through a sequence of steps), generative models (as they learn to represent and sample from a distribution), and amortized variational methods (as they can be used to learn to approximate and sample from an otherwise intractable posterior, given a prior and a likelihood). They are trained to generate an object $x$ through a sequence of steps with probability proportional to a reward function $R(x)$ (or $\exp(-\mathcal{E}(x))$, with $\mathcal{E}(x)$ denoting the energy function), given at the end of the generative trajectory. As in other RL settings where the reward is only given at the end, training efficiency and credit assignment can suffer when trajectories are long. In previous GFlowNet work, no learning was possible from incomplete trajectories (those lacking a terminal state and the computation of the associated reward). In this paper, we consider the case where the energy function can be applied not just to terminal states but also to intermediate states. This is achieved, for example, when the energy function is additive, with terms available along the trajectory. We show how to reparameterize the GFlowNet state flow function to take advantage of the partial reward already accrued at each state. This enables a training objective that can update parameters even from incomplete trajectories. Even when complete trajectories are available, obtaining more localized credit and gradients is found to speed up training convergence, as demonstrated across many simulations.
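Concretely, the reparameterization described in the abstract can be written as $\log F(s) = \log \widetilde{F}(s) - \mathcal{E}(s)$, so the learned quantity $\widetilde{F}$ only has to account for the reward not yet accrued at state $s$. Below is a minimal PyTorch sketch of the resulting per-transition, detailed-balance-style loss, assuming the standard detailed balance condition $F(s)\,P_F(s' \mid s) = F(s')\,P_B(s \mid s')$; the names `flow_net`, `energy_s`, `log_pf`, and `log_pb` are illustrative, not the paper's API:

```python
import torch
import torch.nn as nn

def fl_detailed_balance_loss(
    flow_net: nn.Module,          # predicts log F~(s); illustrative name
    s: torch.Tensor,              # batch of states
    s_next: torch.Tensor,         # batch of successor states
    energy_s: torch.Tensor,       # E(s), available at intermediate states
    energy_s_next: torch.Tensor,  # E(s')
    log_pf: torch.Tensor,         # log P_F(s'|s) under the forward policy
    log_pb: torch.Tensor,         # log P_B(s|s') under the backward policy
) -> torch.Tensor:
    """Squared residual of the reparameterized detailed balance condition:
    log F~(s) - E(s) + log P_F(s'|s) = log F~(s') - E(s') + log P_B(s|s').
    """
    log_flow_s = flow_net(s).squeeze(-1) - energy_s
    log_flow_s_next = flow_net(s_next).squeeze(-1) - energy_s_next
    residual = (log_flow_s + log_pf) - (log_flow_s_next + log_pb)
    return residual.pow(2).mean()
```

Because the residual is defined on individual transitions rather than on full trajectories, gradients are available from any sampled transition, including ones drawn from trajectories that have not yet reached a terminal state. A natural boundary condition (an assumption here, consistent with requiring $F(x) = R(x)$ at a terminal state $x$) is $\log \widetilde{F}(x) = 0$, since then $\log F(x) = -\mathcal{E}(x) = \log R(x)$.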