Discrete Probabilistic Inference as Control in Multi-path Environments (2402.10309v2)
Abstract: We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward. While we could use maximum entropy Reinforcement Learning (MaxEnt RL) to solve this problem for some distributions, it has been shown that in general, the distribution over states induced by the optimal policy may be biased in cases where there are multiple ways to generate the same object. To address this issue, Generative Flow Networks (GFlowNets) learn a stochastic policy that samples objects proportionally to their reward by approximately enforcing a conservation of flows across the whole Markov Decision Process (MDP). In this paper, we extend recent methods that correct the reward in order to guarantee that the marginal distribution induced by the optimal MaxEnt RL policy is proportional to the original reward, regardless of the structure of the underlying MDP. We also prove that some flow-matching objectives found in the GFlowNet literature are in fact equivalent to well-established MaxEnt RL algorithms with a corrected reward. Finally, we empirically study the performance of several MaxEnt RL and GFlowNet algorithms across a range of problems involving sampling from discrete distributions.
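A minimal toy sketch (not from the paper) of the multi-path bias the abstract describes: when several action sequences build the same object, the optimal MaxEnt RL policy samples trajectories, not objects, proportionally to the reward, so objects reachable through more trajectories are oversampled. Dividing each trajectory's reward by the (here hypothetical, hand-counted) number of paths to its object restores a marginal proportional to the original reward:

```python
# Two terminal objects with equal reward: 'A' reachable by 1 trajectory,
# 'B' reachable by 2. All names and counts are illustrative assumptions.
trajectories = [("t1", "A"), ("t2", "B"), ("t3", "B")]
reward = {"A": 2.0, "B": 2.0}
n_paths = {"A": 1, "B": 2}  # number of distinct trajectories per object

def marginal(traj_weight):
    """Marginal over objects when trajectories are sampled ∝ traj_weight."""
    Z = sum(traj_weight(x) for _, x in trajectories)
    probs = {}
    for _, x in trajectories:
        probs[x] = probs.get(x, 0.0) + traj_weight(x) / Z
    return probs

# Uncorrected: optimal MaxEnt policy samples each trajectory ∝ R(x).
biased = marginal(lambda x: reward[x])
# biased == {'A': 1/3, 'B': 2/3}: 'B' is oversampled despite equal reward.

# Corrected: each trajectory weighted by R(x) / n_paths(x).
unbiased = marginal(lambda x: reward[x] / n_paths[x])
# unbiased == {'A': 1/2, 'B': 1/2}: marginal is now proportional to R.
```

In realistic MDPs the path counts are intractable to enumerate, which is why the paper's reward corrections (and the flow-conservation constraints of GFlowNets) matter; this sketch only shows the bias itself on an enumerable example.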