Guided Flows for Generative Modeling and Decision Making (2311.13443v2)
Abstract: Classifier-free guidance is a key component for enhancing the performance of conditional generative models across diverse tasks. While it has previously demonstrated remarkable improvements in sample quality, it has so far been employed exclusively in diffusion models. In this paper, we integrate classifier-free guidance into Flow Matching (FM) models, an alternative simulation-free approach that trains Continuous Normalizing Flows (CNFs) by regressing vector fields. We explore the use of \emph{Guided Flows} for a variety of downstream applications. We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis, achieving state-of-the-art performance. Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, demonstrating a 10x speedup in computation over diffusion models while maintaining comparable performance.
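To make the guidance mechanism concrete, below is a minimal sketch of classifier-free guided sampling from a flow model: the learned conditional and unconditional vector fields are combined with a guidance weight, and the resulting ODE is integrated from noise to data. The `v_theta` interface, the Euler solver, and the `(1 + w)` weighting follow the standard classifier-free guidance recipe from diffusion models and are assumptions for illustration, not necessarily the paper's exact implementation.

```python
import torch

def guided_vector_field(v_theta, x, t, y, w):
    # Combine conditional and unconditional predictions with guidance
    # weight w (standard classifier-free guidance form; assumed here).
    v_cond = v_theta(x, t, y)        # conditional vector field v_t(x | y)
    v_uncond = v_theta(x, t, None)   # unconditional vector field v_t(x)
    return (1 + w) * v_cond - w * v_uncond

@torch.no_grad()
def sample_guided_flow(v_theta, x0, y, w=1.5, n_steps=100):
    # Euler integration of dx/dt = guided vector field, from t=0 (noise)
    # to t=1 (data); the solver and step count are illustrative choices.
    x, dt = x0, 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0],), i * dt, device=x.device)
        x = x + dt * guided_vector_field(v_theta, x, t, y, w)
    return x
```

Because sampling only requires deterministic ODE integration of the guided vector field, relatively few function evaluations suffice, which is consistent with the 10x speedup over diffusion models that the abstract reports for plan generation.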
- Qinqing Zheng
- Matt Le
- Neta Shaul
- Yaron Lipman
- Aditya Grover
- Ricky T. Q. Chen