Bellman Optimal Stepsize Straightening of Flow-Matching Models (2312.16414v3)
Abstract: Flow matching is a powerful framework for generating high-quality samples in many applications, especially image synthesis. However, the intensive computational demands of these models, particularly during finetuning and sampling, pose significant challenges for low-resource scenarios. This paper introduces the Bellman Optimal Stepsize Straightening (BOSS) technique for distilling flow-matching generative models: it targets efficient few-step image sampling under a computational budget constraint. First, a dynamic programming algorithm optimizes the stepsizes of the pretrained network. Then, the velocity network is refined to match the optimal step sizes, straightening the generation paths. Extensive experimental evaluations across image generation tasks demonstrate the efficacy of BOSS in terms of both resource utilization and image quality. Our results reveal that BOSS achieves substantial gains in efficiency while maintaining competitive sample quality, effectively bridging the gap between low-resource constraints and the demanding requirements of flow-matching generative models. Our paper also contributes to the responsible development of artificial intelligence, offering a more sustainable generative model that reduces computational costs and environmental footprints. Our code is available at https://github.com/nguyenngocbaocmt02/BOSS.
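The first stage of BOSS is a Bellman-style dynamic program: given a budget of K coarse sampling steps and a fine grid of N+1 candidate time points, it selects the K stepsizes that minimize accumulated discretization error. Below is a minimal sketch of that recursion, assuming a `local_cost(i, j)` callback that scores one coarse Euler jump from grid index `i` to `j` (e.g. its distance from the fine-grained reference trajectory); the function and variable names here are illustrative, not taken from the released code.

```python
import numpy as np

def bellman_stepsizes(local_cost, n_grid, k_steps):
    """Choose k_steps coarse Euler steps over a fine grid of n_grid+1 points.

    local_cost(i, j) -> float: error of a single coarse step from grid
    index i to j (0 <= i < j <= n_grid). Hypothetical callback; the exact
    cost used by BOSS is defined in the paper.
    """
    INF = float("inf")
    # cost[k][j]: best total error reaching grid index j in k coarse steps.
    cost = np.full((k_steps + 1, n_grid + 1), INF)
    parent = np.full((k_steps + 1, n_grid + 1), -1, dtype=int)
    cost[0][0] = 0.0
    for k in range(1, k_steps + 1):
        for j in range(1, n_grid + 1):
            for i in range(j):
                if cost[k - 1][i] == INF:
                    continue
                c = cost[k - 1][i] + local_cost(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    parent[k][j] = i
    # Backtrack the optimal sequence of grid indices from n_grid to 0.
    path, j = [n_grid], n_grid
    for k in range(k_steps, 0, -1):
        j = parent[k][j]
        path.append(j)
    return path[::-1], cost[k_steps][n_grid]
```

The returned path holds K+1 grid indices whose successive differences give the optimal stepsizes; the recursion runs in O(K·N²). For the second stage, the abstract describes refining the velocity network so that single Euler jumps over these optimal stepsizes stay on the generation path. A plausible one-step distillation loss, again a sketch under assumptions (`teacher_step` is a hypothetical helper that integrates the pretrained velocity field from `t_cur` to `t_next` with many small steps; the paper's exact objective may differ):

```python
import torch

def straightening_loss(v_theta, teacher_step, x, t_cur, t_next):
    # Student's single Euler jump over the optimal stepsize ...
    pred = x + (t_next - t_cur) * v_theta(x, t_cur)
    # ... should land where the teacher's fine-grained solver lands.
    with torch.no_grad():
        target = teacher_step(x, t_cur, t_next)
    return torch.mean((pred - target) ** 2)
```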