Consistency Flow Matching: Defining Straight Flows with Velocity Consistency (2407.02398v1)
Abstract: Flow matching (FM) is a general framework for defining probability paths via Ordinary Differential Equations (ODEs) to transform between noise and data samples. Recent approaches attempt to straighten these flow trajectories to generate high-quality samples with fewer function evaluations, typically through iterative rectification methods or optimal transport solutions. In this paper, we introduce Consistency Flow Matching (Consistency-FM), a novel FM method that explicitly enforces self-consistency in the velocity field. Consistency-FM directly defines straight flows starting from different times to the same endpoint, imposing constraints on their velocity values. Additionally, we propose a multi-segment training approach for Consistency-FM to enhance expressiveness, achieving a better trade-off between sampling quality and speed. Preliminary experiments demonstrate that our Consistency-FM significantly improves training efficiency by converging 4.4x faster than consistency models and 1.7x faster than rectified flow models while achieving better generation quality. Our code is available at: https://github.com/YangLing0818/consistency_flow_matching
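To make the velocity-consistency idea concrete, below is a minimal PyTorch sketch of a Consistency-FM-style training loss, assuming the straight-line interpolation x_t = (1 - t)·noise + t·data between a noise sample and a data sample. The names `velocity_net` and `ema_net` (an EMA copy of the network used as the target), the weighting `alpha`, and the time gap `delta` are illustrative assumptions, not the paper's exact configuration; the multi-segment variant, which trains a separate velocity field per time segment, is omitted for brevity.

```python
# Hedged sketch of a Consistency-FM-style training step (see lead-in for assumptions).
import torch
import torch.nn.functional as F

def consistency_fm_loss(velocity_net, ema_net, data, alpha=1.0, delta=1e-2):
    """Velocity self-consistency loss at two nearby times t and t + delta.

    A straight flow started at x_t should reach the endpoint
    f(t, x_t) = x_t + (1 - t) * v(t, x_t) at time 1; the loss matches
    both this predicted endpoint and the velocity itself across times,
    using an EMA copy of the network as the (stop-gradient) target.
    """
    b = data.shape[0]
    noise = torch.randn_like(data)
    # Sample t in [0, 1 - delta) so that t + delta stays inside [0, 1].
    t = torch.rand(b, 1, 1, 1, device=data.device) * (1.0 - delta)
    t2 = t + delta

    # Points on the straight interpolation path between noise and data.
    x_t = (1.0 - t) * noise + t * data
    x_t2 = (1.0 - t2) * noise + t2 * data

    v = velocity_net(t.flatten(), x_t)
    with torch.no_grad():  # target network receives no gradient
        v_tgt = ema_net(t2.flatten(), x_t2)

    # Straight flows from either time should land on the same endpoint at t = 1.
    f = x_t + (1.0 - t) * v
    f_tgt = x_t2 + (1.0 - t2) * v_tgt

    return F.mse_loss(f, f_tgt) + alpha * F.mse_loss(v, v_tgt)
```

In practice one would update `ema_net` as an exponential moving average of `velocity_net` after each optimizer step, mirroring the target-network recipe used by consistency models.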
- J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” in NeurIPS, vol. 33, pp. 6840–6851, 2020.
- L. Yang, Z. Zhang, Y. Song, S. Hong, R. Xu, Y. Zhao, W. Zhang, B. Cui, and M.-H. Yang, “Diffusion models: A comprehensive survey of methods and applications,” ACM Computing Surveys, vol. 56, no. 4, pp. 1–39, 2023.
- L. Yang, Z. Yu, C. Meng, M. Xu, S. Ermon, and B. Cui, “Mastering text-to-image diffusion: Recaptioning, planning, and generating with multimodal llms,” in International Conference on Machine Learning, 2024.
- L. Yang, Z. Zhang, Z. Yu, J. Liu, M. Xu, S. Ermon, and B. Cui, “Cross-modal contextualized diffusion models for text-guided visual generation and editing,” in International Conference on Learning Representations, 2024.
- R. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, “Neural ordinary differential equations,” Advances in neural information processing systems, vol. 31, 2018.
- Y. Song, C. Durkan, I. Murray, and S. Ermon, “Maximum likelihood training of score-based diffusion models,” Advances in neural information processing systems, vol. 34, pp. 1415–1428, 2021.
- Y. Lipman, R. T. Chen, H. Ben-Hamu, M. Nickel, and M. Le, “Flow matching for generative modeling,” in The Eleventh International Conference on Learning Representations, 2023.
- M. S. Albergo and E. Vanden-Eijnden, “Building normalizing flows with stochastic interpolants,” in The Eleventh International Conference on Learning Representations, 2023.
- X. Liu, C. Gong, and Q. Liu, “Flow straight and fast: Learning to generate and transfer data with rectified flow,” in The Eleventh International Conference on Learning Representations, 2023.
- Y. Song and S. Ermon, “Generative modeling by estimating gradients of the data distribution,” Advances in neural information processing systems, vol. 32, 2019.
- X. Liu, X. Zhang, J. Ma, J. Peng, and Q. Liu, “InstaFlow: One step is enough for high-quality diffusion-based text-to-image generation,” arXiv preprint arXiv:2309.06380, 2023.
- N. Kornilov, A. Gasnikov, and A. Korotin, “Optimal flow matching: Learning straight trajectories in just one step,” arXiv preprint arXiv:2403.13117, 2024.
- A. Tong, N. Malkin, G. Huguet, Y. Zhang, J. Rector-Brooks, K. Fatras, G. Wolf, and Y. Bengio, “Improving and generalizing flow-based generative models with minibatch optimal transport,” in ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023.
- A.-A. Pooladian, H. Ben-Hamu, C. Domingo-Enrich, B. Amos, Y. Lipman, and R. T. Chen, “Multisample flow matching: Straightening flows with minibatch couplings,” in International Conference on Machine Learning, PMLR, 2023.
- Y. Song, P. Dhariwal, M. Chen, and I. Sutskever, “Consistency models,” in International Conference on Machine Learning, pp. 32211–32252, PMLR, 2023.
- B. Nguyen, B. Nguyen, and V. A. Nguyen, “Bellman optimal stepsize straightening of flow-matching models,” in The Twelfth International Conference on Learning Representations, 2024.
- D. Kim, C.-H. Lai, W.-H. Liao, N. Murata, Y. Takida, T. Uesaka, Y. He, Y. Mitsufuji, and S. Ermon, “Consistency trajectory models: Learning probability flow ODE trajectory of diffusion,” in The Twelfth International Conference on Learning Representations, 2024.
- L. Klein, A. Krämer, and F. Noé, “Equivariant flow matching,” in Thirty-seventh Conference on Neural Information Processing Systems, 2023.
- H. Stärk, B. Jing, C. Wang, G. Corso, B. Berger, R. Barzilay, and T. Jaakkola, “Dirichlet flow matching with applications to DNA sequence design,” arXiv preprint arXiv:2402.05841, 2024.
- A. Campbell, J. Yim, R. Barzilay, T. Rainforth, and T. Jaakkola, “Generative flows on discrete state-spaces: Enabling multimodal flows with applications to protein co-design,” arXiv preprint arXiv:2402.04997, 2024.
- A. Makkuva, A. Taghvaei, S. Oh, and J. Lee, “Optimal transport mapping via input convex neural networks,” in International Conference on Machine Learning, pp. 6672–6681, PMLR, 2020.
- M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein generative adversarial networks,” in International conference on machine learning, pp. 214–223, PMLR, 2017.
- I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in neural information processing systems, vol. 27, 2014.
- D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” arXiv preprint arXiv:1312.6114, 2013.
- Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” in International Conference on Learning Representations, 2021.
- J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” in International Conference on Learning Representations, 2021.
- D. Rezende and S. Mohamed, “Variational inference with normalizing flows,” in International conference on machine learning, pp. 1530–1538, PMLR, 2015.
- L. Dinh, J. Sohl-Dickstein, and S. Bengio, “Density estimation using Real NVP,” in International Conference on Learning Representations, 2017.
- F. Bao, C. Li, J. Zhu, and B. Zhang, “Analytic-DPM: An analytic estimate of the optimal reverse variance in diffusion probabilistic models,” in International Conference on Learning Representations, 2022.
- T. Dockhorn, A. Vahdat, and K. Kreis, “Score-based generative modeling with critically-damped Langevin diffusion,” in International Conference on Learning Representations, 2022.
- Z. Xiao, K. Kreis, and A. Vahdat, “Tackling the generative learning trilemma with denoising diffusion GANs,” in International Conference on Learning Representations, 2022.
- C. Lu, Y. Zhou, F. Bao, J. Chen, C. Li, and J. Zhu, “DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps,” in Advances in Neural Information Processing Systems, vol. 35, 2022.
- T. Dockhorn, A. Vahdat, and K. Kreis, “GENIE: Higher-Order Denoising Diffusion Solvers,” Advances in Neural Information Processing Systems, 2022.
- H. Zheng, W. Nie, A. Vahdat, K. Azizzadenesheli, and A. Anandkumar, “Fast sampling of diffusion models via operator learning,” in International Conference on Machine Learning, pp. 42390–42402, PMLR, 2023.
- T. Salimans and J. Ho, “Progressive distillation for fast sampling of diffusion models,” in International Conference on Learning Representations, 2022.
- W. Luo, T. Hu, S. Zhang, J. Sun, Z. Li, and Z. Zhang, “Diff-instruct: A universal approach for transferring knowledge from pre-trained diffusion models,” Advances in Neural Information Processing Systems, vol. 36, 2024.
- W. Luo, “A comprehensive survey on knowledge distillation of diffusion models,” arXiv preprint arXiv:2304.04262, 2023.
- Y. Song and P. Dhariwal, “Improved techniques for training consistency models,” arXiv preprint arXiv:2310.14189, 2023.
- S. Lee, B. Kim, and J. C. Ye, “Minimizing trajectory curvature of ODE-based generative models,” in International Conference on Machine Learning, pp. 18957–18973, PMLR, 2023.
- P. Esser, S. Kulal, A. Blattmann, R. Entezari, J. Müller, H. Saini, Y. Levi, D. Lorenz, A. Sauer, F. Boesel, et al., “Scaling rectified flow transformers for high-resolution image synthesis,” arXiv preprint arXiv:2403.03206, 2024.
- J. C. Butcher, Numerical methods for ordinary differential equations. John Wiley & Sons, 2016.
- A. Krizhevsky, “Learning multiple layers of features from tiny images,” https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf, 2009.
- T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive growing of GANs for improved quality, stability, and variation,” arXiv preprint arXiv:1710.10196, 2017.
- Y. Choi, Y. Uh, J. Yoo, and J.-W. Ha, “StarGAN v2: Diverse image synthesis for multiple domains,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 8188–8197, 2020.
- M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “GANs trained by a two time-scale update rule converge to a local Nash equilibrium,” Advances in neural information processing systems, vol. 30, 2017.
- A. Vahdat, K. Kreis, and J. Kautz, “Score-based generative modeling in latent space,” Advances in neural information processing systems, vol. 34, pp. 11287–11302, 2021.
- Y. Xu, Z. Liu, M. Tegmark, and T. Jaakkola, “Poisson flow generative models,” Advances in Neural Information Processing Systems, vol. 35, pp. 16782–16795, 2022.
- T. Karras, M. Aittala, T. Aila, and S. Laine, “Elucidating the design space of diffusion-based generative models,” Advances in Neural Information Processing Systems, vol. 35, pp. 26565–26577, 2022.
- D. P. Kingma and P. Dhariwal, “Glow: Generative flow with invertible 1x1 convolutions,” Advances in neural information processing systems, vol. 31, 2018.
- R. T. Chen, J. Behrmann, D. K. Duvenaud, and J.-H. Jacobsen, “Residual flows for invertible generative modeling,” Advances in Neural Information Processing Systems, vol. 32, 2019.
- Z. Xiao, Q. Yan, and Y. Amit, “Generative latent flow,” arXiv preprint arXiv:1905.10485, 2019.
- M. Grcić, I. Grubišić, and S. Šegvić, “Densely connected normalizing flows,” Advances in Neural Information Processing Systems, vol. 34, pp. 23968–23982, 2021.
- R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in CVPR, pp. 10684–10695, 2022.
- D. Podell, Z. English, K. Lacey, A. Blattmann, T. Dockhorn, J. Müller, J. Penna, and R. Rombach, “SDXL: Improving latent diffusion models for high-resolution image synthesis,” arXiv preprint arXiv:2307.01952, 2023.
Authors: Ling Yang, Zixiang Zhang, Zhilong Zhang, Xingchao Liu, Minkai Xu, Wentao Zhang, Chenlin Meng, Stefano Ermon, Bin Cui