Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment (2506.02221v1)

Published 2 Jun 2025 in cs.CV and cs.LG

Abstract: Diffusion models have revolutionized generative tasks through high-fidelity outputs, yet flow matching (FM) offers faster inference and empirical performance gains. However, current foundation FM models are computationally prohibitive for finetuning, while diffusion models like Stable Diffusion benefit from efficient architectures and ecosystem support. This work addresses the critical challenge of efficiently transferring knowledge from pre-trained diffusion models to flow matching. We propose Diff2Flow, a novel framework that systematically bridges diffusion and FM paradigms by rescaling timesteps, aligning interpolants, and deriving FM-compatible velocity fields from diffusion predictions. This alignment enables direct and efficient FM finetuning of diffusion priors with no extra computation overhead. Our experiments demonstrate that Diff2Flow outperforms naïve FM and diffusion finetuning particularly under parameter-efficient constraints, while achieving superior or competitive performance across diverse downstream tasks compared to state-of-the-art methods. We will release our code at https://github.com/CompVis/diff2flow.

Summary

  • The paper introduces Diff2Flow, a framework that bridges diffusion and flow matching models through innovative alignment and reparameterization techniques.
  • It employs trajectory alignment and Low-Rank Adaptation (LoRA) to reduce computational overhead while maintaining high model performance.
  • Experimental results demonstrate competitive performance in tasks like text-to-image synthesis and monocular depth estimation with faster inference speeds.

Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

The paper "Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment" presents a framework that bridges diffusion models and flow matching models. Diffusion models are widely recognized for producing high-fidelity generative outputs, and models like Stable Diffusion benefit from efficient architectures and broad ecosystem support. Flow matching models, on the other hand, offer faster inference and empirical performance gains, but current foundation FM models are prohibitively expensive to finetune due to their large size.

Overview of Diff2Flow Framework

Diff2Flow addresses the challenge of transferring knowledge from pre-trained diffusion models to flow matching models. It proposes a systematic alignment of timesteps, interpolants, and velocity fields, which enables direct and efficient FM finetuning of diffusion priors with no extra computational overhead.
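To make the alignment concrete, here is a minimal NumPy sketch of the timestep and interpolant rescaling idea, assuming a toy variance-preserving schedule; the schedule and function names are illustrative and are not taken from the paper's actual implementation:

```python
import numpy as np

def vp_schedule(t_diff):
    """Toy variance-preserving schedule with alpha^2 + sigma^2 = 1.
    (Illustrative only; real models use a learned/discretized schedule.)"""
    alpha = np.cos(0.5 * np.pi * t_diff)
    sigma = np.sin(0.5 * np.pi * t_diff)
    return alpha, sigma

def diffusion_to_fm(x_t_diff, t_diff):
    """Map a diffusion-space sample x_t = alpha*x0 + sigma*eps onto the
    flow matching interpolant x_t = (1-t)*x0 + t*eps by rescaling both
    the sample and the timestep (a sketch of the alignment idea)."""
    alpha, sigma = vp_schedule(t_diff)
    t_fm = sigma / (alpha + sigma)        # rescaled FM time in [0, 1]
    x_t_fm = x_t_diff / (alpha + sigma)   # rescaled interpolant
    return x_t_fm, t_fm

# Check: the mapped sample coincides with the FM interpolant built
# from the same (x0, eps) pair at the rescaled time.
rng = np.random.default_rng(0)
x0, eps = rng.normal(size=4), rng.normal(size=4)
alpha, sigma = vp_schedule(0.3)
x_t_fm, t_fm = diffusion_to_fm(alpha * x0 + sigma * eps, 0.3)
assert np.allclose(x_t_fm, (1 - t_fm) * x0 + t_fm * eps)
```

With this mapping, a point on the diffusion trajectory lands exactly on the FM straight-line interpolant at the rescaled time, which is what allows a diffusion prior to be finetuned directly with an FM objective.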

Key Methodologies

The paper introduces several key methodologies within the Diff2Flow framework:

  1. Trajectory Alignment: The framework mathematically aligns the trajectories of diffusion and flow matching models so that the transition between the two paradigms is seamless and computationally efficient. This is achieved by rescaling the timesteps and the interpolants that connect data and noise in each formulation.
  2. Objective Change: Diff2Flow reparameterizes the diffusion model's prediction to make it compatible with the flow matching objective, deriving the velocity that FM training requires directly from the diffusion output. Because this is an algebraic conversion, it adds no extra computational overhead.
  3. Parameter-efficient Finetuning: Using techniques such as Low-Rank Adaptation (LoRA), Diff2Flow substantially reduces the computational resources required for training while maintaining strong performance. This is particularly beneficial in resource-constrained environments.

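The objective change in step 2 can be sketched in a few lines of NumPy: recover the clean-data estimate from the model's noise prediction, then combine the two into an FM-style velocity. This sketch assumes an ε-parameterized diffusion model; the function name is hypothetical:

```python
import numpy as np

def eps_to_fm_velocity(x_t, eps_hat, alpha, sigma):
    """Turn a diffusion eps-prediction into an FM-compatible velocity.

    From x_t = alpha * x0 + sigma * eps we recover an estimate of x0,
    then form the flow matching velocity u = eps - x0 for the
    straight-line interpolant x_t = (1-t)*x0 + t*eps
    (data at t = 0, noise at t = 1).
    """
    x0_hat = (x_t - sigma * eps_hat) / alpha
    return eps_hat - x0_hat

# Sanity check: with the exact noise, the derived velocity is eps - x0.
rng = np.random.default_rng(1)
x0, eps = rng.normal(size=4), rng.normal(size=4)
alpha, sigma = 0.8, 0.6  # a valid VP pair: 0.64 + 0.36 = 1
u = eps_to_fm_velocity(alpha * x0 + sigma * eps, eps, alpha, sigma)
assert np.allclose(u, eps - x0)
```

Because this is pure algebra on quantities the diffusion model already produces, the conversion requires no additional forward passes, which is why the finetuning carries no extra compute cost.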
Experimental Results and Implications

Throughout the paper, Diff2Flow demonstrates its capability across a range of tasks, including text-to-image synthesis and monocular depth estimation. Notably, it achieves superior or competitive performance compared to existing methods, even under strict computational constraints. The framework also supports reflow, which straightens sampling trajectories and enables faster inference with fewer sampling steps. In-depth experiments show successful adaptation of diffusion priors to different tasks while minimizing finetuning overhead.
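Why straightened trajectories permit fewer sampling steps can be seen from a toy NumPy sketch of Euler integration; the velocity field below is a hand-built stand-in for a trained model, not the paper's sampler:

```python
import numpy as np

def euler_sample(velocity_fn, x1, n_steps):
    """Integrate dx/dt = v(x, t) from t = 1 (noise) back to t = 0 (data).

    The straighter the learned trajectory, the fewer steps this needs;
    a perfectly straight (reflowed) field is exact even with one step.
    """
    x, dt = x1, 1.0 / n_steps
    for i in range(n_steps):
        t = 1.0 - i * dt
        x = x - dt * velocity_fn(x, t)
    return x

# A perfectly straight field: v = eps - x0 everywhere along the path.
rng = np.random.default_rng(2)
x0, eps = rng.normal(size=4), rng.normal(size=4)
straight_v = lambda x, t: eps - x0
# One Euler step already recovers the data endpoint exactly.
assert np.allclose(euler_sample(straight_v, eps, n_steps=1), x0)
```

For a curved (non-reflowed) field the Euler error shrinks only as the step count grows, which is precisely the inference cost that reflow amortizes away.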

The implications of this research are considerable, paving the way for more efficient generative models that retain the strengths of diffusion models but incorporate the inferential advantages of flow matching models. The approach promises exciting developments in AI, particularly in domains requiring rapid generation of high-quality samples under computational constraints.

Conclusions and Future Directions

The authors effectively explore the intersection of diffusion and flow matching generative paradigms, contributing valuable insights that extend beyond the current capabilities of generative modeling. By providing a systematic framework for the alignment of these models, Diff2Flow sets a precedent for future research that seeks to optimize generative tasks while minimizing resource expenditure.

Looking forward, the exploration of more complex parameterizations, as well as the application of Diff2Flow to additional modalities (e.g., video, audio), represents promising future directions. The release of code from this paper will further catalyze research efforts in this domain, encouraging the development of more streamlined and efficient AI systems.

The results presented validate Diff2Flow's approach and highlight its potential as a foundational tool, not only for efficient generative tasks but also for a broader application of AI technologies where synthesis speed and quality are paramount.
