On the Wasserstein Convergence and Straightness of Rectified Flow (2410.14949v6)
Abstract: Diffusion models have emerged as a powerful tool for image generation and denoising. Typically, generative models learn a trajectory between the starting noise distribution and the target data distribution. Recently, Liu et al. (2023b) proposed Rectified Flow (RF), a generative model that aims to learn straight flow trajectories from noise to data using a sequence of convex optimization problems with close ties to optimal transport. If the trajectory is curved, one must use many Euler discretization steps or novel strategies, such as exponential integrators, to achieve a satisfactory generation quality. In contrast, RF has been shown to theoretically straighten the trajectory through successive rectifications, reducing the number of function evaluations (NFEs) while sampling. It has also been shown empirically that RF may improve the straightness in two rectifications if one can solve the underlying optimization problem within a sufficiently small error. In this paper, we make two contributions. First, we provide a theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution. Our error rate is characterized by the number of discretization steps and a novel formulation of straightness stronger than that in the original work. Second, we present general conditions guaranteeing uniqueness and straightness of 1-RF, which is in line with previous empirical findings. As a byproduct of our analysis, we show that, in one dimension, RF started at the standard Gaussian distribution yields the Monge map. We also present empirical results on both simulated and real datasets to validate our theoretical findings. The code is available at https://github.com/bansal-vansh/rectified-flow.
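To make the link between straightness and the number of Euler steps concrete, here is a minimal one-dimensional sketch. It is not taken from the paper or its repository. It assumes a hypothetical Gaussian target N(M, S^2), for which the velocity fields can be written in closed form instead of being learned. All names (`M`, `S`, `v_independent`, `v_straight`, `euler`, `max_err`) are illustrative. `v_independent` is the conditional-expectation velocity under an independent coupling of noise and data. Its exact flow sends x0 to M + S * x0, which is the Monge map for this pair, but its trajectories are curved. `v_straight` is the velocity obtained when the endpoints are coupled through that map, so each trajectory is a straight line.

```python
import numpy as np

M, S = 2.0, 0.5  # hypothetical target N(M, S^2); the source is N(0, 1)

def monge_map(x0):
    """1D optimal transport map from N(0, 1) to N(M, S^2)."""
    return M + S * x0

def v_independent(x, t):
    """E[X1 - X0 | X_t = x] for independent X0 ~ N(0, 1), X1 ~ N(M, S^2)."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2   # Var(X_t), X_t = (1 - t) X0 + t X1
    cov = t * S ** 2 - (1.0 - t)            # Cov(X1 - X0, X_t)
    return M + cov / var_t * (x - t * M)

def v_straight(x, t):
    """Velocity when X1 = monge_map(X0); constant along each trajectory."""
    x0 = (x - t * M) / (1.0 - t + t * S)    # invert X_t = (1 - t) X0 + t X1
    return monge_map(x0) - x0

def euler(v, x0, n_steps):
    """Explicit Euler integration of dx/dt = v(x, t) from t = 0 to t = 1."""
    x = np.array(x0, dtype=float)
    h = 1.0 / n_steps
    for k in range(n_steps):
        x = x + h * v(x, k * h)
    return x

x0 = np.linspace(-3.0, 3.0, 7)              # a few starting points (noise side)
target = monge_map(x0)

def max_err(v, n_steps):
    """Largest deviation of the Euler endpoint from the Monge map."""
    return float(np.max(np.abs(euler(v, x0, n_steps) - target)))

for n in (1, 10, 1000):
    print(n, max_err(v_independent, n), max_err(v_straight, n))
```

In this toy setting the straight field reaches the target in a single function evaluation up to floating-point error, whereas the curved field needs many steps before the Euler endpoint approaches the same map. This only illustrates the qualitative behavior described in the abstract. It says nothing about the paper's actual error rates.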