This paper introduces a comprehensive guide and accompanying PyTorch package for Flow Matching (FM), a generative modeling framework that achieves state-of-the-art performance across a range of domains. The work aims to provide a self-contained review of FM, covering its mathematical foundations, design choices, and extensions, while also enabling newcomers to quickly adopt and build upon FM for their own applications.
The paper starts by reviewing the mathematical background, introducing concepts such as random vectors, conditional densities and expectations, diffeomorphisms, and push-forward maps. It then defines flows as time-dependent mappings and discusses their equivalence to velocity fields through Ordinary Differential Equations (ODEs). The key result here is that a $C^r$ flow is uniquely defined by a $C^r$ velocity field, and vice versa. Numerical methods for solving ODEs, such as the Euler method and the midpoint method, are introduced as ways to compute target samples from source samples. The concept of probability paths and the Continuity Equation are discussed, linking velocity fields to probability paths. The Instantaneous Change of Variables formula is presented, which enables tractable computation of exact likelihoods for flow models. Finally, the section concludes with training flow models by simulation, highlighting the computational burden that Flow Matching aims to alleviate.
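To make the sampling procedure concrete, here is a minimal sketch of the two solvers applied to a learned velocity field; the `velocity(x, t)` callable signature and the uniform step count are assumptions for illustration, not the package's API.

```python
import torch

def euler_sample(velocity, x0, n_steps=100):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with the Euler method."""
    x, h = x0, 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0],), i * h, device=x.device)
        x = x + h * velocity(x, t)  # Euler step: x_{t+h} = x_t + h * u_t(x_t)
    return x

def midpoint_sample(velocity, x0, n_steps=100):
    """Second-order midpoint method: evaluate the velocity at the half step."""
    x, h = x0, 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0],), i * h, device=x.device)
        x_mid = x + 0.5 * h * velocity(x, t)      # half Euler step
        x = x + h * velocity(x_mid, t + 0.5 * h)  # full step with the midpoint slope
    return x
```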
The paper describes the FM framework as a method for training a flow model by solving the Flow Matching Problem: finding a velocity field $u_t^\theta$ that generates a probability path $p_t$ from a source distribution $p$ to a target distribution $q$. The method involves designing a probability path $p_t$, learning a velocity field $u_t^\theta$ to generate $p_t$, and sampling from the learned model by solving an ODE with $u_t^\theta$. The FM loss minimizes the discrepancy between the target velocity field $u_t$ and the learned velocity field $u_t^\theta$.
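In the paper's notation, the FM objective is a regression of the learned velocity onto the target velocity along the path (with the squared Euclidean norm as the standard choice of discrepancy):

$$\mathcal{L}_{\mathrm{FM}}(\theta) \;=\; \mathbb{E}_{t \sim \mathcal{U}[0,1],\, X_t \sim p_t}\, \big\| u_t^{\theta}(X_t) - u_t(X_t) \big\|^2 .$$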
The paper introduces the concept of conditional probability paths $p_{t|Z}(x|z)$ and conditional velocity fields $u_t(x|z)$, where $Z$ is an arbitrary conditioning random variable. The marginal probability path $p_t(x)$ is then constructed by integrating the conditional probability paths over $Z$, and the marginal velocity field $u_t(x)$ is defined as the conditional expectation of $u_t(X_t|Z)$ given $X_t = x$. The Marginalization Trick is presented, which states that if $u_t(x|z)$ generates $p_t(x|z)$, then the marginal velocity field $u_t(x)$ generates the marginal probability path $p_t(x)$ under certain regularity conditions.
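Concretely, the two marginal quantities are

$$p_t(x) = \int p_{t|Z}(x|z)\, p_Z(z)\, \mathrm{d}z, \qquad u_t(x) = \mathbb{E}\big[\, u_t(X_t|Z) \,\big|\, X_t = x \,\big],$$

so the generally intractable marginal velocity is a conditional average of tractable conditional velocities.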
To address the intractability of computing the target velocity $u_t$, the paper introduces the Conditional Flow Matching (CFM) loss, which replaces $u_t(x)$ with the conditional velocity $u_t(x|Z)$ in the loss function. It is shown that the gradients of the FM and CFM losses coincide, making the CFM loss a practical alternative for training. The paper highlights that this result is a particular instance of a more general result utilizing Bregman divergences for learning conditional expectations.
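The resulting objective replaces the intractable marginal velocity with its conditional counterpart, while leaving the gradients unchanged:

$$\mathcal{L}_{\mathrm{CFM}}(\theta) \;=\; \mathbb{E}_{t,\, Z,\, X_t \sim p_{t|Z}}\, \big\| u_t^{\theta}(X_t) - u_t(X_t|Z) \big\|^2, \qquad \nabla_\theta \mathcal{L}_{\mathrm{CFM}} = \nabla_\theta \mathcal{L}_{\mathrm{FM}}.$$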
The paper describes how conditional generation can be achieved with conditional flows, where a conditional flow model $X_{t|1} = \psi_t(X_0|x_1)$ is defined with a conditional flow $\psi_t$ satisfying certain boundary conditions. The conditional probability path $p_{t|1}(x|x_1)$ is then obtained by pushing forward the source distribution through $\psi_t$, and the conditional velocity field $u_t(x|x_1)$ is derived from $\psi_t$.
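In symbols, the boundary conditions and the velocity induced by the conditional flow read

$$\psi_0(x|x_1) = x, \qquad \psi_1(x|x_1) = x_1, \qquad u_t(x|x_1) = \dot{\psi}_t\big(\psi_t^{-1}(x|x_1)\,\big|\,x_1\big),$$

so that source samples $X_0 \sim p$ are transported exactly to $x_1$ at time $t = 1$.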
The paper discusses different conditioning choices, such as target samples ($Z = X_1$), source samples ($Z = X_0$), or two-sided conditioning ($Z = (X_0, X_1)$), and shows that, when the conditional flows are diffeomorphisms, all of these constructions are equivalent. It also provides a recipe for building such a path from an interpolant satisfying certain conditions.
The paper explores the connection to Optimal Transport (OT) and introduces the linear conditional flow $\psi_t(x|x_1) = t x_1 + (1-t)x$ as a minimizer of a bound on the Kinetic Energy. The linear conditional flow is a special case of affine conditional flows $\psi_t(x|x_1) = \alpha_t x_1 + \sigma_t x$, where $\alpha_t$ and $\sigma_t$ are scheduler functions. It is shown that, for affine flows with an independent coupling and a smooth, strictly positive source density, the marginal velocity field generates a probability path interpolating between the source and target distributions. The paper explores velocity parameterizations, namely $x_1$-prediction and $x_0$-prediction, and derives conversion formulas between these parameterizations. It is also shown how an affine conditional flow model trained with a specific scheduler can be adapted to a different scheduler post-training.
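As a sketch of how CFM training with the linear path looks in practice (the `model(x_t, t)` signature and tensor shapes are illustrative assumptions, not the accompanying package's API):

```python
import torch

def cfm_loss(model, x0, x1):
    """Conditional Flow Matching loss for the linear path
    psi_t(x|x1) = t*x1 + (1-t)*x, whose conditional velocity is x1 - x0."""
    t = torch.rand(x0.shape[0], device=x0.device)  # t ~ U[0, 1]
    t_ = t.view(-1, *([1] * (x0.dim() - 1)))       # broadcast over feature dims
    xt = t_ * x1 + (1.0 - t_) * x0                 # sample X_t on the path
    target = x1 - x0                               # u_t(X_t | X_0, X_1)
    return ((model(xt, t) - target) ** 2).mean()   # squared-error regression
```

With an independent coupling, `x0` is drawn from the source (e.g., a standard Gaussian) and `x1` from the data.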
The paper discusses Gaussian paths, a popular choice of affine probability paths, and derives the score function of the conditional path. It also explores data couplings, including paired data and multisample couplings. For paired data, it is proposed to learn a bridge or flow model with data-dependent couplings, where the joint distribution of source and target samples is constructed from the reverse dependency $\pi_{0|1}(x_0|x_1)$. For multisample couplings, it describes how to construct non-trivial joint distributions between source and target to reduce the transport cost and induce straighter trajectories.
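A common instantiation of multisample couplings (a sketch of the general idea, not necessarily the paper's exact recipe) re-pairs each minibatch by solving an optimal assignment problem on the pairwise transport cost:

```python
import torch
from scipy.optimize import linear_sum_assignment

def ot_minibatch_coupling(x0, x1):
    """Re-pair a source/target minibatch via an optimal assignment on
    squared Euclidean cost, reducing transport cost within the batch."""
    cost = torch.cdist(x0.flatten(1), x1.flatten(1)) ** 2  # (B, B) pairwise costs
    row, col = linear_sum_assignment(cost.cpu().numpy())   # Hungarian matching
    return x0[torch.as_tensor(row)], x1[torch.as_tensor(col)]
```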
The paper discusses conditional generation and guidance techniques, whose goal is to train a generative model under a guiding signal so as to further control the produced samples. It presents conditional models, where the model learns to sample from the conditional distribution $q(x_1|y)$, with $y$ a label or guidance variable. It also discusses classifier guidance, where an unconditional model is guided by a time-dependent classifier, and classifier-free guidance, where the conditional and unconditional scores are learned simultaneously using the same model.
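At sampling time, classifier-free guidance blends the two predictions; the convention below (extrapolating from the unconditional toward the conditional velocity with weight `w`) is one common choice, and the `model(x, t, y)` signature and null label are assumptions for illustration:

```python
def guided_velocity(model, x, t, y, null_y, w=2.0):
    """Classifier-free guidance at inference: extrapolate from the
    unconditional velocity toward the conditional one with weight w."""
    u_cond = model(x, t, y)         # velocity conditioned on the label y
    u_uncond = model(x, t, null_y)  # unconditional velocity (label dropped)
    return u_uncond + w * (u_cond - u_uncond)  # w = 1 recovers the conditional model
```

During training, the label is randomly replaced by `null_y` so that the same network learns both the conditional and the unconditional prediction.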
Finally, the paper extends Flow Matching to Riemannian manifolds, generalizing the FM framework to non-Euclidean spaces, which are useful for modeling data with inherent geometric structure.