Flow Matching Guide and Code

Published 9 Dec 2024 in cs.LG (arXiv:2412.06264v1)

Abstract: Flow Matching (FM) is a recent framework for generative modeling that has achieved state-of-the-art performance across various domains, including image, video, audio, speech, and biological structures. This guide offers a comprehensive and self-contained review of FM, covering its mathematical foundations, design choices, and extensions. By also providing a PyTorch package featuring relevant examples (e.g., image and text generation), this work aims to serve as a resource for both novice and experienced researchers interested in understanding, applying and further developing FM.

Summary

  • The paper introduces a comprehensive Flow Matching framework that trains generative models by learning time-dependent velocity fields to map between probability distributions.
  • It proposes the Conditional Flow Matching loss, leveraging conditional expectations and Bregman divergences, to enable practical and tractable training in PyTorch.
  • The work explores extensions to conditional generation, affine flows linked to optimal transport, and adaptations to Riemannian manifolds, enhancing sample quality and computational efficiency.

This paper introduces a comprehensive guide and accompanying PyTorch package for Flow Matching (FM), a generative modeling framework achieving SOTA performance across various domains. The work aims to provide a self-contained review of FM, covering its mathematical foundations, design choices, and extensions, while also enabling newcomers to quickly adopt and build upon FM for their own applications.

The paper starts by reviewing the mathematical background, introducing concepts such as random vectors, conditional densities and expectations, diffeomorphisms, and push-forward maps. It then defines flows as time-dependent mappings and discusses their equivalence to velocity fields through Ordinary Differential Equations (ODEs). The key result here is that a C^r flow is uniquely defined by a C^r velocity field, and vice versa. Numerical methods for solving ODEs, such as the Euler method and the midpoint method, are introduced as ways to compute target samples from source samples. Probability paths and the Continuity Equation, which links velocity fields to probability paths, are then discussed. The Instantaneous Change of Variables formula is presented, which enables tractable computation of exact likelihoods for flow models. Finally, the section concludes with a discussion of training flow models via simulation, highlighting the computational burdens that Flow Matching aims to alleviate.
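As a minimal sketch of the sampling step described above (not the paper's package), the Euler method advances a sample from t = 0 to t = 1 along a velocity field. The `toy_velocity` field below is a hypothetical example chosen so that the exact trajectory is the straight line from the source point to the target value 2.0:

```python
def euler_sample(velocity, x0, n_steps=100):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x, dt = x0, 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x = x + dt * velocity(x, t)
    return x

# Hypothetical velocity field: u_t(x) = (x1 - x) / (1 - t) transports any
# starting point along a straight line to x1 = 2.0 at time t = 1.
def toy_velocity(x, t):
    return (2.0 - x) / (1.0 - t)

x1 = euler_sample(toy_velocity, 0.0, n_steps=1000)  # ≈ 2.0
```

The midpoint method mentioned in the paper is obtained the same way, evaluating the velocity at a half-step to gain second-order accuracy.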

The paper describes the FM framework as a method for training a flow model by solving the Flow Matching Problem: finding a velocity field u_t^θ that generates a probability path p_t from a source distribution p to a target distribution q. The method involves designing a probability path p_t, learning a velocity field u_t^θ that generates p_t, and sampling from the learned model by solving an ODE driven by u_t^θ. The FM loss minimizes the difference between the target velocity field u_t and the learned velocity field u_t^θ.

The paper introduces the concept of conditional probability paths p_t(·|Z) and conditional velocity fields u_t(·|Z), where Z is an arbitrary conditioning random variable. The marginal probability path p_t is then constructed by integrating the conditional probability paths over Z, and the marginal velocity field u_t is defined as the conditional expectation of the conditional velocity given X_t = x. The Marginalization Trick is presented: if u_t(·|z) generates p_t(·|z) for each z, then, under certain regularity conditions, the marginal velocity field u_t generates the marginal probability path p_t.
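In symbols, writing p_t(·|z) for the conditional path, u_t(·|z) for the conditional velocity, and Z for the conditioning variable, the marginal quantities take the standard form:

```latex
p_t(x) = \int p_t(x \mid z)\, p_Z(z)\, \mathrm{d}z,
\qquad
u_t(x) = \mathbb{E}\!\left[\, u_t(X_t \mid Z) \,\middle|\, X_t = x \,\right]
       = \int u_t(x \mid z)\, p_{Z \mid t}(z \mid x)\, \mathrm{d}z,
```

where p_{Z|t}(z|x) denotes the posterior of Z given X_t = x. The second integral is exactly what makes u_t intractable to evaluate directly, motivating the conditional loss below.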

To address the intractability of computing the marginal target velocity u_t, the paper introduces the Conditional Flow Matching (CFM) loss, which replaces u_t with the conditional velocity u_t(·|Z) in the loss function. It is shown that the gradients of the FM and CFM losses coincide, making the CFM loss a practical alternative for training. The paper highlights that this result is a particular instance of a more general result on learning conditional expectations with Bregman divergences.
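A minimal sketch of the CFM loss for the linear conditional path x_t = (1 − t)·x0 + t·x1, whose conditional velocity is x1 − x0. The "model" here is a hypothetical stand-in for a neural network; the toy coupling x1 = x0 + 3 is an assumption chosen so the exact velocity field is the constant 3, making the loss easy to check:

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_loss(model, x0, x1, t):
    """Conditional Flow Matching loss for the linear path
    x_t = (1 - t) * x0 + t * x1, whose conditional velocity is x1 - x0."""
    xt = (1.0 - t) * x0 + t * x1
    target = x1 - x0
    pred = model(xt, t)
    return np.mean((pred - target) ** 2)

# Toy paired coupling: each target sample is its source sample shifted by 3,
# so the true conditional (and marginal) velocity is the constant 3.
x0 = rng.normal(size=(256, 1))
x1 = x0 + 3.0
t = rng.uniform(size=(256, 1))

perfect = lambda x, t: np.full_like(x, 3.0)  # the exact velocity field here
loss = cfm_loss(perfect, x0, x1, t)          # ≈ 0 for the exact field
```

In actual training, `model` would be a network whose parameters are updated by gradient descent on this loss; by the result above, those gradients agree with the gradients of the intractable FM loss.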

The paper describes how conditional generation can be achieved with conditional flows: a conditional flow model is defined via a conditional flow ψ_t(x|z) satisfying certain boundary conditions. The conditional probability path p_t(·|z) is then obtained by pushing forward the source distribution through ψ_t(·|z), and the conditional velocity field u_t(·|z) is derived from ψ_t.
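In push-forward notation, with ψ_t(·|z) the conditional flow and p the source distribution, this construction reads:

```latex
p_t(\cdot \mid z) = \big[\psi_t(\cdot \mid z)\big]_{\#}\, p,
\qquad
u_t(x \mid z) = \dot{\psi}_t\!\big(\psi_t^{-1}(x \mid z) \,\big|\, z\big),
```

that is, the velocity at x is the time derivative of the flow evaluated at the point that ψ_t maps to x.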

The paper discusses different conditioning choices, such as target samples (Z = X_1), source samples (Z = X_0), or two-sided conditioning (Z = (X_0, X_1)), and shows that when the conditional flows are diffeomorphisms, all of these constructions are equivalent. It also provides a recipe for building such a path from an interpolant satisfying certain conditions.

The paper explores the connection to Optimal Transport (OT) and introduces the linear conditional flow ψ_t(x|x_1) = t·x_1 + (1 − t)·x as the minimizer of a bound on the kinetic energy. The linear conditional flow is a special case of affine conditional flows ψ_t(x|x_1) = α_t·x_1 + σ_t·x, where α_t and σ_t are scheduler functions. It is shown that for affine flows with an independent coupling and a smooth, strictly positive source density, the marginal velocity field generates a probability path interpolating between the source and target distributions. The paper also examines velocity parameterizations, namely x_0-prediction and x_1-prediction, and derives conversion formulas between them. Finally, it shows how an affine conditional flow model trained with one scheduler can be adapted to a different scheduler post-training.
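A short sketch of the affine construction, assuming the scheduler notation above: the conditional path is x_t = α_t·x1 + σ_t·x0 and its conditional velocity is the time derivative α'_t·x1 + σ'_t·x0. With the linear ("conditional OT") scheduler α_t = t, σ_t = 1 − t, this velocity reduces to x1 − x0:

```python
def affine_path(alpha, sigma, x0, x1, t):
    """Affine conditional flow: x_t = alpha(t) * x1 + sigma(t) * x0."""
    return alpha(t) * x1 + sigma(t) * x0

def affine_velocity(d_alpha, d_sigma, x0, x1, t):
    """Its conditional velocity: d/dt x_t = alpha'(t) * x1 + sigma'(t) * x0."""
    return d_alpha(t) * x1 + d_sigma(t) * x0

# Linear scheduler: alpha_t = t, sigma_t = 1 - t.
alpha, d_alpha = (lambda t: t), (lambda t: 1.0)
sigma, d_sigma = (lambda t: 1.0 - t), (lambda t: -1.0)

x0, x1, t = 0.5, 2.0, 0.3
v = affine_velocity(d_alpha, d_sigma, x0, x1, t)   # x1 - x0 = 1.5

# Sanity check: the velocity should match a finite difference of the path.
h = 1e-6
fd = (affine_path(alpha, sigma, x0, x1, t + h)
      - affine_path(alpha, sigma, x0, x1, t - h)) / (2 * h)
```

Swapping in a different (α_t, σ_t) pair, e.g. a trigonometric scheduler, changes only the two lambdas; this separation is what makes the post-training scheduler changes mentioned above possible.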

The paper discusses Gaussian paths, a popular choice of affine probability path, and derives the score function of the conditional path. It also explores data couplings, including paired data and multisample couplings. For paired data, it proposes learning a bridge or flow model with data-dependent couplings, where the joint distribution of source and target samples is constructed from the reverse dependency p(x_0|x_1). For multisample couplings, it describes how to construct non-trivial joints between the source and target distributions that reduce the transport cost and induce straighter trajectories.
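One simple way to realize a multisample coupling, sketched here with a brute-force search over permutations (a stand-in, not the paper's method; practical implementations use an assignment solver such as the Hungarian algorithm): within a minibatch, re-pair source and target samples so as to minimize the total squared transport cost.

```python
from itertools import permutations

def minibatch_coupling(x0s, x1s):
    """Return x1s reordered by the permutation minimizing the total squared
    transport cost sum_i (x1s[perm[i]] - x0s[i])**2.
    Brute force over all permutations: feasible only for tiny batches;
    real code would call an assignment solver instead."""
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(x1s))):
        cost = sum((x1s[j] - x0s[i]) ** 2 for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return [x1s[j] for j in best]

x0s = [0.0, 1.0, 2.0]
x1s = [2.1, 0.1, 1.1]                   # a shuffled, slightly shifted copy
paired = minibatch_coupling(x0s, x1s)   # → [0.1, 1.1, 2.1]
```

Training on pairs drawn from this coupling, rather than an independent one, is what shortens and straightens the conditional paths.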

The paper then turns to conditional generation and guidance techniques, whose goal is to train a generative model under a guiding signal so as to control the produced samples. It presents conditional models, in which the model learns to sample from the conditional distribution p(x|y), where y is a label or guidance variable. It also discusses classifier guidance, where an unconditional model is steered by a time-dependent classifier, and classifier-free guidance, where the conditional and unconditional velocities are learned simultaneously by the same model.
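At sampling time, classifier-free guidance is commonly applied by linearly combining the two predictions with a guidance weight w. The sketch below uses hypothetical constant velocity fields purely to illustrate the combination rule:

```python
def cfg_velocity(u_cond, u_uncond, x, t, y, w=2.0):
    """Classifier-free guided velocity: u = u_uncond + w * (u_cond - u_uncond).
    w = 0 ignores the condition, w = 1 recovers the plain conditional model,
    and w > 1 over-emphasizes the guidance signal."""
    uu = u_uncond(x, t)
    uc = u_cond(x, t, y)
    return uu + w * (uc - uu)

# Hypothetical 1-D velocity fields (stand-ins for one network queried with
# and without the conditioning label y):
u_uncond = lambda x, t: 1.0
u_cond = lambda x, t, y: 3.0
guided = cfg_velocity(u_cond, u_uncond, x=0.0, t=0.5, y=1, w=2.0)  # 1 + 2*(3-1) = 5.0
```

In practice both predictions come from a single model trained with the label randomly dropped, so no separate classifier is needed.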

Finally, the paper extends Flow Matching to Riemannian manifolds, generalizing the FM framework to non-Euclidean spaces that arise when modeling various types of data, such as geometric or molecular structures.
