
Local Flow Matching (LFM)

Updated 17 April 2026
  • Local Flow Matching (LFM) is a generative modeling framework that constructs an invertible mapping from simple Gaussian noise to complex data distributions using sequential local transformations.
  • It decomposes the global transformation into smaller sub-models trained via a simulation-free L² regression loss over short intervals, ensuring efficient density estimation.
  • Empirical evaluations on tabular, image, and policy tasks, along with theoretical divergence guarantees, demonstrate LFM’s superior convergence and performance.

Local Flow Matching (LFM) is a generative modeling framework for density estimation that incrementally constructs an invertible mapping from a simple prior (typically Gaussian noise) to a complex data distribution. LFM achieves this by decomposing the global transformation into a sequence of local flow-matching sub-models, each learned with a simulation-free $L^2$ regression loss over short intervals in data-to-noise space. This modular approach enables smaller model sizes per block and faster convergence, and provides theoretical guarantees on the $\chi^2$-divergence, and consequently the KL and total variation distances, between generated and true data distributions. Empirical results demonstrate that LFM matches or exceeds the performance of previous flow matching techniques in sample quality and training efficiency across tabular, image, and policy-learning tasks (Xu et al., 2024).

1. Problem Formulation and Standard Flow Matching

Given i.i.d. samples from an unknown data distribution $p_{\rm data}(x)$ on $\mathbb{R}^d$, the generative modeling task is to estimate a continuous, invertible map

$f: \mathbb{R}^d \to \mathbb{R}^d$

such that for $z \sim p_z(z)$, where $p_z(z)$ is typically the standard Gaussian $\mathcal{N}(0, I_d)$, the transformed variable $x = f(z)$ approximates the data distribution: $x \approx p_{\rm data}$. Alternatively, $f$ is interpreted as the solution map at time $t = 1$ of the ODE

$\dot{x}(t) = v\big(x(t), t\big), \qquad x(0) = z,$

with $v(x, t)$ a vector field to be learned.

In standard Flow Matching (FM), one matches the data and prior distributions in a single step by minimizing

$\min_{\theta}\; \mathbb{E}_{t,\, x_0,\, x_1} \big\| \hat{v}(x_t, t; \theta) - \dot{x}_t \big\|^2,$

where $x_0 \sim p_z$, $x_1 \sim p_{\rm data}$, and $x_t = I_t(x_0, x_1)$ is an analytic interpolation path (e.g., straight line or trigonometric). The solution $\hat{v}$ aligns the velocity field of the model with the analytically defined flow between endpoints. No SDE simulation or continuous-time score matching is required.
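As a concrete illustration, the straight-line instance of this regression can be estimated from endpoint samples alone. The sketch below is a minimal NumPy version under that assumption; `straight_path` and `fm_loss` are illustrative names, and the zero-velocity lambda stands in for a trainable network:

```python
import numpy as np

def straight_path(x0, x1, t):
    """Linear interpolation x_t = (1 - t) * x0 + t * x1; its velocity is x1 - x0."""
    xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1
    return xt, x1 - x0

def fm_loss(model_v, x0, x1, rng):
    """Monte Carlo estimate of the simulation-free FM regression loss."""
    t = rng.uniform(size=len(x0))      # random times in [0, 1]
    xt, vt = straight_path(x0, x1, t)  # points and target velocities on the path
    pred = model_v(xt, t)              # model's predicted velocity at (x_t, t)
    return float(np.mean(np.sum((pred - vt) ** 2, axis=1)))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((256, 2))        # prior samples
x1 = rng.standard_normal((256, 2)) + 3.0  # "data" samples, shifted away from the prior
loss = fm_loss(lambda x, t: np.zeros_like(x), x0, x1, rng)
```

Because the target velocity is available in closed form along the path, each gradient step is an ordinary regression update with no ODE or SDE simulation.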

2. Local Flow Matching (LFM) Framework

LFM refines the standard FM approach by partitioning the transport into $N$ sub-flows over subintervals $0 = t_0 < t_1 < \cdots < t_N$. Each subinterval $[t_{n-1}, t_n]$ involves matching distributions that differ only by a small Ornstein–Uhlenbeck (OU) evolution, making the local tasks easier.

Formally, for block $n \in \{1, \dots, N\}$:

  • $p_{n-1}$ is the push-forward of the original data distribution through the first $n-1$ sub-flows, with $p_0 = p_{\rm data}$.
  • $q_n$ is the marginal at time $t_n$ of the OU process started from $p_{n-1}$ at time $t_{n-1}$.
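Because the OU transition kernel is Gaussian, the local target samples can be drawn in closed form. A minimal sketch, assuming the standard OU process $dX = -X\,dt + \sqrt{2}\,dW$ whose stationary distribution is $\mathcal{N}(0, I_d)$ (`ou_sample` is an illustrative name):

```python
import numpy as np

def ou_sample(x0, dt, rng):
    """Draw X_dt | X_0 = x0 for dX = -X dt + sqrt(2) dW.

    The transition kernel is Gaussian: mean e^{-dt} * x0, variance (1 - e^{-2 dt}) I,
    so no SDE simulation is needed.
    """
    mean = np.exp(-dt) * x0
    std = np.sqrt(1.0 - np.exp(-2.0 * dt))
    return mean + std * rng.standard_normal(x0.shape)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((10000, 2)) + 5.0  # "data" far from the prior
x1 = ou_sample(x0, 0.1, rng)                # short horizon: the easy local target
x_inf = ou_sample(x0, 50.0, rng)            # long horizon: approximately N(0, I)
```

A short `dt` yields a target distribution close to the source, which is what makes each block's matching problem local and easy; as `dt` grows, the samples contract toward the Gaussian prior.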

A sub-model $\hat{v}_n$ is trained via the local FM loss,

$\min_{\theta}\; \mathbb{E}_{t,\, x_0 \sim p_{n-1},\, x_1 \sim q_n} \big\| \hat{v}_n(x_t, t; \theta) - \dot{x}_t \big\|^2, \qquad x_t = I_t^{(n)}(x_0, x_1),$

where $I_t^{(n)}$ denotes the interpolation path for block $n$.

3. Training and Sampling Algorithms

LFM proceeds by incrementally composing the learned sub-models. Each flow block defines an (approximately) invertible mapping:

$T_n: x(t_{n-1}) \mapsto x(t_n), \qquad \dot{x}(t) = \hat{v}_n\big(x(t), t\big), \quad t \in [t_{n-1}, t_n].$

The full map $T_N \circ \cdots \circ T_1$ transports $p_{\rm data}$ to (approximately) the prior $p_z$. Sampling from the generative model inverts these blocks sequentially.

Algorithm 1: Training LFM

  1. For $n = 1$ to $N$:
    1. Sample $x_0 \sim p_{n-1}$ from the current training samples.
    2. Sample $x_1 \sim q_n$ by applying the OU kernel to $x_0$.
    3. Minimize the local FM loss for $\hat{v}_n$ via SGD.
    4. Push forward all training samples via $T_n$ to form $p_n$.

Algorithm 2: Sampling with LFM

  1. Draw $x \sim p_z = \mathcal{N}(0, I_d)$.
  2. For $n = N$ down to $1$, set $x \leftarrow T_n^{-1}(x)$.
  3. Return $x$.

At every stage, the process only requires i.i.d. samples from the current distribution and the OU kernel, ensuring strictly simulation-free and regression-based training.
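A sketch of the block inversion step, under the simplifying assumption that each block's ODE is integrated with a plain Euler scheme, so inversion is just integration backward in time. The field `v` here is a toy stand-in for a trained sub-model, chosen so the exact flow is known:

```python
import numpy as np

def integrate_block(v, x, t0, t1, n_steps=1000):
    """Euler integration of dx/dt = v(x, t) from t0 to t1 (t1 < t0 runs in reverse)."""
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        x = x + dt * v(x, t)
        t += dt
    return x

# Toy "trained" field v(x, t) = -x, whose exact flow over [0, 1] scales x by e^{-1}.
v = lambda x, t: -x

x = np.ones(4)
y = integrate_block(v, x, 0.0, 1.0)      # forward through one block (data -> noise side)
x_rec = integrate_block(v, y, 1.0, 0.0)  # inverse: integrate back from t = 1 to t = 0
```

Running the same integrator with swapped endpoints recovers the input up to discretization error, which is the sense in which each block is "approximately" invertible.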

4. Theoretical Guarantees

Denote by $p_n$ the density after composing $n$ blocks, with $p_0 = p_{\rm data}$. If each block achieves population error $\varepsilon$ over its interval, i.e.,

$\mathbb{E}_{t \in [t_{n-1}, t_n]}\, \mathbb{E}_{x_t} \big\| \hat{v}_n(x_t, t) - v_n(x_t, t) \big\|^2 \le \varepsilon^2,$

and mild regularity holds (Gaussian tails, bounded scores), then by induction and the exponential contraction of the OU semigroup one obtains a per-block recursion of the form

$\chi^2\big(p_n \,\|\, p_z\big) \le e^{-2(t_n - t_{n-1})}\, \chi^2\big(p_{n-1} \,\|\, p_z\big) + O(\varepsilon^2).$

Summing over $N$ blocks and neglecting the vanishing exponential terms,

$\chi^2\big(p_N \,\|\, p_z\big) = O(N \varepsilon^2).$

Invertibility and the data-processing inequality for $f$-divergences imply the same bound holds in the reverse direction for the generated output. Furthermore,

$\mathrm{KL}(p \,\|\, q) \le \log\big(1 + \chi^2(p \,\|\, q)\big) \le \chi^2(p \,\|\, q), \qquad \mathrm{TV}(p, q) \le \sqrt{\tfrac{1}{2}\,\mathrm{KL}(p \,\|\, q)},$

so that KL and TV distances are likewise controlled (Xu et al., 2024).
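These standard comparison inequalities can be sanity-checked numerically. The snippet below (illustrative, not from the paper) verifies them for pairs of unit-variance Gaussians, where all three divergences have closed forms:

```python
from math import erf, exp, log, sqrt

# For p = N(mu, 1) and q = N(0, 1):
#   chi^2(p || q) = e^{mu^2} - 1
#   KL(p || q)    = mu^2 / 2
#   TV(p, q)      = 2 * Phi(mu / 2) - 1 = erf(mu / (2 * sqrt(2)))
for mu in (0.1, 0.5, 1.0, 2.0):
    chi2 = exp(mu**2) - 1.0
    kl = mu**2 / 2.0
    tv = erf(mu / (2.0 * sqrt(2.0)))
    assert kl <= log(1.0 + chi2)   # KL <= log(1 + chi^2)
    assert tv <= sqrt(kl / 2.0)    # Pinsker: TV <= sqrt(KL / 2)
```

So any bound on the $\chi^2$-divergence between generated and true distributions immediately yields bounds in KL and total variation.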

5. Empirical Evaluation

Reported results cover tabular data, 2D toy distributions, image synthesis, and robotic manipulation:

  • Tabular Data: On UCI benchmarks of various dimension, LFM achieves test negative log-likelihood (NLL) among the top two methods throughout; on MINIBOONE, LFM's NLL of ~9.95 is essentially tied with the strongest baselines.
  • Toy Distributions: On 2D "tree" and "rose" benchmarks, LFM attains marginally better NLL (2.24 vs. 2.35) and accurately captures fine visual structure.
  • Unconditional Image Generation: On CIFAR-10 and ImageNet-32, with the same UNet configuration, LFM achieves FID ~8.45 (vs. 10.27) on CIFAR-10 and 7.00 (vs. 8.49) on ImageNet-32, training with roughly one-tenth the steps of InterFlow. For Flowers 128×128, post-distillation LFM attains FID ~71.0 (vs. InterFlow's ~80.0).
  • Robotics: On the Robomimic benchmark (five tasks), LFM matches or slightly outperforms global FM in final success rate and reaches higher early-epoch success (e.g., 0.75 vs. 0.60 at 200 epochs on "Transport").

6. Implementation and Practical Aspects

The LFM framework supports architectural and training optimizations:

  • Parameter Efficiency: Per-block models can be much smaller thanks to the local character of the subproblems; typical UNets total ~200M parameters distributed across the $N$ blocks.
  • Training Efficiency: Training time scales linearly with the number of blocks, but convergence per block is faster due to the reduced subproblem complexity. For instance, LFM on CIFAR-10 uses ~50,000 training batches versus 500,000 for InterFlow.
  • Hyperparameter Choices: Step sizes $t_n - t_{n-1}$ may be uniform or geometric; Adam is the optimizer of choice, with batch sizes of 512–1024.
  • Block Distillation: The $N$-block sequence can be distilled into a smaller number of blocks through least-squares regression on the block maps, enabling further efficiency gains (cf. Liu et al., 2023).
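For the step-size schedule, a small helper illustrating the two choices of block endpoints $t_0 < \cdots < t_N$ mentioned above (the function name and `ratio` parameterization are hypothetical):

```python
def time_grid(n_blocks, horizon=1.0, ratio=1.0):
    """Block endpoints t_0 < ... < t_N over [0, horizon].

    ratio == 1.0 gives uniform steps; ratio > 1.0 gives geometrically growing
    steps, so later blocks cover longer stretches of the OU evolution.
    """
    weights = [ratio**n for n in range(n_blocks)]
    total = sum(weights)
    grid = [0.0]
    for w in weights:
        grid.append(grid[-1] + horizon * w / total)
    return grid

uniform = time_grid(4)               # [0.0, 0.25, 0.5, 0.75, 1.0]
geometric = time_grid(4, ratio=2.0)  # step lengths 1/15, 2/15, 4/15, 8/15
```

A geometric schedule front-loads short, easy blocks near the data and lets the near-Gaussian tail of the evolution be covered by fewer, longer blocks.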

7. Strengths, Limitations, and Extensions

LFM offers several strengths: simulation-free end-to-end training using only $L^2$ regression losses, a modular structure that supports efficient parameter use and fast convergence, and proven $\chi^2$ (hence KL and TV) divergence guarantees.

However, LFM assumes the ability to sample exactly from OU kernels and, in theory, does not account for numerical ODE solver error; approximate kernels may be needed in high dimensions. Potential extensions include weight-sharing for temporal continuity, adaptive step-sizing, mixing in score-based blocks for richer local dynamics, and refinement of the $\chi^2$-divergence bounds for tighter guarantees.

The approach decomposes the difficult global flow-matching challenge into local problems, each solvable via plain regression, and then stitches the solutions invertibly, providing competitive or superior generative performance with clean convergence bounds (Xu et al., 2024).
