Local Flow Matching (LFM)
- Local Flow Matching (LFM) is a generative modeling framework that constructs an invertible mapping from simple Gaussian noise to complex data distributions using sequential local transformations.
- It decomposes the global transformation into smaller sub-models trained via a simulation-free L² regression loss over short intervals, ensuring efficient density estimation.
- Empirical evaluations on tabular, image, and policy-learning tasks, together with theoretical divergence guarantees, demonstrate LFM's efficient convergence and competitive or superior performance.
Local Flow Matching (LFM) is a generative modeling framework for density estimation that incrementally constructs an invertible mapping from a simple prior (typically Gaussian noise) to a complex data distribution. LFM achieves this by decomposing the global transformation into a sequence of local flow-matching sub-models, each learned with a simulation-free L² regression loss over short intervals in data-to-noise space. This modular approach enables smaller model sizes per block and faster convergence, and provides theoretical guarantees on the χ²-divergence (and consequently the KL and total variation distances) between generated and true data distributions. Empirical results demonstrate that LFM matches or exceeds the performance of previous flow matching techniques in sample quality and training efficiency across tabular, image, and policy-learning tasks (Xu et al., 2024).
1. Problem Formulation and Standard Flow Matching
Given i.i.d. samples from an unknown data distribution $P$ on $\mathbb{R}^d$, the generative modeling task is to estimate a continuous, invertible map $T : \mathbb{R}^d \to \mathbb{R}^d$ such that for $Z \sim p_Z$, where $p_Z$ is typically the standard Gaussian $\mathcal{N}(0, I_d)$, the transformed variable approximates the data distribution: $T(Z) \approx X \sim P$ in distribution. Alternatively, $T$ is interpreted as the solution map at time 1 for the ODE

$$\dot{x}(t) = v(x(t), t), \qquad x(0) = Z, \qquad t \in [0, 1],$$

with $v(x, t)$ a vector field to be learned.
In standard Flow Matching (FM), one matches the data and prior distributions in a single step by minimizing

$$\mathcal{L}_{\mathrm{FM}}(\theta) = \mathbb{E}_{t \sim \mathrm{Unif}[0,1],\; x_0 \sim P,\; x_1 \sim \mathcal{N}(0, I_d)} \big\| v_\theta(x_t, t) - \partial_t x_t \big\|^2,$$

where $x_0$ is a data sample, $x_1$ is a prior sample, and $x_t = I_t(x_0, x_1)$ is an analytic interpolation path (e.g., the straight line $x_t = (1-t)\,x_0 + t\,x_1$, or a trigonometric interpolant). The minimizer $v_\theta$ aligns the velocity field of the model with the analytically defined flow between the endpoints. No SDE simulation or continuous-time score matching is required.
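To make the objective concrete, the following is a minimal PyTorch sketch of the FM regression loss with the straight-line interpolant; the toy MLP velocity field, its dimensions, and the batch sizes are illustrative assumptions, not the reference implementation.

```python
import torch

def fm_loss(v_theta, x0, x1):
    # t ~ Unif[0, 1], one draw per sample in the batch
    t = torch.rand(x0.shape[0], 1)
    xt = (1.0 - t) * x0 + t * x1      # straight-line interpolation path
    target = x1 - x0                  # closed-form path velocity d/dt x_t
    return ((v_theta(xt, t) - target) ** 2).sum(dim=1).mean()

# Illustrative usage with a toy MLP velocity field.
d = 2
net = torch.nn.Sequential(torch.nn.Linear(d + 1, 64), torch.nn.SiLU(),
                          torch.nn.Linear(64, d))
v_theta = lambda x, t: net(torch.cat([x, t], dim=1))  # condition on t by concatenation
x0 = 0.5 * torch.randn(128, d) + 1.0  # stand-in "data" batch
x1 = torch.randn(128, d)              # Gaussian prior batch
fm_loss(v_theta, x0, x1).backward()   # gradients flow into the MLP parameters
```

Because the interpolant and its velocity are available in closed form, the loss is a plain regression and requires no simulation of trajectories.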
2. Local Flow Matching (LFM) Framework
LFM refines the standard FM approach by partitioning the time horizon into $N$ subintervals $0 = t_0 < t_1 < \cdots < t_N$. Each subinterval $[t_{n-1}, t_n]$ involves matching distributions that differ only by a small Ornstein–Uhlenbeck (OU) evolution, making the local tasks easier.
Formally, for block $n = 1, \ldots, N$:
- $p_{n-1}$ is the push-forward of the original data distribution through the first $n-1$ sub-flows.
- $p_n$ is the marginal at time $t_n$ of the OU process started from $p_{n-1}$ at time $t_{n-1}$.
A sub-model $v_{\theta_n}$ is trained via the local FM loss,

$$\mathcal{L}_n(\theta_n) = \mathbb{E}_{t \sim \mathrm{Unif}[0,1],\; (x_0, x_1)} \big\| v_{\theta_n}(x_t, t) - \partial_t x_t \big\|^2,$$

where $x_0 \sim p_{n-1}$, the pair is coupled through the OU transition kernel $x_1 \mid x_0 \sim \mathcal{N}\!\big(e^{-\delta_n} x_0,\, (1 - e^{-2\delta_n}) I_d\big)$ with $\delta_n = t_n - t_{n-1}$, and $x_t = I_t(x_0, x_1)$ denotes the interpolation path for block $n$ (with block time rescaled to $[0, 1]$).
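The OU transition kernel above can be sampled in closed form, which is what keeps each local pairing simulation-free. Below is a minimal sketch (not the authors' code) that draws the coupled pair and evaluates the block loss with a straight-line interpolant; all names are illustrative.

```python
import torch

def ou_pair(x0, delta):
    # OU kernel for dX = -X dt + sqrt(2) dW (stationary law N(0, I)):
    # x1 | x0 ~ N(exp(-delta) * x0, (1 - exp(-2*delta)) * I)
    a = torch.exp(torch.tensor(-delta))
    return a * x0 + torch.sqrt(1.0 - a ** 2) * torch.randn_like(x0)

def local_fm_loss(v_theta_n, x0, delta):
    # Local FM regression: match the straight-line velocity between
    # x0 ~ p_{n-1} and its OU-coupled partner x1 (block time rescaled to [0, 1]).
    x1 = ou_pair(x0, delta)
    t = torch.rand(x0.shape[0], 1)
    xt = (1.0 - t) * x0 + t * x1
    return ((v_theta_n(xt, t) - (x1 - x0)) ** 2).sum(dim=1).mean()
```

For small $\delta_n$, $x_1$ stays close to $x_0$, so each block only has to learn a mild local deformation.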
3. Training and Sampling Algorithms
LFM proceeds by incrementally composing the learned sub-models. Each flow block defines an (approximately) invertible mapping

$$T_n(x) = x(t_n), \qquad \text{where } \dot{x}(t) = v_{\theta_n}(x(t), t), \quad x(t_{n-1}) = x.$$

The full composition $T_N \circ \cdots \circ T_1$ transports $p_{\text{data}} = p_0$ to (approximately) $\mathcal{N}(0, I_d)$. Sampling from the generative model inverts these blocks sequentially.
Algorithm 1: Training LFM
- For $n = 1$ to $N$:
  - Sample $x_0 \sim p_{n-1}$ (the samples pushed forward so far).
  - Sample $x_1 \mid x_0$ from the OU transition kernel over $[t_{n-1}, t_n]$.
  - Minimize $\mathcal{L}_n(\theta_n)$ via SGD.
  - Push forward all training samples via the learned $T_n$ to form (an approximation of) $p_n$.
Algorithm 2: Sampling with LFM
- Draw $x_N \sim \mathcal{N}(0, I_d)$.
- For $n = N$ down to $1$, set $x_{n-1} = T_n^{-1}(x_n)$.
- Return $x_0$.
At every stage, the process only requires i.i.d. samples from the current distribution and the OU kernel, ensuring strictly simulation-free and regression-based training.
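A toy end-to-end sketch of Algorithms 1 and 2 on 2D data, under illustrative assumptions: small MLP blocks, a uniform step size, Euler integration of each block ODE (with a reversed Euler pass as an approximate $T_n^{-1}$), and hyperparameters chosen only for demonstration.

```python
import torch

d, N, delta, euler_steps = 2, 4, 0.4, 8            # illustrative hyperparameters

def v(net, x, t):                                  # velocity field, t-conditioned
    return net(torch.cat([x, t.expand(x.shape[0], 1)], dim=1))

def local_fm_loss(net, x0):                        # local FM loss for one block
    a = torch.exp(torch.tensor(-delta))
    x1 = a * x0 + torch.sqrt(1.0 - a ** 2) * torch.randn_like(x0)  # OU kernel
    t = torch.rand(x0.shape[0], 1)
    xt = (1.0 - t) * x0 + t * x1
    return ((v(net, xt, t) - (x1 - x0)) ** 2).sum(dim=1).mean()

def block_map(net, x, sign=1.0):                   # Euler-integrate the block ODE
    h = 1.0 / euler_steps
    ts = torch.linspace(0.0, 1.0, euler_steps + 1)[:-1]
    if sign < 0:                                   # reversed grid approximates T_n^{-1}
        ts = torch.flip(ts + h, dims=[0])
    for t in ts:
        x = x + sign * h * v(net, x, t.view(1, 1))
    return x

# Algorithm 1: train blocks sequentially, pushing samples forward after each.
data = 0.3 * torch.randn(2048, d) + torch.tensor([2.0, 0.0])   # toy data
blocks, cur = [], data
for n in range(N):
    net = torch.nn.Sequential(torch.nn.Linear(d + 1, 64), torch.nn.SiLU(),
                              torch.nn.Linear(64, d))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(500):
        opt.zero_grad()
        idx = torch.randint(0, cur.shape[0], (256,))
        local_fm_loss(net, cur[idx]).backward()
        opt.step()
    with torch.no_grad():
        cur = block_map(net, cur)                  # samples of p_n for the next block
    blocks.append(net)

# Algorithm 2: sample by inverting blocks in reverse order.
with torch.no_grad():
    x = torch.randn(1000, d)
    for net in reversed(blocks):
        x = block_map(net, x, sign=-1.0)
```

Note that training touches each block once, in order, and only ever regresses on freshly drawn OU-coupled pairs, mirroring the simulation-free property stated above.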
4. Theoretical Guarantees
Denote by $q_n$ the density after composing the first $n$ learned blocks, with $q_0 = p_{\text{data}}$. If each block achieves population error $\varepsilon_n$ over its interval, i.e.,

$$\mathbb{E}_{t \in [t_{n-1}, t_n],\, x} \big\| \hat{v}_n(x, t) - v_n(x, t) \big\|^2 \le \varepsilon_n^2,$$

and mild regularity holds (Gaussian tails, bounded scores), then by induction and the OU contraction one obtains a per-block recursion of the form

$$\sqrt{\chi^2\big(q_n \,\|\, \mathcal{N}(0, I_d)\big)} \;\le\; e^{-\delta_n} \sqrt{\chi^2\big(q_{n-1} \,\|\, \mathcal{N}(0, I_d)\big)} + C\,\varepsilon_n.$$

Summing over the $N$ blocks and neglecting vanishing exponential terms,

$$\sqrt{\chi^2\big(q_N \,\|\, \mathcal{N}(0, I_d)\big)} \;\lesssim\; \sum_{n=1}^{N} \varepsilon_n.$$
Invertibility and the data-processing inequality for f-divergences imply the same bound holds in the reverse direction for the generated output. Furthermore,

$$\mathrm{KL}(p \,\|\, q) \;\le\; \log\big(1 + \chi^2(p \,\|\, q)\big) \;\le\; \chi^2(p \,\|\, q), \qquad \mathrm{TV}(p, q) \;\le\; \sqrt{\tfrac{1}{2}\,\mathrm{KL}(p \,\|\, q)},$$
so that KL and TV distances are likewise controlled (Xu et al., 2024).
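These relations between χ², KL, and TV are standard; as a quick numerical sanity check (not from the paper), one can evaluate all three divergences on a grid for a pair of 1D Gaussians:

```python
import numpy as np

# Densities of two 1D Gaussians on a fine grid.
x = np.linspace(-12, 12, 200001)
dx = x[1] - x[0]
def gauss(m, s):
    return np.exp(-(x - m) ** 2 / (2 * s ** 2)) / (s * np.sqrt(2 * np.pi))

p, q = gauss(0.3, 1.0), gauss(0.0, 1.0)

chi2 = np.sum(p ** 2 / q) * dx - 1.0            # chi^2(p || q)
kl = np.sum(p * np.log(p / q)) * dx             # KL(p || q)
tv = 0.5 * np.sum(np.abs(p - q)) * dx           # total variation distance

assert kl <= np.log1p(chi2) <= chi2             # KL <= log(1 + chi^2) <= chi^2
assert tv <= np.sqrt(kl / 2.0)                  # Pinsker's inequality
print(f"chi2={chi2:.4f}  KL={kl:.4f}  TV={tv:.4f}")
```

So any bound on χ² automatically controls KL, and through Pinsker's inequality, TV as well.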
5. Empirical Evaluation
Reported results cover tabular data, 2D toy distributions, image synthesis, and robotic manipulation:
- Tabular Data: On UCI benchmarks of varying dimension, LFM achieves test negative log-likelihood (NLL) among the top two methods throughout; on MINIBOONE (d = 43), LFM's NLL ≈ 9.95 is essentially tied with the strongest baselines.
- Toy Distributions: On 2D "tree" and "rose" benchmarks, LFM attains marginally better NLL (2.24 vs. 2.35) and accurately captures fine structure visually.
- Unconditional Image Generation: On CIFAR-10 and ImageNet-32, with the same UNet configuration, LFM achieves FID ≈ 8.45 (vs. 10.27) on CIFAR-10 and 7.00 (vs. 8.49) on ImageNet-32, training with a fraction of the steps of InterFlow. On Flowers 128×128, post-distillation LFM attains FID ≈ 71.0 (vs. InterFlow's ≈ 80.0).
- Robotics: On the Robomimic benchmark (five tasks), LFM matches or slightly outperforms global FM in final success rates and reaches higher early-epoch success (e.g., on "Transport", 0.75 vs. 0.60 at 200 epochs).
6. Implementation and Practical Aspects
The LFM framework supports architectural and training optimizations:
- Parameter Efficiency: Per-block models can be much smaller thanks to the local character of the subproblems; typical UNets total roughly 200M parameters in aggregate, distributed across the $N$ blocks.
- Training Efficiency: Training time scales linearly with the number of blocks, but convergence per block is faster due to reduced subproblem complexity. For instance, CIFAR-10 training uses ≈50,000 batches in total versus ≈500,000 for InterFlow.
- Hyperparameter Choices: Step sizes $\delta_n$ may be uniform or geometrically increasing; Adam is used for optimization, with batch sizes of 512–1024.
- Block Distillation: The $N$-block sequence can be distilled into a smaller number of blocks ($N' < N$) through least-squares regression on the block maps, enabling further efficiency gains (cf. Liu et al., 2023); see the sketch below.
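To illustrate the distillation step, here is a hedged sketch in which a single student map is fit by least squares to reproduce the composition of several trained teacher blocks on samples; the one-shot student architecture, the Euler teacher integration, and all names are illustrative assumptions rather than the paper's exact procedure.

```python
import torch

d = 2

def block_map(net, x, steps=8):
    # Map realized by a trained block: Euler-integrate its ODE over [0, 1].
    h = 1.0 / steps
    for k in range(steps):
        t = torch.full((x.shape[0], 1), k * h)
        x = x + h * net(torch.cat([x, t], dim=1))
    return x

def distill(teacher_blocks, samples, iters=2000):
    # Fit student(x) ~ (T_m o ... o T_1)(x) by least-squares regression on maps.
    student = torch.nn.Sequential(torch.nn.Linear(d, 128), torch.nn.SiLU(),
                                  torch.nn.Linear(128, d))
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    with torch.no_grad():                          # precompute fixed teacher targets
        targets = samples
        for net in teacher_blocks:
            targets = block_map(net, targets)
    for _ in range(iters):
        idx = torch.randint(0, samples.shape[0], (256,))
        opt.zero_grad()
        ((student(samples[idx]) - targets[idx]) ** 2).sum(dim=1).mean().backward()
        opt.step()
    return student
```

Because the targets are just pushed-forward samples, distillation reuses the same L² regression machinery as training itself.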
7. Strengths, Limitations, and Extensions
LFM offers strengths including simulation-free end-to-end training using only L² regression losses, a modular structure enabling efficient parameter use and fast convergence, and proven χ² (hence KL and TV) divergence guarantees.
However, LFM assumes the capability to sample exactly from OU kernels and, in theory, does not account for numerical ODE-solver error; approximate kernels may be needed in high dimensions. Potential extensions include weight sharing for temporal continuity, adaptive step sizing, mixing in score-based blocks for richer local dynamics, and refinement of the χ²-divergence bounds for tighter guarantees.
The approach decomposes the difficult global flow-matching problem into local problems, each solvable via plain regression, and then stitches the solutions together invertibly, providing competitive or superior generative performance with clean convergence bounds (Xu et al., 2024).