MeanFlow for ODE-based Flow Matching
- The paper introduces MeanFlow's time-averaged velocity formulation as a principled alternative to instantaneous flow matching for exact one-step sample generation.
- It employs a rigorous mathematical identity linking mean and instantaneous velocities to guide neural training and minimize error accumulation.
- Empirical evaluations demonstrate significant improvements in FID and convergence rates compared to traditional ODE-based flow matching techniques.
MeanFlow is a framework for ODE-based generative modeling that replaces the traditional instantaneous velocity fields of flow matching with a principled formulation of time-averaged or mean velocity fields. This average-velocity viewpoint enables exact, one-step or few-step sample generation, fundamentally altering the cost and numerical stability properties of continuous normalizing flows. The MeanFlow approach is underpinned by a mathematically rigorous identity that connects the mean and instantaneous velocities, guiding network training and supporting both theoretical guarantees and empirical improvements over previous methods.
1. Principles and Mathematical Foundations
In conventional flow matching, one seeks a velocity field $v(z_t, t)$ such that the ODE $dz_t/dt = v(z_t, t)$ transports samples from a simple prior (e.g., Gaussian) to the data distribution. Training minimizes the squared error between a neural predictor and the ground-truth trajectory derivatives sampled from straight-line interpolations between data and noise (e.g., $z_t = (1 - t)\,x + t\,\epsilon$, whose derivative is $\epsilon - x$).
MeanFlow redefines the target from the instantaneous velocity to the average velocity over a time interval. Define the mean velocity field as
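$$u(z_t, r, t) = \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\, d\tau,$$
where $z_t$ is the state at time $t$, and $r$ and $t$ are the interval endpoints. One-step (1-NFE) generation is realized by sampling $z_1 = \epsilon$ from the prior and applying
$$z_0 = z_1 - u(z_1, 0, 1).$$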
The crucial mathematical insight is the MeanFlow Identity:
$$u(z_t, r, t) = v(z_t, t) - (t - r)\,\frac{d}{dt} u(z_t, r, t), \qquad \frac{d}{dt} u(z_t, r, t) = v(z_t, t)\,\partial_z u + \partial_t u,$$
where the derivatives are computed along the ODE trajectory and can be realized efficiently with Jacobian-vector products (JVPs). This identity provides an explicit regression target for training a neural network $u_\theta$ to match the average velocity without simulating full paths or requiring pretraining, distillation, or curriculum learning (Geng et al., 19 May 2025).
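The identity follows from the definition of the average velocity in a few lines. The derivation below is a sketch; it writes $u(z_t, r, t)$ for the average velocity over $[r, t]$ and $v(z_t, t)$ for the instantaneous velocity, a notation assumed here for illustration.

```latex
\begin{aligned}
(t - r)\,u(z_t, r, t) &= \int_r^t v(z_\tau, \tau)\,d\tau
  && \text{(definition of the average velocity)} \\
u(z_t, r, t) + (t - r)\,\frac{d}{dt}u(z_t, r, t) &= v(z_t, t)
  && \text{(differentiate in } t \text{ with } r \text{ held fixed)} \\
\frac{d}{dt}u(z_t, r, t) &= v(z_t, t)\,\partial_z u + \partial_t u
  && \text{(chain rule, using } dz_t/dt = v\text{)}
\end{aligned}
```

Rearranging the second line gives $u = v - (t - r)\,\frac{d}{dt}u$. The third line shows that the total derivative is a single JVP of $u$ with respect to $(z, r, t)$ along the tangent $(v, 0, 1)$.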
2. MeanFlow versus Flow Matching and Consistency
The original ODE-based flow matching applies supervision to the field at each instant, resulting in error accumulation when integrating the ODE—especially along curved trajectories. By contrast, MeanFlow targets the time-integral of the velocity and thus collapses the multi-step error to a single, globally consistent update.
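The contrast can be seen on a toy problem. The following is a minimal sketch, assuming a scalar linear ODE dz/dt = A·z whose average velocity has a closed form; the constant `A` and all function names are hypothetical and chosen only for illustration. It compares explicit Euler integration of the instantaneous velocity against a single jump with the average velocity.

```python
import math

A = -1.5  # hypothetical rate constant of the toy ODE dz/dt = A * z


def v(z, t):
    """Instantaneous velocity of the toy ODE."""
    return A * z


def u(z, r, t):
    """Closed-form average velocity over [r, t], given the state z at time t."""
    return z * (1.0 - math.exp(A * (r - t))) / (t - r)


def euler_from_one_to_zero(z1, n_steps):
    """Integrate from t = 1 down to t = 0 with n_steps explicit Euler steps."""
    z, t, h = z1, 1.0, 1.0 / n_steps
    for _ in range(n_steps):
        z -= h * v(z, t)
        t -= h
    return z


z1 = 0.7
exact = z1 * math.exp(-A)                 # true solution at t = 0
one_step = z1 - (1.0 - 0.0) * u(z1, 0.0, 1.0)
euler_errors = {n: abs(euler_from_one_to_zero(z1, n) - exact) for n in (1, 4, 16)}
print("mean-velocity error:", abs(one_step - exact))
print("Euler errors by step count:", euler_errors)
```

In this toy setting the single average-velocity step matches the exact solution up to rounding, while the Euler error shrinks only as the number of steps grows. A learned network approximates the average velocity, so real models will not be exact in this sense.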
The training objective, derived from the MeanFlow Identity, is
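$$\mathcal{L}(\theta) = \mathbb{E}\,\big\| u_\theta(z_t, r, t) - \mathrm{sg}(u_{\mathrm{tgt}}) \big\|_2^2, \qquad u_{\mathrm{tgt}} = v_t - (t - r)\big(v_t\,\partial_z u_\theta + \partial_t u_\theta\big),$$
where $u_{\mathrm{tgt}}$ is computed from the identity as described above, $\mathrm{sg}(\cdot)$ denotes stop-gradient, and $\partial_z u_\theta$ and $\partial_t u_\theta$ are obtained via the chain rule and a JVP. This framework yields provable equivalence between zero loss and the exact ODE integral, ensuring valid transport.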
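To make the target construction concrete, here is a minimal single-sample sketch in plain Python. It is not the authors' implementation: the tiny `Dual` class stands in for a framework JVP (only subtraction and division are implemented, which is all this toy needs), `u_model` is a hypothetical one-parameter model, and the data distribution is assumed to be a single point so that the exact average velocity is known.

```python
from dataclasses import dataclass


def _lift(x):
    """Wrap a plain number as a Dual with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


@dataclass(frozen=True)
class Dual:
    """Forward-mode autodiff number: a value and its directional derivative."""
    val: float
    tan: float = 0.0

    def __sub__(self, other):
        o = _lift(other)
        return Dual(self.val - o.val, self.tan - o.tan)

    def __truediv__(self, other):
        o = _lift(other)
        return Dual(self.val / o.val,
                    (self.tan * o.val - self.val * o.tan) / (o.val * o.val))


def u_model(z, r, t, theta):
    """Hypothetical one-parameter model of the average velocity (ignores r)."""
    return (z - theta) / t


def meanflow_loss(x, eps, r, t, theta):
    z_t = (1.0 - t) * x + t * eps   # straight-line interpolation
    v_t = eps - x                   # instantaneous velocity along that line
    # JVP of u_model along the tangent (dz, dr, dt) = (v_t, 0, 1)
    out = u_model(Dual(z_t, v_t), Dual(r, 0.0), Dual(t, 1.0), theta)
    u_pred, du_dt = out.val, out.tan
    # The target is a plain float, which plays the role of the stop-gradient
    u_tgt = v_t - (t - r) * du_dt
    return (u_pred - u_tgt) ** 2


loss_exact = meanflow_loss(x=2.0, eps=0.5, r=0.2, t=0.8, theta=2.0)
loss_off = meanflow_loss(x=2.0, eps=0.5, r=0.2, t=0.8, theta=1.5)
print(loss_exact, loss_off)
```

With `theta` equal to the data point, the model is the exact average velocity and the loss vanishes up to rounding. With `theta` off by 0.5, the residual works out analytically to `(x - theta) * r / t**2`. In a real training loop, an autodiff framework would supply both the JVP and the gradient of the loss with respect to the parameters.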
Empirical evaluations show that prior one-step flow/diffusion models achieve FIDs of 34.2 (iCT), 10.6 (Shortcut), and 7.8 (IMM, 1 step with 2-NFE guidance) on ImageNet 256×256. MeanFlow-XL/2 attains 3.43 in 1 NFE and 2.20 in 2 NFEs, surpassing the previous state of the art and narrowing the gap with slow multi-step samplers (DiT, SiT) (Geng et al., 19 May 2025).
3. Generalizations, Loss Decompositions, and Optimization
MeanFlow's objective naturally decomposes into a sum of a trajectory flow-matching term and a trajectory consistency term:
- Trajectory-flow-matching: enforces mean velocity agreement.
- Trajectory-consistency: regularizes for invariance to time shifts.
Gradient measurements reveal strong, persistent negative correlation between these two components (cosine similarity −0.4 to −0.6), leading to optimization conflict and slow convergence (Zhang et al., 23 Oct 2025). This phenomenon motivated generalized families such as α-Flow, which interpolate between pure flow matching and MeanFlow objectives via a scheduling parameter $\alpha$.
α-Flow adopts a curriculum annealing schedule, starting from standard flow matching and smoothly shifting to MeanFlow supervision as $\alpha$ goes from 1 to 0. This resolves optimization conflict and yields consistent improvements in convergence and sample quality, outperforming pure MeanFlow across tested backbones and datasets (e.g., FID 2.58 on ImageNet 256×256 in 1 NFE for α-Flow-XL/2+) (Zhang et al., 23 Oct 2025).
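As a rough illustration of such a curriculum, the sketch below anneals the scheduling parameter (called alpha here) from 1 to 0 over a training window. The linear shape, the function name, and the step boundaries are all assumptions made for illustration; the cited work may use a different schedule, and how alpha enters the loss is not shown.

```python
def alpha_schedule(step, start_step, end_step):
    """Linearly anneal alpha from 1.0 to 0.0 between start_step and end_step."""
    if step <= start_step:
        return 1.0   # pure flow-matching phase
    if step >= end_step:
        return 0.0   # pure MeanFlow phase
    return 1.0 - (step - start_step) / (end_step - start_step)


# Hypothetical window: hold alpha at 1 for 100 steps, reach 0 at step 200.
alphas = [alpha_schedule(s, 100, 200) for s in (0, 100, 150, 200, 300)]
print(alphas)
```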
4. Algorithmic Realizations and Practical Techniques
The practical realization of MeanFlow centers on the neural approximation of mean velocity fields, with training relying on automatic differentiation and stop-gradient constructs to match the MeanFlow identity. Key algorithmic features:
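- 1-NFE sampling: $z_0 = z_1 - u_\theta(z_1, 0, 1)$ directly transforms noise to data.
- Few-step sampling: recursive application of $z_r = z_t - (t - r)\,u_\theta(z_t, r, t)$ over a user-defined partition, $1 = t_0 > t_1 > \dots > t_K = 0$.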
- Classifier-free guidance: seamlessly incorporated by defining a convex combination of guided and unguided instantaneous fields, propagating through the MeanFlow identity with no additional function evaluation cost.
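The one-step and few-step samplers listed above can share a single loop. The following is a minimal sketch, assuming time runs from 1 (noise) to 0 (data) and that an average-velocity function `u(z, r, t)` is available. Here a closed-form `u_exact` for a single-point data distribution stands in for a trained network, and all names are hypothetical.

```python
def sample(u, z_start, times):
    """Apply z_r = z_t - (t - r) * u(z_t, r, t) along a decreasing time grid."""
    z = z_start
    for t, r in zip(times[:-1], times[1:]):
        z = z - (t - r) * u(z, r, t)
    return z


X0 = 2.0  # toy data distribution: a single point


def u_exact(z, r, t):
    """Exact average velocity for point-mass data under straight-line paths."""
    return (z - X0) / t


noise = -0.3
one_step = sample(u_exact, noise, [1.0, 0.0])                     # 1 NFE
four_step = sample(u_exact, noise, [1.0, 0.75, 0.5, 0.25, 0.0])   # 4 NFEs
print(one_step, four_step)
```

Because the average velocity is exact in this toy case, both calls land on the data point. With a learned model, extra steps trade additional function evaluations for smaller approximation error.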
Extensions and optimizations for MeanFlow include:
- Rectified MeanFlow: First applies one-step rectification to reduce trajectory curvature, followed by MeanFlow training on the straightened path, accelerating convergence and improving sample quality. Truncation heuristics (removing the most curved cases) can further stabilize training (Zhang et al., 28 Nov 2025).
- Modular MeanFlow: Introduces a spectrum of loss functions via gradient modulation ($\mathrm{SG}_\lambda$), interpolating between first- and second-order models, and supports curriculum-style schedules for stable-to-expressive training (You et al., 24 Aug 2025).
- Efficient compositions: Methods such as COSE leverage analytic velocity composition identities to eliminate JVPs, reducing training cost and memory by up to 40% and sampling computations by up to 5× (Yang et al., 19 Sep 2025).
5. Theoretical Unification and Variations
MeanFlow is theoretically unified with the emerging field of transition flow matching (TFM), which directly parameterizes the expectation of state transitions between two time points and enforces a similar differential identity. In a special case, TFM reduces to MeanFlow, and both frameworks support arbitrary time grids and one-step generative mapping (Ma, 16 Mar 2026).
Further generalizations expand the MeanFlow approach:
- Joint discrete-continuous MeanFlow: Allows SE(3)-equivariant molecular graph modeling by synchronizing discrete (topological) and continuous (geometric) flows under a unified time-bridge. The discrete head learns average rate matrices for CTMCs, supporting few-step inference while enforcing physical consistency (Xu et al., 9 Apr 2026).
- Riemannian MeanFlow: Extends mean velocity formulation to manifold-valued data, using intrinsic parallel transport and log-map representations to define and regress average velocities. Specialized multitask optimization (PCGrad) is used to resolve the strong optimization conflicts induced by term-wise loss decomposition (Zhong et al., 11 Mar 2026).
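For readers unfamiliar with PCGrad, its core step is a projection: when two task gradients have a negative inner product, each is stripped of its component along the other before they are summed. The sketch below shows that step for two gradients stored as plain lists. The vectors `g_fm` and `g_cons` are made-up stand-ins for the two loss terms' gradients, and nothing here reflects how the cited work wires PCGrad into its training loop.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity, the quantity used to diagnose gradient conflict."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def project_conflict(g, other):
    """Remove from g its component along `other` when the two conflict."""
    d = dot(g, other)
    if d >= 0.0:
        return list(g)            # no conflict: leave g unchanged
    scale = d / dot(other, other)
    return [x - scale * y for x, y in zip(g, other)]


def pcgrad_two_tasks(g1, g2):
    """Sum two task gradients after projecting each against the other."""
    p1 = project_conflict(g1, g2)
    p2 = project_conflict(g2, g1)
    return [a + b for a, b in zip(p1, p2)]


g_fm = [1.0, 0.0]      # hypothetical gradient of the flow-matching term
g_cons = [-1.0, 1.0]   # hypothetical gradient of the consistency term
combined = pcgrad_two_tasks(g_fm, g_cons)
print(cosine(g_fm, g_cons), combined)
```

After projection, the combined update has a non-negative inner product with both original gradients, so to first order it does not increase either loss term.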
6. Empirical Performance and Limitations
Systematic benchmarking shows that MeanFlow delivers strong sample quality at very low sampling cost, especially in high-dimensional, image-based settings. For example, on ImageNet 256×256, reported one-step FIDs for flow/diffusion models prior to MeanFlow ranged from 34.2 down to 7.8; MeanFlow-XL/2 achieved 3.43, and α-Flow reduced this further to 2.58 (Geng et al., 19 May 2025, Zhang et al., 23 Oct 2025).
Rectified MeanFlow and Modular MeanFlow are reported to converge consistently faster than both pure rectified flow and pure MeanFlow (by 2–10× in GPU-hours), while offering stability and adaptability across data regimes (e.g., low-data, out-of-distribution) (Zhang et al., 28 Nov 2025, You et al., 24 Aug 2025).
However, MeanFlow and its extensions can encounter slow or unstable optimization on highly curved trajectories due to gradient conflicts, necessitating hybrid or curriculum-based training strategies (Zhang et al., 23 Oct 2025, You et al., 24 Aug 2025).
7. Connections and Future Directions
MeanFlow provides a unifying principle that bridges ODE-based flow matching, consistency models, shortcut models, and newly proposed transition flow formulations. Its average-velocity approach enables rigorous 1-NFE sampling, reduces numerical error accumulation, and scales naturally to discrete, Riemannian, and joint-mode generative tasks.
Ongoing research directions include the refinement of conflict-resolving objectives (e.g., multitask and curriculum strategies), systematic applications to structured data domains (e.g., molecular graphs), and the development of efficient approximations (e.g., via analytic velocity composition or JVP-eliminating techniques).
The explicit link between local and global velocity fields in MeanFlow opens new avenues for stability, scalability, and interpretability in simulation-free generative modeling (Geng et al., 19 May 2025, Zhang et al., 23 Oct 2025, Zhang et al., 28 Nov 2025, Xu et al., 9 Apr 2026, You et al., 24 Aug 2025, Yang et al., 19 Sep 2025, Ma, 16 Mar 2026, Zhong et al., 11 Mar 2026).