
Rectified Flow Models in Generative Modeling

Updated 23 January 2026
  • Rectified flow models are generative models that learn a time-dependent velocity field to transform a simple base distribution into a target data distribution via near-straight ODE trajectories.
  • They employ a regression approach to approximate data displacements, enabling high-quality synthesis with drastically fewer sampling steps compared to stochastic diffusion methods.
  • Advancements like recursive rectification and architecture-specific adaptations improve model fidelity and broaden applications across images, audio, text, and fluid simulations.

Rectified flow models are generative models that transport a simple source distribution (e.g., standard Gaussian) to a target data distribution (such as images, speech, or discrete text) by learning an ordinary differential equation (ODE) whose trajectories are as “straight” as possible in sample space. Unlike score-based diffusion models, which involve stochastic differential equations with hundreds of sampling steps, rectified flow learns a time-dependent velocity field regressed to straight-line displacements, enabling high-quality generation with drastically fewer steps. The rectified flow paradigm has seen rapid expansion and innovation, spanning continuous, discrete, hierarchical, and multi-modal domains, and has initiated new directions in optimal transport, model distillation, editing, and plug-and-play applications.

1. Mathematical Foundation and Core Objective

The core objective of rectified flow is to learn a velocity field $v_\theta(x, t)$ that transforms a base distribution $\pi_0$ (often $\mathcal{N}(0, I)$) into a target distribution $\pi_1$ (e.g., images, audio). For paired samples $(x_0, x_1) \sim \pi_0 \times \pi_1$, the straight-line interpolation is
$$x_t = (1-t)\,x_0 + t\,x_1, \quad t \in [0, 1].$$
The model regresses $v_\theta$ to approximate the displacement direction $(x_1 - x_0)$ at each $x_t$:
$$\min_\theta \int_0^1 \mathbb{E}_{x_0, x_1} \left\| (x_1 - x_0) - v_\theta(x_t, t) \right\|^2 dt.$$
The generative ODE is
$$\frac{d Z_t}{dt} = v_\theta(Z_t, t), \quad Z_0 \sim \pi_0.$$
If $v_\theta$ matches $(x_1 - x_0)$ everywhere, trajectories between $Z_0$ and $Z_1$ traced by the ODE are straight, and the final distribution of $Z_1$ matches $\pi_1$ exactly (Liu et al., 2022, Guan et al., 2023).

Distinct from score-based diffusion, which models the (usually non-linear, stochastic) score field $\nabla_x \log p_t(x)$, rectified flow directly fits the conditional expectation of the data displacement, resulting in deterministic, straight or nearly-straight paths.
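Because the learned transport is a deterministic ODE, sampling reduces to numerical integration of $dZ_t/dt = v_\theta(Z_t, t)$. A minimal sketch (the `euler_sample` helper and toy velocity field below are illustrative, not from any cited implementation):

```python
import numpy as np

def euler_sample(v, z0, n_steps=8):
    """Integrate dZ/dt = v(Z, t) from t = 0 to t = 1 with fixed-step Euler."""
    z = np.asarray(z0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        z = z + dt * v(z, k * dt)
    return z

# Toy case: for the coupling x1 = x0 + 3, the optimal velocity field is the
# constant displacement, so trajectories are exactly straight and a single
# Euler step already lands on the target.
v_shift = lambda z, t: np.full_like(z, 3.0)
print(euler_sample(v_shift, np.zeros(4), n_steps=1))  # -> [3. 3. 3. 3.]
```

This is why straightness matters: the straighter the learned trajectories, the fewer Euler steps are needed for a given accuracy.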

2. Recursive Rectification and Trajectory Straightening

Early rectified flow models observed that a single regression (1-RF) typically yields trajectories that are close to straight but not exact. By recursively re-coupling synthetic pairs generated from the current flow and retraining (the "reflow" step), each successive rectification (2-RF, 3-RF, etc.) yields increasingly straight ODE paths:

  • After each reflow step, the conditional expectation velocity is updated using pairs from the previous flow's samples.
  • Theoretical results prove a monotonic decrease in convex transport costs (e.g., Wasserstein distance) and decreasing "straightness" error with each recursion (Liu et al., 2022, Bansal et al., 2024).
  • Empirical studies confirm that after one or two rectified steps, even a single-step Euler integration often suffices for high-fidelity generation (e.g., image FID, TTS MOS) (Guan et al., 2023, Liu et al., 2022).

The recursive process:

  • $(Z_0^{k}, Z_1^{k}) \leftarrow \mathrm{Rectify}\big((Z_0^{k-1}, Z_1^{k-1})\big)$
  • $v_\theta^{(k+1)}$ regressed as above on the new pairs

This procedure underlies one-step distillation and efficient few-step models (Zhu et al., 2024).
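The recursion can be sketched in one dimension, with a time-independent affine velocity standing in for the neural network (all function names here are illustrative, and the affine ansatz is a deliberate simplification of the cited methods):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pairs(sample_z1, n=1000):
    """Couple fresh noise z0 with endpoints z1. For 1-RF, z1 comes from the
    data; for k-RF (k > 1), z1 = previous_flow(z0), which re-couples the
    endpoints deterministically (the 'reflow' step)."""
    z0 = rng.standard_normal(n)
    return z0, sample_z1(z0)

def fit_velocity(z0, z1):
    """Stand-in for training: regress an affine field v(x) = a*x + b onto the
    displacement z1 - z0 at interpolation points x_t = (1-t) z0 + t z1."""
    t = rng.uniform(size=z0.shape)
    xt = (1 - t) * z0 + t * z1
    A = np.stack([xt, np.ones_like(xt)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, z1 - z0, rcond=None)
    return lambda x, t_: a * x + b

def flow_sampler(v, n_steps=50):
    def sample(z0):
        z, dt = np.asarray(z0, float), 1.0 / n_steps
        for k in range(n_steps):
            z = z + dt * v(z, k * dt)
        return z
    return sample

# 1-RF: independent coupling with toy data z1 ~ N(2, 0.5^2).
z0, z1 = make_pairs(lambda z: 2.0 + 0.5 * rng.standard_normal(z.shape))
v1 = fit_velocity(z0, z1)
# 2-RF ("reflow"): re-couple using the first flow's own outputs, then refit.
z0b, z1b = make_pairs(flow_sampler(v1))
v2 = fit_velocity(z0b, z1b)
```

The key point is in `make_pairs`: after the first round, $(Z_0, Z_1)$ pairs come from simulating the current flow rather than from an independent coupling, which is what progressively straightens the paths.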

3. Algorithmic and Architectural Details

The parameterization of $v_\theta(x, t)$ varies by domain; transformer-based backbones, for example, underpin recent image, layout, and multimodal models (Dalva et al., 2024, Braunstein et al., 2024, Ma et al., 2024).

The typical training pseudocode:

  1. Sample $(x_0, x_1) \sim \pi_0 \times \pi_1$, $t \sim U[0, 1]$.
  2. Compute $x_t = (1-t)x_0 + t x_1$.
  3. Output $\hat{v} = v_\theta(x_t, t)$; minimize $\| x_1 - x_0 - \hat{v} \|^2$.
  4. Update $\theta$.

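The four steps above can be sketched end to end. The affine model below is a deliberately tiny stand-in for the network (its analytic gradient replaces backpropagation), and $\pi_1$ is a toy 1-D Gaussian:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "network": v_theta(x, t) = theta[0]*x + theta[1]*t + theta[2].
def v_theta(x, t, th):
    return th[0] * x + th[1] * t + th[2]

def train_step(th, lr=0.1, batch=256):
    # 1. Sample (x0, x1) ~ pi0 x pi1 and t ~ U[0, 1].
    x0 = rng.standard_normal(batch)          # pi0 = N(0, 1)
    x1 = 2.0 + rng.standard_normal(batch)    # toy pi1 = N(2, 1)
    t = rng.uniform(size=batch)
    # 2. Straight-line interpolation x_t = (1 - t) x0 + t x1.
    xt = (1 - t) * x0 + t * x1
    # 3. Residual of the displacement regression.
    resid = (x1 - x0) - v_theta(xt, t, th)
    # 4. Analytic gradient step on the squared loss.
    grad = -2.0 * np.array([(resid * xt).mean(), (resid * t).mean(), resid.mean()])
    return th - lr * grad, float((resid ** 2).mean())

theta = np.zeros(3)
for _ in range(500):
    theta, loss = train_step(theta)
```

In practice steps 3–4 are a standard minibatch SGD/Adam update on the network parameters; everything else is identical.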
For conditional generation (e.g., text-to-image, TTS), $v_\theta$ receives additional conditioning inputs (CLIP, FastSpeech, language embeddings) (Guan et al., 2023, Ma et al., 12 Mar 2025).

4. Theoretical Guarantees, Optimal Transport, and Extensions

Rectified flow uniquely preserves marginals at every interpolation time, and, unlike extrinsic penalty-based OT solvers, maintains valid coupling throughout all recursive steps:

  • Marginal preservation: The ODE-sampled points have the correct time-marginals by construction, irrespective of neural network error (Liu, 2022).
  • Cost monotonicity: Any convex transport cost strictly decreases with each rectification; the iterative procedure converges to the optimal transport plan for a given convex cost $c$ under mild regularity.
  • Strong straightness: Successive rectification yields couplings where increments $Z_1 - Z_0$ are predictable functions of the chord $t Z_1 + (1-t) Z_0$, a stronger criterion than average curvature (Bansal et al., 2024).
  • Extensions: Hierarchical formulations model higher derivatives (velocity, acceleration), permitting intersecting ODE paths, further reducing trajectory curvature and number of function evaluations (Zhang et al., 24 Feb 2025).
  • Balanced/Conic flows: Recent approaches integrate real samples and geometric constraints (slerp-cones) to prevent error accumulation, sampling drift, or model collapse when relying heavily on synthetic pairs during recursive reflow (Seong et al., 29 Oct 2025, Zhu et al., 2024).
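The straightness criterion can also be checked numerically. A sketch of a discrete estimator of $S(Z) = \int_0^1 \mathbb{E}\,\| (Z_1 - Z_0) - \dot{Z}_t \|^2\, dt$ over a batch of trajectories (the helper name is illustrative):

```python
import numpy as np

def straightness(traj, dt):
    """Discrete estimate of S = int E||(Z1 - Z0) - dZ_t/dt||^2 dt for
    trajectories of shape (n_steps + 1, batch, dim). S = 0 iff every path
    is a straight line traversed at constant speed."""
    chord = traj[-1] - traj[0]             # Z1 - Z0 for each path
    vel = np.diff(traj, axis=0) / dt       # finite-difference velocity
    return float((((vel - chord) ** 2).sum(-1) * dt).sum(0).mean())

t = np.linspace(0.0, 1.0, 11)[:, None, None]
z0, z1 = np.zeros((1, 2)), np.ones((1, 2))
straight = (1 - t) * z0 + t * z1              # exact linear interpolation
curved = straight + 0.5 * np.sin(np.pi * t)   # bowed detour, same endpoints
print(straightness(straight, 0.1))  # ~0, up to floating-point error
print(straightness(curved, 0.1))    # strictly positive
```

A flow with $S \approx 0$ is exactly the regime where one-step Euler integration is lossless.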

5. Empirical Applications and Benchmarks

Rectified flow models demonstrate state-of-the-art or near-SOTA performance in a wide array of generative modeling tasks:

  • Image Generation: Euler or RK solvers with as few as 1–8 steps suffice for high-fidelity image synthesis (e.g., NAMI, Flux, InstaFlow) (Ma et al., 12 Mar 2025, Liu et al., 2022, Yang et al., 2024). On CIFAR-10, one-step flows achieve FID $< 5.1$ at $\sim$15M parameters (Zhu et al., 2024). Progressive and multiresolution architectures (as in NAMI) further accelerate inference (Ma et al., 12 Mar 2025).
  • Text-to-Speech: ReFlow-TTS matches or surpasses diffusion-based models with MOS $\approx 4.5$ in $< 1\%$ of the sampling time (Guan et al., 2023).
  • Fluid Simulation: Rectified flows for multiscale PDEs recover fine statistical/moment properties, require only 4–8 inference steps, and outperform or match SDE-based conditional flows (Armegioiu et al., 3 Jun 2025).
  • Discrete Data: Rectified discrete flows (ReDi) guarantee monotonic reduction in conditional total correlation of the modeled coupling, enabling one-step or few-step discrete data synthesis with large IS/FID improvements (Yoo et al., 21 Jul 2025).
  • Layout and Multimodal Generation: SLayR and JanusFlow prove that transformer-based rectified flows can be embedded into LLMs for layout or joint understanding/generation with competitive quality, parameter, and speed tradeoffs (Braunstein et al., 2024, Ma et al., 2024).
  • Editing and Control: Semantic attribute disentanglement in FluxSpace demonstrates editable latent representations within rectified flow transformers (Dalva et al., 2024). Plug-and-play prior, zero-shot editing, and inversion are enabled by deterministic, time-symmetric ODE properties (Yang et al., 2024, Patel et al., 2024).

6. Guidance, Constraints, and Sampling Enhancements

Classifier-free guidance (CFG), a mainstay in diffusion models, is not natively stable in rectified flow due to drift off the learned transport manifold. Recent advances propose geometry-aware, predictor-corrector schemes (Rectified-CFG++) that guarantee marginal proximity and bounded manifold deviation, enabling strong prompt adherence and artifact-free images across Flux, SD3/SD3.5, and Lumina (Saini et al., 9 Oct 2025). Boundary-enforced variants further impose analytic constraints on $v_\theta(x, t)$ at $t = 0, 1$, eliminating endpoint bias and stabilizing ODE/SDE sampling (Hu et al., 18 Jun 2025). Steering via the vector field can be performed in a deterministic, gradient-skipping fashion, facilitating tasks like classifier guidance, inpainting, and inverse-problem solving in a unified, memory- and compute-efficient manner (Patel et al., 2024).
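For reference, the baseline CFG rule that these schemes build on is a linear combination of conditional and unconditional velocity predictions (this sketch shows only the standard combination, not the Rectified-CFG++ corrector itself):

```python
import numpy as np

def cfg_velocity(v_cond, v_uncond, w):
    """Baseline classifier-free guidance applied to a velocity field:
    v = v_uncond + w * (v_cond - v_uncond). For w > 1 this extrapolates
    beyond the learned field, which is the source of the off-manifold
    drift that predictor-corrector schemes aim to bound."""
    return v_uncond + w * (v_cond - v_uncond)

v_cond, v_uncond = np.array([1.0, 0.0]), np.array([0.5, 0.0])
print(cfg_velocity(v_cond, v_uncond, 1.0))  # w = 1 recovers v_cond
print(cfg_velocity(v_cond, v_uncond, 2.0))  # w > 1 extrapolates past it
```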

7. Open Challenges and Future Directions

Active research addresses several limitations and new directions:

  • Model collapse: Iterative reflow steps relying solely on synthetic pairs induce model collapse, analogous to DAE degeneration; interleaving real pairs or careful SDE reverse-simulation avoids rank-loss and sample diversity degradation (Zhu et al., 2024, Seong et al., 29 Oct 2025).
  • Data efficiency and real-sample anchoring: Balanced Conic Rectified Flow sharply reduces the need for millions of synthetic pairs and improves robustness along real-image reversals (Seong et al., 29 Oct 2025).
  • Scalability and compression: SlimFlow extends rectified flow to highly compressed, small-footprint models with one-step sampling, via annealing and flow-guided distillation (Zhu et al., 2024).
  • Hierarchical and multimodal flows: Hierarchical rectified flows further minimize the number of NFEs by modeling full velocity (and higher derivative) distributions, permitting path intersection (Zhang et al., 24 Feb 2025).
  • Connections to optimal transport: The c-rectified flow generalizes the methodology to directly solve Monge–Kantorovich OT for user-specified costs while preserving marginal constraints at every step (Liu, 2022).
  • Domain transfer, adaptation, and fine-grained editing: Applications in domain adaptation, fine-grained style transfer, and inversion-free semantic editing are ongoing (Dalva et al., 2024, Yang et al., 2024, Patel et al., 2024).

Future advances are likely to focus on improved solver-adaptive schemes, further analysis of straightness and marginality in high dimensions, enhanced semantic disentanglement, and expansion into multimodal, hierarchical, and plug-and-play generative tasks.


References: (Liu et al., 2022, Guan et al., 2023, Liu, 2022, Bansal et al., 2024, Zhu et al., 2024, Ma et al., 12 Mar 2025, Patel et al., 2024, Dalva et al., 2024, Braunstein et al., 2024, Armegioiu et al., 3 Jun 2025, Seong et al., 29 Oct 2025, Yoo et al., 21 Jul 2025, Zhang et al., 24 Feb 2025, Hu et al., 18 Jun 2025, Zhu et al., 2024, Li et al., 25 Nov 2025, Saini et al., 9 Oct 2025, Yang et al., 2024, Ma et al., 2024)
