Curly Flow Matching (Curly-FM)
- Curly-FM is a geometric approach to flow matching that treats path geometry as a design variable, either by learning curved interpolants or by modeling non-gradient dynamics.
- In the learned-interpolant formulation, curved interpolants induce straighter vector fields, which reduces the number of function evaluations (NFE) needed in tasks such as image synthesis.
- Reported results show improved low-NFE generative performance in one formulation and faithful recovery of complex dynamics, such as cell cycles and fluid flows, in the other.
Curly Flow Matching (Curly-FM) is a geometric variant of flow matching in which curvature is treated as a modeling variable rather than as a fixed byproduct of the conditional path design. In the literature surveyed here, the label is used in at least two distinct 2025 senses. One formulation learns curved interpolants so that the induced velocity field becomes straighter, with the explicit goal of reducing numerical integration complexity and improving low-NFE generation in image synthesis (Shankar et al., 26 Mar 2025). Another formulation uses a Schrödinger bridge with a non-zero drift reference process to learn non-gradient, periodic, or rotational dynamics from population snapshots and approximate velocity information (Petrović et al., 30 Oct 2025). Related work on one-step distillation, federated coupling design, and Lie-group interpolation clarifies the broader geometric setting in which Curly-FM is situated (Huang et al., 2024, Wang et al., 25 Sep 2025, Sherry et al., 1 Apr 2025).
1. Terminological scope and geometric theme
The term is not fully standardized. In one usage, Curly-FM “curls” the interpolant so that the learned flow becomes straighter; in another, it retains curl-like or rotational structure because straight or gradient-like transport is an inappropriate inductive bias for the underlying dynamics (Shankar et al., 26 Mar 2025, Petrović et al., 30 Oct 2025). A common denominator is that both formulations intervene at the level of probability-path geometry rather than treating the forward interpolation as fixed.
| Usage | Core mechanism | Stated purpose |
|---|---|---|
| Learned-interpolant Curly-FM | Learn a parametric interpolant and regularize the induced field for straightness | Straighter vector fields and fewer function evaluations (Shankar et al., 26 Mar 2025) |
| Non-gradient Curly-FM | Use a Schrödinger bridge with non-zero reference drift and a neural path interpolant | Learn cyclic, rotational, or non-gradient dynamics (Petrović et al., 30 Oct 2025) |
| Adjacent geometric formulation | Replace Euclidean line segments with exponential curves on a Lie group | Intrinsic flow matching on Lie groups (Sherry et al., 1 Apr 2025) |
This terminological split matters because the same word “curly” points to opposite geometric intentions in the two main formulations. One bends the conditional path to straighten the ODE field; the other preserves non-straight dynamics because the target process itself is not gradient-like. A plausible implication is that Curly-FM is best understood as a geometric design principle within flow matching rather than as a single universally fixed algorithm.
2. Learned curved interpolants for straight vector fields
In "Learning Straight Flows by Learning Curved Interpolants" (Shankar et al., 26 Mar 2025), standard flow matching is criticized for combining linear interpolants with the common independent coupling between source samples $x_0$ and target samples $x_1$, which often induces a learned vector field that is substantially curved even though each interpolant is straight in data space. The paper argues that such curvature slows inference because numerical solvers require smaller step sizes to maintain accuracy.
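The construction starts from conditional flow matching. Standard FM minimizes

$$\mathcal{L}_{\mathrm{FM}}(\theta) = \mathbb{E}_{t,\, x_t \sim p_t} \big\| v_\theta(x_t, t) - u_t(x_t) \big\|^2,$$

and CFM replaces the intractable target field with a conditional one,

$$\mathcal{L}_{\mathrm{CFM}}(\theta) = \mathbb{E}_{t,\, (x_0, x_1) \sim q} \big\| v_\theta(x_t, t) - u_t(x_t \mid x_0, x_1) \big\|^2 .$$

Curly-FM takes the coupling to be independent, $q(x_0, x_1) = p_0(x_0)\, p_1(x_1)$, and replaces the usual linear interpolant by a learnable parametric interpolant

$$x_t = \phi_\eta(x_0, x_1, t),$$

yielding an objective of the form

$$\mathbb{E}_{t,\, x_0,\, x_1} \big\| v_\theta(\phi_\eta(x_0, x_1, t), t) - \partial_t \phi_\eta(x_0, x_1, t) \big\|^2 .$$

Standard CFM is recovered when $\phi_\eta(x_0, x_1, t) = (1 - t)\, x_0 + t\, x_1$.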
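To make the role of the interpolant concrete, the following is a minimal sketch of the conditional regression target under two interpolants. The function names and the bump-shaped curved path `t(1-t)*bend` are illustrative assumptions, not the paper's invertible parameterization; the time derivative is taken by central differences rather than automatic differentiation.

```python
def linear_interpolant(x0, x1, t):
    """Straight-line path used by standard CFM."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def make_bent_interpolant(bend):
    """Hypothetical curved path: the line plus t(1-t)*bend, which vanishes at t=0 and t=1."""
    def interpolant(x0, x1, t):
        return [(1.0 - t) * a + t * b + t * (1.0 - t) * c
                for a, b, c in zip(x0, x1, bend)]
    return interpolant


def conditional_target(interpolant, x0, x1, t, eps=1e-5):
    """Regression target: time derivative of the interpolant (central difference)."""
    hi = interpolant(x0, x1, t + eps)
    lo = interpolant(x0, x1, t - eps)
    return [(h - l) / (2.0 * eps) for h, l in zip(hi, lo)]


def cfm_loss(v, interpolant, pairs, times):
    """Monte Carlo estimate of E || v(x_t, t) - d/dt x_t ||^2 over endpoint pairs."""
    total = 0.0
    for (x0, x1), t in zip(pairs, times):
        xt = interpolant(x0, x1, t)
        target = conditional_target(interpolant, x0, x1, t)
        total += sum((p - q) ** 2 for p, q in zip(v(xt, t), target))
    return total / len(pairs)


x0, x1 = [0.0, 0.0], [1.0, 2.0]
# Linear path: the target is x1 - x0 at every time.
print(conditional_target(linear_interpolant, x0, x1, 0.25))
# Bent path: the target picks up the extra term (1 - 2t) * bend.
print(conditional_target(make_bent_interpolant([1.0, 0.0]), x0, x1, 0.25))
```

With the linear interpolant the target is the constant displacement `x1 - x0`, which is the standard CFM case; changing the interpolant changes the target the network regresses onto, and that is the lever the learned-interpolant formulation uses.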
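The method is formulated as a bi-level optimization problem: the inner problem fits the velocity field to the target induced by the interpolant, and the outer problem selects the interpolant. The central idea is to choose the interpolant parameters $\eta$ so that the optimal field $u_t^*$ is as constant-along-trajectories as possible. A flow is straight when the velocity does not change along a trajectory, $\frac{d}{dt} u_t^*(x_t) = 0$. Differentiating this condition along a trajectory motivates a regularizer of the form

$$\mathcal{R}(\eta) = \mathbb{E}_{t,\, x_t} \big\| \partial_t u_t^*(x_t) + \nabla_x u_t^*(x_t)\, u_t^*(x_t) \big\|^2 .$$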
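The straightness idea can be checked numerically on toy fields. The sketch below estimates the derivative of a velocity field along its own trajectories by finite differences and averages its squared norm; the function names and the two test fields are assumptions made for illustration, and the paper's regularizer may differ in weighting and in how derivatives are computed.

```python
def material_derivative(v, x, t, eps=1e-4):
    """Finite-difference estimate of d/dt v(x_t, t) = dv/dt + (dv/dx) v along the flow."""
    vx = v(x, t)
    # Step a little forward and backward along the flow, then difference the field.
    fwd = v([xi + eps * vi for xi, vi in zip(x, vx)], t + eps)
    bwd = v([xi - eps * vi for xi, vi in zip(x, vx)], t - eps)
    return [(f - b) / (2.0 * eps) for f, b in zip(fwd, bwd)]


def straightness_penalty(v, points, times):
    """Average squared norm of the material derivative; zero for a perfectly straight flow."""
    total = 0.0
    for x, t in zip(points, times):
        total += sum(d * d for d in material_derivative(v, x, t))
    return total / len(points)


def constant_field(x, t):
    return [1.0, 2.0]


def rotation_field(x, t):
    return [-x[1], x[0]]


points, times = [[1.0, 0.0], [0.0, 1.0]], [0.2, 0.7]
print(straightness_penalty(constant_field, points, times))  # essentially zero
print(straightness_penalty(rotation_field, points, times))  # clearly positive
```

A field that is constant along its trajectories gives a vanishing penalty, while a rotating field does not, which is the distinction the regularizer is meant to capture.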
A key technical result is an analytic expression for the optimal target field induced by the interpolant $x_t = \phi_\eta(x_0, x_1, t)$, which is the conditional expectation of the interpolant velocity given the current point: $u_t^*(x) = \mathbb{E}\big[\partial_t \phi_\eta(x_0, x_1, t) \mid x_t = x\big]$. Using change of variables, the paper gives an explicit formula involving the interpolant inverse and the determinant of its Jacobian. The significance of that expression is structural: altering the interpolant changes the posterior over the endpoints given $x_t$, which in turn changes the averaged velocity $u_t^*(x)$. The target of learning is therefore not merely a smoother path in data space, but a different conditional geometry that induces a straighter vector field.
Exact differentiation through the optimal field $u_t^*$ is costly, so the paper proposes an approximate practical objective in which a stop-gradient operator $\mathrm{sg}(\cdot)$ blocks gradients through part of the computation. For scalability, the interpolant is parameterized with an invertible architecture inspired by normalizing flows, specifically a GLOW/1x1 convolution model, because inversion and Jacobian determinants are required by the analytic target-field formula. Training is summarized in Algorithm 1: sample times $t$, sample independent pairs $(x_0, x_1)$, generate interpolants $x_t$, estimate $u_t^*$ by Monte Carlo with $K$ target samples, compute the loss, and update the parameters with SGD/Adam. The paper emphasizes that this remains end-to-end and simulation-free during training (Shankar et al., 26 Mar 2025).
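The Monte Carlo step can be illustrated in the simplest setting. The sketch below assumes a one-dimensional problem, the linear interpolant $x_t = (1-t)x_0 + t x_1$, a standard normal source, and independent coupling; it is a stand-in for the paper's estimator, which works with a learned invertible interpolant. Target samples are drawn, the interpolant is inverted for $x_0$, and each sample is weighted by the source density times the inverse Jacobian factor.

```python
import math
import random


def gaussian_pdf(x, mean=0.0, std=1.0):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def optimal_field_mc(x, t, sample_target, num_samples, rng):
    """Self-normalised estimate of E[x1 - x0 | x_t = x] for x_t = (1-t) x0 + t x1, x0 ~ N(0, 1)."""
    num, den = 0.0, 0.0
    for _ in range(num_samples):
        x1 = sample_target(rng)
        x0 = (x - t * x1) / (1.0 - t)          # invert the interpolant in x0
        weight = gaussian_pdf(x0) / (1.0 - t)  # source density times |d x0 / d x|
        num += weight * (x1 - x0)              # interpolant velocity for this pair
        den += weight
    return num / den


rng = random.Random(0)
# Target N(2, 1): at t = 0.5 the exact conditional mean velocity is 2 for every x.
estimate = optimal_field_mc(1.0, 0.5, lambda r: r.gauss(2.0, 1.0), 20000, rng)
print(estimate)
```

For a linear interpolant the Jacobian factor is constant and cancels in the normalisation; with a learned interpolant it depends on the sample, which is why the source stresses cheap inverses and determinants.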
3. Non-gradient field dynamics via non-zero-drift Schrödinger bridges
In "Curly Flow Matching for Learning Non-gradient Field Dynamics" (Petrović et al., 30 Oct 2025), Curly-FM is a simulation-free trajectory inference and generative modeling method designed for systems whose dynamics have curl, rotational components, or periodic cycles. The paper explicitly contrasts this setting with methods grounded in least action or minimum kinetic energy, arguing that those induce gradient-field dynamics and are therefore ill-suited to processes such as cell cycles, vortices, and ocean currents.
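The learned process is written as an SDE of the form

$$dX_t = u_t^\theta(X_t)\, dt + \sigma\, dW_t,$$

while the reference process is

$$dY_t = f_t(Y_t)\, dt + \sigma\, dW_t .$$

The bridge objective is

$$\min_{\mathbb{P}}\ \mathrm{KL}(\mathbb{P}\, \|\, \mathbb{Q}) \quad \text{subject to} \quad \mathbb{P}_0 = \mu_0,\ \ \mathbb{P}_1 = \mu_1,$$

where $\mathbb{Q}$ is the path measure of the non-zero-drift reference process and $\mu_0$, $\mu_1$ are the observed endpoint marginals. This is the decisive departure from standard diffusion Schrödinger bridges, which typically use a zero-drift Brownian reference. The intended effect is that the learned bridge matches endpoint marginals while remaining close to a reference dynamic that can already encode cyclic or rotational motion.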
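A reference process with non-zero drift is easy to simulate, which helps show what the bridge is being kept close to. The sketch below uses Euler-Maruyama with a hypothetical rigid-rotation drift; the drift, step count, and function names are assumptions for illustration only.

```python
import math
import random


def rotational_drift(x, t):
    """Hypothetical reference drift: counter-clockwise rotation about the origin."""
    return [-x[1], x[0]]


def euler_maruyama(drift, x0, sigma, num_steps, rng, t0=0.0, t1=1.0):
    """Simulate dX_t = drift(X_t, t) dt + sigma dW_t and return the visited states."""
    dt = (t1 - t0) / num_steps
    x, t = list(x0), t0
    path = [list(x)]
    for _ in range(num_steps):
        f = drift(x, t)
        x = [xi + fi * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
             for xi, fi in zip(x, f)]
        t += dt
        path.append(list(x))
    return path


rng = random.Random(0)
noiseless = euler_maruyama(rotational_drift, [1.0, 0.0], 0.0, 1000, rng)
noisy = euler_maruyama(rotational_drift, [1.0, 0.0], 0.1, 1000, rng)
print(noiseless[-1])  # close to (cos 1, sin 1)
print(noisy[-1])      # a perturbed version of the same rotation
```

Setting the drift to zero recovers the Brownian reference of a standard Schrödinger bridge, so the only change in this sketch is the `drift` argument.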
The reference drift $f_t$ is constructed from approximate instantaneous velocities, such as RNA velocity, finite-difference velocity estimates in computational fluid dynamics, or observed ocean-current velocities. The paper states that $f_t$ is built from observed velocity data through a kernel $k$, giving as one example a kernel-weighted combination of the velocities observed at nearby points.
This kernelization step supplies a smoothed reference drift against which the learned trajectories are regularized.
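One way to picture a kernelized drift is as a weighted average of observed velocities. The sketch below uses a Gaussian kernel with normalised weights; the kernel choice, the normalisation, and the bandwidth are assumptions, since the source does not specify them here.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_drift(x, positions, velocities, bandwidth=0.5):
    """Kernel-weighted average of observed velocities near x (normalisation is an assumption)."""
    weights = [gaussian_kernel(x, p, bandwidth) for p in positions]
    total = sum(weights)  # can underflow to zero if x is far from every observation
    dim = len(x)
    return [sum(w * v[d] for w, v in zip(weights, velocities)) / total
            for d in range(dim)]


# Four observations on the unit circle with counter-clockwise tangent velocities.
positions = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
velocities = [[0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, 0.0]]
print(kernel_drift([1.0, 0.0], positions, velocities, bandwidth=0.2))
print(kernel_drift([0.0, 0.0], positions, velocities, bandwidth=0.5))
```

Near an observation the drift follows that observation's velocity, and at the centre of the circle the opposing velocities cancel, so the smoothed field stays rotational.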
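Training proceeds in two simulation-free stages. First, the method learns a neural path interpolant of the form

$$x_{t,\eta} = (1 - t)\, x_0 + t\, x_1 + t(1 - t)\, \varphi_{t,\eta}(x_0, x_1),$$

with derivative

$$\dot{x}_{t,\eta} = x_1 - x_0 + (1 - 2t)\, \varphi_{t,\eta}(x_0, x_1) + t(1 - t)\, \dot{\varphi}_{t,\eta}(x_0, x_1),$$

and fits it to the reference drift $f_t$ by minimizing

$$\mathcal{L}(\eta) = \mathbb{E}_{t,\, x_0,\, x_1} \big\| \dot{x}_{t,\eta} - f_t(x_{t,\eta}) \big\|^2 .$$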
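The first stage can be sketched without a neural network by plugging in any smooth correction term. The code below assumes an interpolant built from a straight line plus a correction scaled by `t(1-t)`, so that the endpoints are fixed; the function `perpendicular_bump` is a hypothetical stand-in for the learned network, and its time derivative is taken by central differences.

```python
def path_point(t, x0, x1, phi):
    """x_t = (1-t) x0 + t x1 + t(1-t) phi(t, x0, x1); equals x0 at t=0 and x1 at t=1."""
    p = phi(t, x0, x1)
    return [(1 - t) * a + t * b + t * (1 - t) * c for a, b, c in zip(x0, x1, p)]


def path_velocity(t, x0, x1, phi, eps=1e-5):
    """d/dt x_t = x1 - x0 + (1-2t) phi + t(1-t) d(phi)/dt."""
    p = phi(t, x0, x1)
    hi, lo = phi(t + eps, x0, x1), phi(t - eps, x0, x1)
    dp = [(h - l) / (2 * eps) for h, l in zip(hi, lo)]
    return [b - a + (1 - 2 * t) * c + t * (1 - t) * d
            for a, b, c, d in zip(x0, x1, p, dp)]


def drift_matching_loss(phi, drift, pairs, times):
    """Monte Carlo estimate of E || d/dt x_t - f_t(x_t) ||^2."""
    total = 0.0
    for (x0, x1), t in zip(pairs, times):
        xt = path_point(t, x0, x1, phi)
        vt = path_velocity(t, x0, x1, phi)
        total += sum((v - f) ** 2 for v, f in zip(vt, drift(xt, t)))
    return total / len(pairs)


def perpendicular_bump(t, x0, x1):
    """Hypothetical correction: pushes the 2D path sideways, more strongly later in time."""
    dx, dy = x1[0] - x0[0], x1[1] - x0[1]
    return [-(1.0 + t) * dy, (1.0 + t) * dx]


def no_bump(t, x0, x1):
    return [0.0 for _ in x0]


x0, x1 = [1.0, 0.0], [-1.0, 0.0]
print(path_point(0.5, x0, x1, perpendicular_bump))
print(path_velocity(0.5, x0, x1, perpendicular_bump))
```

In the method described above, the correction would be a network trained to drive `drift_matching_loss` down, so that interpolated paths follow the reference drift while still starting and ending at the observed samples.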
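Second, it defines a transport cost that measures how far the learned path $x_{t,\eta}$ between two endpoints deviates from the reference drift $f_t$,

$$c(x_0, x_1) = \int_0^1 \big\| \dot{x}_{t,\eta} - f_t(x_{t,\eta}) \big\|^2\, dt,$$

approximates that cost by Monte Carlo, estimates a coupling $\pi$ between the endpoint samples, and then trains the drift field with a regression loss of the form

$$\mathcal{L}_{\mathrm{drift}}(\theta) = \mathbb{E}_{t,\, (x_0, x_1) \sim \pi} \big\| u_t^\theta(x_{t,\eta}) - \dot{x}_{t,\eta} \big\|^2 .$$

When diffusion is nonzero, the model also learns a score through a score-matching loss, and the full objective combines the drift-matching and score-matching terms.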
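The cost-then-coupling step can be sketched on a tiny example. The code below assumes straight paths in place of the learned interpolant, a hypothetical quarter-turn rotational drift, and a brute-force search over permutations as the coupling estimator; the source says only that a coupling is estimated from the Monte Carlo cost, so the assignment solver here is a stand-in.

```python
import itertools
import math
import random


def straight_path(t, x0, x1):
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]


def straight_velocity(t, x0, x1):
    return [b - a for a, b in zip(x0, x1)]


def path_cost(x0, x1, drift, num_times, rng,
              path=straight_path, velocity=straight_velocity):
    """Monte Carlo estimate of int_0^1 || d/dt x_t - f_t(x_t) ||^2 dt with uniform times."""
    total = 0.0
    for _ in range(num_times):
        t = rng.random()
        xt, vt = path(t, x0, x1), velocity(t, x0, x1)
        total += sum((v - f) ** 2 for v, f in zip(vt, drift(xt, t)))
    return total / num_times


def best_assignment(sources, targets, drift, num_times=64, seed=0):
    """Pick the one-to-one pairing with the lowest total cost (brute force, small n only)."""
    rng = random.Random(seed)
    cost = [[path_cost(s, g, drift, num_times, rng) for g in targets] for s in sources]
    n = len(sources)
    best = min(itertools.permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return best, cost


def quarter_turn_drift(x, t):
    """Hypothetical drift: counter-clockwise rotation covering a quarter turn per unit time."""
    return [-(math.pi / 2) * x[1], (math.pi / 2) * x[0]]


sources = [[1.0, 0.0], [-1.0, 0.0]]
targets = [[0.0, 1.0], [0.0, -1.0]]
assignment, cost = best_assignment(sources, targets, quarter_turn_drift)
print(assignment)  # source i is paired with target assignment[i]
```

Both pairings have the same Euclidean cost here, so an ordinary optimal-transport coupling could not choose between them; the drift-aware cost prefers the pairing that moves with the rotation.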
The paper positions this formulation against several nearby approaches. Standard CFM and OT-CFM learn vector fields from endpoint pairs but remain tied to straight or OT-biased interpolations. Diffusion Schrödinger bridge methods use a Brownian reference process. Generalized Schrödinger Bridge Matching is described as iterative and more expensive. Metric Flow Matching uses a similar two-stage structure but biases toward manifold geometry rather than toward reference dynamics. Curly-FM’s distinguishing claim is therefore not merely that it allows curved paths, but that it uses a non-zero drift reference process to learn non-gradient dynamics in a two-stage, non-iterative, simulation-free manner (Petrović et al., 30 Oct 2025).
4. Position within the broader flow-matching literature
Curly-FM is best understood relative to adjacent geometric and acceleration-oriented variants of flow matching. "Flow Generator Matching" (Huang et al., 2024) addresses the speed problem by a different route: it distills a pretrained flow model into a one-step generator whose induced distribution matches the teacher flow. Its central result is a tractable surrogate loss with gradient equivalence to an intractable flow-matching objective. The paper explicitly contrasts FGM with trajectory-straightening methods such as ReFlow and CFM. The distinction is important: FGM is not about changing the forward path geometry itself, but about one-step distillation of a teacher flow.
"Federated Flow Matching" (Wang et al., 25 Sep 2025) is not named Curly-FM, but it isolates the same geometric issue at the level of coupling. With independent coupling,
$$\pi(x_0, x_1) = p_0(x_0)\, p_1(x_1),$$

the paper states that trajectories are typically curved and therefore require many ODE steps at inference. FFM-LOT introduces local OT couplings to improve straightness within each client, whereas FFM-GOT uses the semi-dual OT formulation and a shared global potential to recover global OT geometry in a federated way. The paper’s geometric storyline is explicit: independent coupling yields curved trajectories, local OT makes them straighter locally but can lose global consistency under heterogeneity, and global OT gives the straightest trajectories and the best few-step generation.
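The effect of the coupling on trajectory geometry can be seen in one dimension, where the optimal coupling for quadratic cost is obtained by sorting. The sketch below compares independent and sorted pairings on two statistics: mean squared displacement and the number of crossing straight-line trajectories. It is a centralised toy example and does not model the federated setting; all names are illustrative.

```python
import random


def independent_pairs(xs0, xs1, rng):
    """Pair each source sample with a randomly chosen target sample."""
    shuffled = xs1[:]
    rng.shuffle(shuffled)
    return list(zip(xs0, shuffled))


def ot_pairs_1d(xs0, xs1):
    """In 1D with quadratic cost, sorting both samples gives an optimal coupling."""
    return list(zip(sorted(xs0), sorted(xs1)))


def mean_sq_displacement(pairs):
    return sum((b - a) ** 2 for a, b in pairs) / len(pairs)


def count_crossings(pairs):
    """Count pairs of straight-line trajectories that swap order between t=0 and t=1."""
    crossings = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (a_i, b_i), (a_j, b_j) = pairs[i], pairs[j]
            if (a_i - a_j) * (b_i - b_j) < 0:
                crossings += 1
    return crossings


rng = random.Random(0)
xs0 = [rng.gauss(0.0, 1.0) for _ in range(50)]
xs1 = [rng.gauss(3.0, 1.0) for _ in range(50)]
ind = independent_pairs(xs0, xs1, rng)
ot = ot_pairs_1d(xs0, xs1)
print(count_crossings(ind), mean_sq_displacement(ind))
print(count_crossings(ot), mean_sq_displacement(ot))
```

Crossing conditional paths are what force the averaged velocity field to bend, so a coupling with no crossings is the one-dimensional picture of a straighter learned flow.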
"Flow Matching on Lie Groups" (Sherry et al., 1 Apr 2025) contributes another nearby geometric perspective. It generalizes FM from Euclidean space to Lie groups by replacing line segments with exponential curves
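$$\gamma(t) = g_0 \exp\!\big(t \log(g_0^{-1} g_1)\big), \qquad t \in [0, 1],$$

and defining the conditional vector field, by analogy with the Euclidean target $(x_1 - x)/(1 - t)$, in the form

$$u_t(g \mid g_1) = (L_g)_* \frac{\log(g^{-1} g_1)}{1 - t},$$

where $(L_g)_*$ is the pushforward of left multiplication by $g$.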
The paper proves that the gradients of the intractable FM loss and the conditional loss on the group coincide, extending the standard conditional trick to Lie-group settings. This is adjacent to Curly-FM because it also makes path geometry a first-class design choice, but its goal is intrinsic interpolation on non-Euclidean state spaces rather than straightness regularization or non-gradient reference dynamics.
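The exponential-curve construction can be tried on the simplest Lie group, the planar rotations SO(2), represented here as unit complex numbers. This is a minimal sketch: the choice of group, the function names, and the form of the conditional velocity (the Lie-algebra coordinate `log(g^{-1} g1) / (1 - t)`) are assumptions made for illustration.

```python
import cmath


def group_log(g):
    """Log map on SO(2) as unit complex numbers: the rotation angle in [-pi, pi]."""
    return cmath.phase(g)


def group_exp(angle):
    """Exp map: the rotation by the given angle."""
    return cmath.exp(1j * angle)


def exponential_curve(g0, g1, t):
    """gamma(t) = g0 * exp(t * log(g0^{-1} g1)); division by g0 applies its inverse."""
    return g0 * group_exp(t * group_log(g1 / g0))


def conditional_velocity(g, g1, t):
    """Lie-algebra coordinate of the velocity pointing from g toward g1."""
    return group_log(g1 / g) / (1.0 - t)


g0, g1 = group_exp(0.3), group_exp(2.0)
midpoint = exponential_curve(g0, g1, 0.5)
print(abs(midpoint))                             # stays on the unit circle
print(conditional_velocity(midpoint, g1, 0.5))   # constant along the curve
```

Along the exponential curve the conditional velocity stays equal to `log(g0^{-1} g1)`, which is the group analogue of a straight line having constant velocity `x1 - x0`.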
Taken together, these papers mark three distinct axes in the flow-matching ecosystem: distillation of a teacher flow into one step, coupling design to affect trajectory straightness, and intrinsic path construction on structured state spaces. Curly-FM occupies the second axis most directly, but the non-gradient variant also extends that axis by arguing that curl and periodicity can be signal rather than defect (Huang et al., 2024, Wang et al., 25 Sep 2025, Sherry et al., 1 Apr 2025).
5. Empirical regimes and reported behavior
The learned-interpolant Curly-FM reports improved low-NFE image generation on several benchmarks (Shankar et al., 26 Mar 2025). On CIFAR-10, it achieves FID 4.61 with 2 NFE, improving over Consistency Model at 5.83, Consistency Flow Matching at 5.34, and rectified flow variants. On ImageNet, it reports FID 5.58 at 4 NFE and 3.84 at 12 NFE, outperforming MultiSample FM and Neural Flow Diffusion Models in the low-step regime. On CelebA-HQ, it reports FID 28.6 at 6 NFE, again better than ReFlow and Consistency Flow Matching. Figure 1 gives the key qualitative argument on a 2-Gaussian-to-2-Gaussian problem: red linear interpolants in standard FM induce a more curved flow, whereas learned interpolants in Curly-FM bend so that the resulting vector field is much straighter. The paper links these gains directly to integration complexity, stating that nearly constant-along-trajectory vector fields permit larger solver steps and, in the ideal straight case, one-step or very low-NFE sampling.
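The link between straightness and NFE can be checked with a fixed-step Euler solver on two toy fields. In the sketch below, `straight_field` varies in space and time but is constant along each trajectory, while `curved_field` is a quarter-turn rotation; both fields are illustrative assumptions rather than learned models.

```python
import math


def euler_integrate(v, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with num_steps Euler steps (one NFE each)."""
    dt = 1.0 / num_steps
    x, t = list(x0), 0.0
    for _ in range(num_steps):
        x = [xi + dt * vi for xi, vi in zip(x, v(x, t))]
        t += dt
    return x


def straight_field(x, t):
    """Trajectories are x(t) = x0 (1 + t), so the velocity along each one is the constant x0."""
    return [xi / (1.0 + t) for xi in x]


def curved_field(x, t):
    """Quarter-turn rotation: the velocity direction changes along every trajectory."""
    return [-(math.pi / 2) * x[1], (math.pi / 2) * x[0]]


def error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


start = [1.0, 0.0]
exact_straight, exact_curved = [2.0, 0.0], [0.0, 1.0]
for nfe in (1, 4, 1000):
    e_s = error(euler_integrate(straight_field, start, nfe), exact_straight)
    e_c = error(euler_integrate(curved_field, start, nfe), exact_curved)
    print(nfe, e_s, e_c)
```

The straight field is integrated accurately even with a single step, whereas the rotating field needs many steps for comparable accuracy, which mirrors the argument for few-step sampling.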
The non-gradient Curly-FM evaluates a different class of problems (Petrović et al., 30 Oct 2025). On synthetic asymmetric circles with a circular reference field, CFM and OT-CFM produce mostly straight trajectories, while Curly-FM learns circular trajectories consistent with the reference rotation. In the human fibroblast cell-cycle setting, Curly-FM is described as the only method able to learn the cell cycle in the visualizations, and it achieves the best cosine distance to the reference velocity field in the main table. On mouse erythroid development, it reconstructs a curved developmental trajectory more faithfully than OT-CFM and CFM. In the 2D Taylor-Green vortex example, Curly-FM learns longer, more intricate particle paths that resemble the fluid’s rotational dynamics; quantitatively it improves cosine distance and precision@k, and matches or improves MSE against baselines. For Gulf of Mexico ocean currents from HYCOM data, Curly-FM outperforms OT-CFM, vanilla Schrödinger bridge, and often SBIRR on Earth Mover’s Distance, cosine distance to the reference drift, and a cost metric. The paper also highlights an efficiency contrast: Curly-FM's reported runtime is on the order of minutes, whereas TrajectoryNet and SBIRR are reported in hours.
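Alignment with a reference velocity field can be scored with a cosine distance. The sketch below assumes the common definition of one minus cosine similarity, averaged over evaluation points; the exact definition and evaluation protocol used in the paper are not given here, so this is illustrative only.

```python
import math


def cosine_distance(u, v):
    """1 - cos(angle between u and v); undefined for zero vectors, so this sketch raises."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_cosine_distance(learned, reference, points, t):
    """Average cosine distance between two velocity fields over a set of points."""
    values = [cosine_distance(learned(x, t), reference(x, t)) for x in points]
    return sum(values) / len(values)


def rotation(x, t):
    return [-x[1], x[0]]


def radial(x, t):
    return [x[0], x[1]]


points = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]
print(mean_cosine_distance(rotation, rotation, points, 0.0))  # aligned fields
print(mean_cosine_distance(radial, rotation, points, 0.0))    # orthogonal fields
```

Because the metric ignores magnitude, a model can score well on it while still mismatching speeds, which is one reason the evaluation also reports transport-style metrics.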
These empirical programs are not directly comparable because they optimize for different notions of faithfulness. The learned-interpolant version is evaluated by low-step generative quality on images; the non-gradient version is evaluated by alignment with reference dynamics as well as by marginal transport quality. This suggests that “performance” under the Curly-FM label depends on whether the central desideratum is fast few-step sampling or faithful recovery of non-gradient trajectories.
6. Conceptual distinctions, misconceptions, and limitations
A frequent source of confusion is that curvature plays opposite roles in the two main formulations. In the learned-interpolant version, the path is allowed to curve so that the induced vector field becomes straighter (Shankar et al., 26 Mar 2025). In the non-gradient version, curl-like behavior is itself the object to be learned, because a straight or minimum-energy bias would erase periodic or rotational dynamics (Petrović et al., 30 Oct 2025). Accordingly, “curly” does not mean the same geometric property in both cases.
A second misconception is to equate Curly-FM with generic acceleration of flow matching. FGM is the cleaner answer to one-step distillation: it compresses a pretrained flow into a single forward pass by matching induced flows with a gradient-equivalent objective, rather than by redesigning the conditional path geometry (Huang et al., 2024). Likewise, OT-based federated flow matching should not be read as a Curly-FM method, even though it studies the same curved-versus-straight trajectory issue through the lens of couplings (Wang et al., 25 Sep 2025).
The limitations of the learned-interpolant formulation are primarily algorithmic. Exact differentiation through the analytic target field is costly, which is why the paper resorts to an approximate objective based on a neural approximation of that field. The analytic expression also requires inversion and Jacobian determinants of the interpolant, motivating an invertible GLOW/1x1 convolution parameterization. The paper notes that simpler interpolant forms can remove the Jacobian's dependence on one of its arguments, but may be too limited for complex data (Shankar et al., 26 Mar 2025).
The limitations of the non-gradient formulation are tied to the quality of the external velocity information. The paper explicitly states that Curly-FM depends on the quality of the inferred reference field, that long timescales make the problem harder, that the method does not fully address unbalanced transport or manifold structure, and that it is not designed to discover dynamics from scratch without useful velocity information. It also reports that increasing the stochasticity $\sigma$ reduces performance and frames the method mainly for low-noise settings (Petrović et al., 30 Oct 2025).
Within the current literature, Curly-FM therefore names a geometric intervention in flow matching rather than a single settled object. One branch uses curved interpolants to obtain straighter flows and lower NFE; another uses non-zero-drift Schrödinger bridges to recover non-gradient dynamics that straight-flow methods miss. The shared thesis is that the geometry of the conditional path—or of the reference process—should be learned or specified in accordance with the target phenomenon, rather than fixed a priori.