Riemannian Flow Matching
- Riemannian Flow Matching is a geometric framework that constructs generative models on manifolds by learning flows aligned with intrinsic curvature, geodesic distances, and topology.
- It employs closed-form geodesics and simulation-free training objectives, bypassing traditional score estimation and divergence computation for efficient probability transport.
- RFM underpins state-of-the-art applications in geometry, molecular design, and robotics, offering high fidelity, speed, and robust theoretical convergence guarantees.
Riemannian Flow Matching (RFM) is a geometric framework for constructing generative models that transport probability distributions on manifolds by learning flows consistent with the manifold's Riemannian structure. Unlike traditional flow-based models defined in Euclidean space, RFM captures intrinsic geometric properties—such as curvature, geodesic distance, and manifold topology—allowing training objectives that require neither simulation nor divergence computation, and efficient, high-fidelity generative modeling on a broad class of manifolds. The framework generalizes core concepts from continuous normalizing flows, optimal transport, and conditional flow matching, and underpins a growing class of state-of-the-art models in geometric deep learning, discrete data modeling, and generative tasks on structured domains.
1. Mathematical Foundation
Riemannian Flow Matching operates on a smooth, connected, complete Riemannian manifold $(\mathcal{M}, g)$ of dimension $d$, where $g$ is a smoothly varying metric tensor defining an inner product $\langle \cdot, \cdot \rangle_{g(x)}$ on each tangent space $T_x\mathcal{M}$. At the heart of RFM is the flow ODE
$$\frac{dx_t}{dt} = u_t(x_t), \qquad x_0 \sim p_0,$$
where $u_t$ is a time-dependent vector field and $x_0$ is sampled from a simple base distribution $p_0$ supported on $\mathcal{M}$. The temporal evolution pushes $p_0$ forward to a target distribution $p_1$ through the induced diffeomorphism $\psi_t$.
Mass conservation along the flow is governed by the Riemannian continuity equation
$$\partial_t p_t + \operatorname{div}_g(p_t u_t) = 0,$$
where $\operatorname{div}_g$ denotes the intrinsic divergence associated with the Riemannian volume form $d\mathrm{vol}_g$.
RFM introduces a premetric $\mathsf{d}(x, y)$, typically chosen as the geodesic distance $d_g$ induced by $g$, which admits the exponential and logarithm maps $\exp_x : T_x\mathcal{M} \to \mathcal{M}$ and $\log_x = \exp_x^{-1}$, such that $\mathsf{d}(x, y) = 0$ iff $x = y$. Geodesics $x_t = \exp_{x_0}\!\big(t \log_{x_0}(x_1)\big)$ encode shortest paths between points.
The target velocity field for conditional flow matching is derived by differentiating along geodesics:
$$u_t(x_t \mid x_1) = \frac{d}{dt} \exp_{x_0}\!\big(t \log_{x_0}(x_1)\big) = \frac{\log_{x_t}(x_1)}{1 - t}.$$
The training objective is then the squared-norm error in the Riemannian metric:
$$\mathcal{L}(\theta) = \mathbb{E}_{t,\, q(x_1),\, p_t(x \mid x_1)} \big\| v_\theta(t, x) - u_t(x \mid x_1) \big\|_{g(x)}^2.$$
This formulation bypasses the need for score estimation and divergence computation, and is simulation-free for manifolds with closed-form geodesics (Chen et al., 2023).
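As a concrete illustration, the geodesic interpolant and its closed-form target velocity can be computed on the unit sphere, where the exponential and logarithm maps have simple trigonometric expressions. The following is a minimal NumPy sketch (function names are illustrative, not from any RFM codebase); the finite-difference check confirms that $\log_{x_t}(x_1)/(1-t)$ equals the time derivative of the interpolant.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x with initial velocity v."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return x.copy()
    return np.cos(n) * x + np.sin(n) * v / n

def sphere_log(x, y):
    """Logarithm map: the tangent vector at x pointing along the geodesic toward y."""
    c = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(x)
    u = y - c * x                      # component of y orthogonal to x
    return theta * u / np.linalg.norm(u)

rng = np.random.default_rng(0)
x0 = rng.normal(size=3); x0 /= np.linalg.norm(x0)
x1 = rng.normal(size=3); x1 /= np.linalg.norm(x1)

t = 0.3
xt = sphere_exp(x0, t * sphere_log(x0, x1))          # geodesic interpolant
u_t = sphere_log(xt, x1) / (1.0 - t)                 # closed-form target velocity

# Finite-difference check: u_t should equal d/dt exp_{x0}(t log_{x0}(x1))
h = 1e-6
fd = (sphere_exp(x0, (t + h) * sphere_log(x0, x1)) - xt) / h
assert np.allclose(fd, u_t, atol=1e-4)
```

Because both quantities are available in closed form, the supervision signal costs only a few trigonometric evaluations per sample, with no ODE simulation.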
2. Extensions and Key Variants
2.1 Variational Riemannian Flow Matching
RG-VFM (Riemannian Gaussian Variational Flow Matching) generalizes variational flow matching to Riemannian manifolds by optimizing a KL-based variational objective for endpoint matching. The posterior over the endpoint is modeled as a Riemannian Gaussian
$$q_\theta(x_1 \mid x_t) \propto \exp\!\left( -\frac{d_g\big(x_1, \mu_\theta(t, x_t)\big)^2}{2\sigma^2} \right),$$
with loss given (up to constants) by the expected squared geodesic distance
$$\mathcal{L}(\theta) = \mathbb{E}_{t,\, x_1,\, x_t}\!\left[ d_g\big(x_1, \mu_\theta(t, x_t)\big)^2 \right].$$
On homogeneous spaces with closed-form geodesics, RG-VFM yields an unbiased geodesic-MSE loss and shares the computational advantages of RFM, but endpoint-matching is fundamentally distinct from the velocity-matching of RFM (Zaghen et al., 18 Feb 2025).
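The endpoint-matching versus velocity-matching distinction can be made concrete on the sphere. Below is a hedged sketch, with a hypothetical predicted endpoint `mu` standing in for a model output: the endpoint loss compares points by geodesic distance, while the velocity loss compares the tangent vectors those endpoints induce at the current state.

```python
import numpy as np

def sphere_log(x, y):
    """Logarithm map on the unit sphere: tangent at x pointing toward y."""
    c = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(x)
    u = y - c * x
    return theta * u / np.linalg.norm(u)

def geodesic_dist(x, y):
    """Great-circle distance between unit vectors."""
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

x1 = np.array([0.0, 0.0, 1.0])                      # true endpoint
mu = np.array([0.0, np.sin(0.1), np.cos(0.1)])      # hypothetical prediction, 0.1 rad away

# Endpoint matching (RG-VFM style): squared geodesic distance between endpoints
endpoint_loss = geodesic_dist(mu, x1) ** 2

# Velocity matching (RFM style): compare tangent vectors at the current point x_t
xt = np.array([1.0, 0.0, 0.0])
t = 0.5
u_target = sphere_log(xt, x1) / (1.0 - t)           # field implied by the true endpoint
v_pred = sphere_log(xt, mu) / (1.0 - t)             # field implied by the predicted endpoint
velocity_loss = np.sum((v_pred - u_target) ** 2)
```

On curved spaces the two losses weight errors differently, which is the source of the distinction noted above; in Euclidean space they coincide up to constants.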
2.2 Flow Matching on Lie Groups
For matrix Lie groups $G$, straight lines are replaced with exponential curves
$$x_t = x_0 \exp\!\big(t \log(x_0^{-1} x_1)\big),$$
with corresponding tangent vectors
$$\dot{x}_t = x_t \log(x_0^{-1} x_1).$$
This approach exploits only closed-form group operations and supports fast, simulation-free generative modeling on, e.g., $\mathrm{SO}(3)$ or $\mathrm{SE}(3)$ (Sherry et al., 1 Apr 2025).
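A minimal sketch of exponential-curve interpolation on $\mathrm{SO}(3)$, using Rodrigues' formula for the group exponential and logarithm (helper names are illustrative, not from the cited work):

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding so(3) skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(W):
    """Matrix exponential of a skew-symmetric W via Rodrigues' formula."""
    w = np.array([W[2, 1], W[0, 2], W[1, 0]])
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    return np.eye(3) + (np.sin(th) / th) * W + ((1 - np.cos(th)) / th**2) * (W @ W)

def so3_log(R):
    """Principal matrix logarithm of a rotation (rotation angle < pi assumed)."""
    th = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    if th < 1e-12:
        return np.zeros((3, 3))
    return th / (2 * np.sin(th)) * (R - R.T)

# Exponential-curve interpolant x_t = x0 exp(t log(x0^{-1} x1));
# its velocity is xdot_t = x_t log(x0^{-1} x1), constant in the body frame.
x0 = so3_exp(hat(np.array([0.1, -0.2, 0.3])))
x1 = so3_exp(hat(np.array([-0.4, 0.5, 0.1])))
A = so3_log(x0.T @ x1)                  # tangent direction, shared by all t

t = 0.25
xt = x0 @ so3_exp(t * A)
assert np.allclose(xt @ xt.T, np.eye(3), atol=1e-10)       # stays on SO(3)
assert np.allclose(x0 @ so3_exp(1.0 * A), x1, atol=1e-10)  # reaches x1 at t = 1
```

Only matrix products and closed-form exp/log are needed, which is what makes the construction simulation-free.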
2.3 Statistical Manifolds and Discrete Data
On the manifold of categorical distributions (the simplex $\Delta^{d-1}$) endowed with the Fisher–Rao geometry, geodesic interpolation maps to the positive orthant of the sphere via $p \mapsto \sqrt{p}$, with closed-form geodesics
$$y_t = \frac{\sin\big((1-t)\theta\big)}{\sin\theta}\, y_0 + \frac{\sin(t\theta)}{\sin\theta}\, y_1, \qquad \theta = \arccos\langle y_0, y_1 \rangle, \quad y_i = \sqrt{p_i}.$$
Riemannian Flow Matching on these manifolds yields state-of-the-art performance on discrete generative modeling by leveraging optimal transport and exact likelihoods (Cheng et al., 2024, Davis et al., 2024).
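The square-root map makes this concrete. A small sketch, assuming the standard isometry (up to scale) between the Fisher–Rao simplex and the positive orthant of the sphere; the geodesic is just spherical linear interpolation of the square-root coordinates:

```python
import numpy as np

def fisher_rao_geodesic(p0, p1, t):
    """Fisher-Rao geodesic between two categorical distributions, computed as
    a great-circle arc between their square-root images on the unit sphere."""
    y0, y1 = np.sqrt(p0), np.sqrt(p1)
    theta = np.arccos(np.clip(np.dot(y0, y1), -1.0, 1.0))
    if theta < 1e-12:
        return p0.copy()
    yt = (np.sin((1 - t) * theta) * y0 + np.sin(t * theta) * y1) / np.sin(theta)
    return yt ** 2                      # map back to the simplex

p0 = np.array([0.7, 0.2, 0.1])
p1 = np.array([0.1, 0.1, 0.8])
pt = fisher_rao_geodesic(p0, p1, 0.5)
assert abs(pt.sum() - 1.0) < 1e-9       # midpoint is again a valid distribution
```

Because the slerp of unit vectors is a unit vector, the interpolant squares back to a point on the simplex at every $t$, with no projection step needed.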
3. Algorithmic and Practical Considerations
A key benefit of RFM is the simulation-free, closed-form computation of the supervision signals for many simple geometries. The general algorithm for RFM training is:
- For each batch, sample $t \sim \mathcal{U}[0, 1]$, $x_0 \sim p_0$, $x_1 \sim q$.
- Construct the geodesic interpolation $x_t = \exp_{x_0}\!\big(t \log_{x_0}(x_1)\big)$.
- Compute the analytic target velocity $u_t(x_t \mid x_1) = \log_{x_t}(x_1) / (1 - t)$.
- Predict $v_\theta(t, x_t)$ with a neural network projecting into the appropriate tangent space.
- Minimize the squared Riemannian-norm loss $\big\| v_\theta(t, x_t) - u_t(x_t \mid x_1) \big\|_{g(x_t)}^2$.
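Putting the steps above together, here is a hedged NumPy sketch of one loss evaluation on $S^2$. The `model` argument is any callable with an assumed `(t, x) -> ambient 3-vector` interface; in practice it would be a trained network, and the batch loop would be vectorized.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere."""
    n = np.linalg.norm(v)
    return x.copy() if n < 1e-12 else np.cos(n) * x + np.sin(n) * v / n

def sphere_log(x, y):
    """Logarithm map on the unit sphere."""
    c = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(x)
    u = y - c * x
    return theta * u / np.linalg.norm(u)

def project_tangent(x, v):
    """Remove the normal component so the prediction lies in the tangent space at x."""
    return v - np.dot(x, v) * x

def rfm_batch_loss(model, x1_batch, rng):
    """One evaluation of the RFM objective on S^2: sample (t, x0), build the
    geodesic interpolant, and compare the model's tangent-space prediction
    to the analytic target velocity."""
    total = 0.0
    for x1 in x1_batch:
        t = rng.uniform(0.0, 0.99)                    # avoid the t -> 1 endpoint
        x0 = rng.normal(size=3)
        x0 /= np.linalg.norm(x0)                      # base distribution: uniform on S^2
        xt = sphere_exp(x0, t * sphere_log(x0, x1))
        u_t = sphere_log(xt, x1) / (1.0 - t)          # analytic target velocity
        v = project_tangent(xt, model(t, xt))
        total += np.sum((v - u_t) ** 2)               # round-sphere metric on tangents
    return total / len(x1_batch)

rng = np.random.default_rng(1)
x1s = rng.normal(size=(16, 3))
x1s /= np.linalg.norm(x1s, axis=1, keepdims=True)
loss = rfm_batch_loss(lambda t, x: np.zeros(3), x1s, rng)   # zero model as a placeholder
```

With the zero placeholder model the loss reduces to the mean squared target-velocity norm, which on the sphere is bounded by $\pi^2$; this makes the sketch cheap to sanity-check.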
For manifolds lacking closed-form geodesics, RFM can use spectral approximations (eigenmaps and Laplace–Beltrami eigenfunctions) as surrogate premetrics (Chen et al., 2023, Huang et al., 2 Oct 2025).
Sampling requires integration of the trained vector field ODE
$$\frac{dx_t}{dt} = v_\theta(t, x_t), \qquad x_0 \sim p_0, \quad t \in [0, 1],$$
with geodesic projection at each step to respect manifold constraints.
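A minimal Euler sampler with tangent projection and re-normalization (a retraction) can be sketched on $S^2$; the constant rotational test field below is hypothetical, used only to verify that the integrator behaves as expected.

```python
import numpy as np

def sample_sphere_flow(v_theta, x0, n_steps=100):
    """Euler integration of dx/dt = v_theta(t, x) on the unit sphere,
    retracting back onto the manifold after every step."""
    x = x0 / np.linalg.norm(x0)
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = k * h
        v = v_theta(t, x)
        v = v - np.dot(x, v) * x       # project onto the tangent space at x
        x = x + h * v                  # Euler step in the ambient space
        x = x / np.linalg.norm(x)      # retract back onto the sphere
    return x

# Hypothetical test field: rotation about the z-axis at angular speed pi/2,
# so unit-time integration should carry (1,0,0) a quarter turn to (0,1,0).
field = lambda t, x: (np.pi / 2) * np.array([-x[1], x[0], 0.0])
x_end = sample_sphere_flow(field, np.array([1.0, 0.0, 0.0]), n_steps=1000)
assert np.allclose(x_end, [0.0, 1.0, 0.0], atol=1e-2)
```

The per-step renormalization is the geodesic projection referred to above; fancier schemes replace the Euler step plus retraction with an exponential-map update.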
4. Theoretical Guarantees and Convergence
RFM is consistent in the sense that as the learned field $v_\theta$ approaches the optimal conditional field, the pushforward law converges to the data law in distribution (Chen et al., 2023). Recent non-asymptotic analyses establish explicit total variation convergence rates of the form
$$\mathrm{TV}(\hat{p}_1, p_1) \le C_1 h + C_2 \varepsilon,$$
where $h$ is the Euler step size and $\varepsilon$ is the field approximation error (Guan et al., 5 Feb 2026). These results hold under smoothness and curvature conditions for compact and Hadamard manifolds, with explicit iteration complexities available for the hypersphere $\mathbb{S}^d$ and SPD matrices.
5. Applications and Empirical Advances
Riemannian Flow Matching forms the backbone of generative models in a diverse array of geometric domains:
- Geometry and shape synthesis: Learning flows on spheres, tori, hyperbolic spaces, and general surfaces yields high-fidelity data generation and tractable likelihood estimation (Chen et al., 2023, Davis et al., 24 Oct 2025).
- Protein, molecule, and material design: RFM powers multi-stage pipelines for molecular docking (Matcha (Frolova et al., 16 Oct 2025)), crystal-structure prediction (FlowMM (Miller et al., 2024)), and disordered material generation (DMFlow (Wu et al., 4 Feb 2026)).
- Graph and matrix data: Spectral Geodesic Flow Matching for graphs (SFMG (Huang et al., 2 Oct 2025)) and pullback-based RFM for SPD and correlation matrices (DiffeoCFM (Collas et al., 20 May 2025)).
- Robotics and control: RFM-based policies enable geometrically aware trajectory inference and real-time visuomotor control with competitive or superior smoothness and inference efficiency compared to diffusion models (Braun et al., 2024, Ding et al., 2024).
- Discrete and statistical manifolds: Fisher–Rao RFM underlies SOTA models for categorical and biological sequence data generation (Davis et al., 2024, Cheng et al., 2024).
Across these settings, RFM has demonstrated strong sample quality, simulation-free inference, geometric faithfulness, and efficient training.
6. Limitations, Open Directions, and Relations
Current implementations of RFM are most efficient when the target manifold admits closed-form geodesics and tractable exponential/logarithm maps; generalization to arbitrary manifolds often requires eigenfunction-based spectral distances (Chen et al., 2023). Limitations include potential scalability issues for very high dimensions (in the spectral case), and the need for further advances in self-distillation and few-step generative flows on highly curved or singular manifolds (Davis et al., 24 Oct 2025).
RFM is fundamentally distinct from stochastic score-based generative models (diffusions), simulation-heavy continuous normalizing flows, or variational endpoint-matching frameworks (RG-VFM (Zaghen et al., 18 Feb 2025)). The velocity-matching loss in RFM is unbiased and directly expresses the conditional optimal transport flow on the manifold, in contrast to endpoint-based objectives which differ outside of Euclidean space.
7. Empirical Benchmarks and Comparative Performance
A wide range of empirical studies have confirmed RFM's advantages:
- On the hypersphere $\mathbb{S}^d$, RFM and geometric variants substantially outperform Euclidean flows on standard density-estimation benchmarks;
- In molecular docking, RFM-based Matcha attains docking success rates (ligand RMSD below a fixed threshold) on the Astex benchmark competitive with AlphaFold 3, while running substantially faster than co-folding models (Frolova et al., 16 Oct 2025);
- In dense graph generation, SFMG matches state-of-the-art on degree, spectral, and clustering metrics with a substantial speedup over diffusion models (Huang et al., 2 Oct 2025);
- For brain connectivity matrices, DiffeoCFM yields better satisfaction of manifold constraints and higher F1-scores than post-projection diffusion and flow baselines (Collas et al., 20 May 2025);
- On real-world discrete-sequence benchmarks, RFM-based categorical models surpass Dirichlet and discrete-diffusion flows in NLL and domain-specific metrics (Davis et al., 2024, Cheng et al., 2024).
RFM thus defines a mathematically rigorous, computationally efficient, and empirically validated framework for manifold-aware generative modeling across continuous, discrete, and structured data domains.