Generalised Flow Maps for Few-Step Generative Modelling on Riemannian Manifolds (2510.21608v1)

Published 24 Oct 2025 in cs.LG

Abstract: Geometric data and purpose-built generative models on them have become ubiquitous in high-impact deep learning application domains, ranging from protein backbone generation and computational chemistry to geospatial data. Current geometric generative models remain computationally expensive at inference -- requiring many steps of complex numerical simulation -- as they are derived from dynamical measure transport frameworks such as diffusion and flow-matching on Riemannian manifolds. In this paper, we propose Generalised Flow Maps (GFM), a new class of few-step generative models that generalises the Flow Map framework in Euclidean spaces to arbitrary Riemannian manifolds. We instantiate GFMs with three self-distillation-based training methods: Generalised Lagrangian Flow Maps, Generalised Eulerian Flow Maps, and Generalised Progressive Flow Maps. We theoretically show that GFMs, under specific design decisions, unify and elevate existing Euclidean few-step generative models, such as consistency models, shortcut models, and meanflows, to the Riemannian setting. We benchmark GFMs against other geometric generative models on a suite of geometric datasets, including geospatial data, RNA torsion angles, and hyperbolic manifolds, and achieve state-of-the-art sample quality for single- and few-step evaluations, and superior or competitive log-likelihoods using the implicit probability flow.

Summary

  • The paper introduces Generalised Flow Maps, a novel framework for efficient few-step generative modelling on arbitrary Riemannian manifolds.
  • The method extends Euclidean flow maps to the manifold setting using innovative self-distillation objectives (G-LSD, G-ESD, G-PSD) to improve training stability.
  • Empirical results reveal up to 22× improvement in MMD and competitive NLLs across datasets including proteins, geospatial data, SO(3), and hyperbolic spaces.

Generalised Flow Maps for Few-Step Generative Modelling on Riemannian Manifolds

Introduction and Motivation

The paper introduces Generalised Flow Maps (GFM), a framework for few-step generative modelling on arbitrary Riemannian manifolds. The motivation stems from the computational inefficiency of existing geometric generative models, which require many steps of numerical simulation at inference due to their reliance on dynamical measure transport (e.g., diffusion models, flow matching). This inefficiency is exacerbated in geometric settings, where each simulation step involves numerically unstable and expensive manifold operations (e.g., exponential/logarithmic maps for Lie groups). GFM generalises the flow map paradigm from Euclidean spaces to Riemannian manifolds, enabling accelerated inference with high sample fidelity and competitive likelihoods.

Theoretical Framework

Riemannian Flow Maps

GFM extends the concept of flow maps to Riemannian manifolds $(\mathcal{M}, g)$, where $g$ is the metric tensor. The probability flow ODE on a manifold is given by:

$$\partial_t x_t = v_t(x_t), \qquad \partial_t \rho_t(x) = -\mathrm{div}_g(\rho_t(x)\, v_t(x))$$

where $v_t$ is a time-dependent vector field on the tangent bundle $\mathcal{T}\mathcal{M}$, and $\mathrm{div}_g$ is the Riemannian divergence.

The GFM is defined as a map $X_{s,t}: \mathcal{M} \to \mathcal{M}$ such that for any solution $(x_t)_{t \in [0,1]}$ of the flow, $X_{s,t}(x_s) = x_t$. The natural parametrisation is:

$$X_{s,t}(x_s) = \exp_{x_s}\big((t-s)\, v_{s,t}(x_s)\big)$$

where $\exp_{x_s}$ is the exponential map at $x_s$.
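
To make the parametrisation concrete, here is a minimal sketch on the unit sphere $\mathbb{S}^2 \subset \mathbb{R}^3$, where the exponential map has the closed form $\exp_x(v) = \cos(\|v\|)\,x + \sin(\|v\|)\,v/\|v\|$. The network `v_net(x, s, t)` and its signature are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of the flow-map parametrisation on the unit sphere S^2.
# `v_net(x, s, t)` is a hypothetical network returning ambient R^3 vectors.
import torch

def project_tangent(x, u):
    # Project an ambient vector u onto the tangent space T_x S^2.
    return u - (u * x).sum(dim=-1, keepdim=True) * x

def sphere_exp(x, v, eps=1e-8):
    # Closed-form exponential map on the sphere: exp_x(v) = cos(|v|) x + sin(|v|) v/|v|.
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.cos(norm) * x + torch.sin(norm) * v / norm

def flow_map(v_net, x_s, s, t):
    # X_{s,t}(x_s) = exp_{x_s}((t - s) * v_{s,t}(x_s)), with the network output
    # projected onto the tangent space to respect the manifold constraint.
    v = project_tangent(x_s, v_net(x_s, s, t))
    return sphere_exp(x_s, (t - s) * v)
```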

Characterisations and Self-Distillation

Three equivalent characterisations of the GFM are established:

  1. Generalised Lagrangian Condition: $\partial_t X_{s,t}(x_s) = v_t(X_{s,t}(x_s))$
  2. Generalised Eulerian Condition: $\partial_s X_{s,t}(x_s) + d(X_{s,t})_{x_s}[v_s(x_s)] = 0$
  3. Generalised Semigroup Condition: $X_{u,t}(X_{s,u}(x_s)) = X_{s,t}(x_s)$

These conditions yield three self-distillation objectives for training GFM: Generalised Lagrangian Self-Distillation (G-LSD), Generalised Eulerian Self-Distillation (G-ESD), and Generalised Progressive Self-Distillation (G-PSD). Each objective enforces the corresponding characterisation, and the use of stop-gradient operators stabilises training by treating one term as a "teacher".
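To illustrate how the stop-gradient teacher enters these objectives, here is a hedged sketch of a Lagrangian-style loss in the spirit of G-LSD, reusing the `flow_map` sketch above. It takes the instantaneous field as the diagonal $v_{t,t}$ and computes the time derivative of the flow map with a forward-mode JVP; the paper's exact weighting and time sampling may differ:

```python
import torch
from torch.func import jvp

def glsd_loss(v_net, x_s, s, t):
    # Time derivative of the flow map via forward-mode autodiff (a JVP in t).
    x_t, dxdt = jvp(lambda tt: flow_map(v_net, x_s, s, tt),
                    (t,), (torch.ones_like(t),))
    with torch.no_grad():
        # Teacher: the instantaneous vector field, read off the diagonal v_{t,t}.
        target = project_tangent(x_t, v_net(x_t, t, t))
    # Enforce the Lagrangian condition d/dt X_{s,t}(x_s) = v_t(X_{s,t}(x_s)).
    return ((dxdt - target) ** 2).sum(dim=-1).mean()
```

The Eulerian and progressive variants swap which characterisation is enforced; the progressive one, for instance, matches a single large step $X_{s,t}$ against two stop-gradient half-steps $X_{u,t} \circ X_{s,u}$.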

Implementation and Training

The GFM framework is instantiated with neural networks that output tangent vectors, projected onto the appropriate tangent space to ensure manifold constraints. Training involves sampling pairs $(x_0, x_1)$ from a coupling of the reference and target distributions, constructing interpolants via geodesics, and minimising the chosen self-distillation loss (optionally combined with a Riemannian Flow Matching loss on the diagonal).

Pseudocode for GFM training is provided, highlighting the modularity of the approach for different self-distillation objectives. The framework is compatible with modern autodiff libraries, leveraging forward-mode differentiation and Jacobian-vector products.
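Assembling the sketches above, a training step along these lines might look as follows on the sphere, where the geodesic interpolant is spherical linear interpolation (slerp). All names are carried over from the earlier sketches rather than from the paper's pseudocode:

```python
def slerp(x0, x1, tau, eps=1e-7):
    # Geodesic interpolant on the sphere (spherical linear interpolation).
    cos_omega = (x0 * x1).sum(dim=-1, keepdim=True).clamp(-1 + eps, 1 - eps)
    omega = torch.arccos(cos_omega)
    return (torch.sin((1 - tau) * omega) * x0
            + torch.sin(tau * omega) * x1) / torch.sin(omega)

def training_step(v_net, optimizer, x0, x1):
    # Sample times 0 <= s < t <= 1 and place the geodesic interpolant at s.
    s = torch.rand(x0.shape[0], 1)
    t = s + torch.rand(x0.shape[0], 1) * (1 - s)
    x_s = slerp(x0, x1, s)
    loss = glsd_loss(v_net, x_s, s, t)  # optionally + an RFM loss on the diagonal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```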

Empirical Evaluation

GFM is benchmarked on a suite of geometric datasets:

  • Protein and RNA torsion angles (flat tori $\mathbb{T}^2$, $\mathbb{T}^7$)
  • Geospatial catastrophes (Earth's surface $\mathbb{S}^2$)
  • Synthetic $SO(3)$ rotations
  • Hyperbolic geometry (Poincaré disk)

Sample quality is assessed via Maximum Mean Discrepancy (MMD) using the manifold's geodesic distance, and likelihoods are evaluated via negative log-likelihood (NLL) when available.
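
For reference, an MMD estimate of this kind can be sketched as below, using an RBF kernel on the geodesic distance (here the great-circle distance on the sphere); the kernel choice and bandwidth are illustrative assumptions, not the paper's evaluation code:

```python
def geodesic_dist(x, y, eps=1e-7):
    # Great-circle distance on the unit sphere.
    return torch.arccos((x * y).sum(dim=-1).clamp(-1 + eps, 1 - eps))

def geodesic_mmd2(x, y, bandwidth=1.0):
    # Biased MMD^2 estimate with an RBF kernel on the geodesic distance.
    k = lambda a, b: torch.exp(
        -geodesic_dist(a[:, None, :], b[None, :, :]) ** 2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```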

Protein Torsion Angles

GFM achieves superior or competitive NLLs compared to Riemannian Flow Matching (RFM) and other baselines. Notably, in the one-step regime (single function evaluation), GFM yields up to a $22\times$ improvement in MMD over RFM.

Figure 1: Ramachandran plots on the General protein dataset, comparing GFM variants and RFM; test-set samples are shown in red.

Figure 2: MMD on protein datasets as a function of the number of function evaluations (NFE), demonstrating GFM's efficiency in the low-NFE regime.

Geospatial Data on $\mathbb{S}^2$

GFM outperforms RFM and other baselines on three out of four Earth datasets in terms of NLL, and consistently achieves lower MMD for few-step inference.

Figure 3: MMD on Earth datasets against the NFE, showing accelerated convergence for GFM methods.

Figure 4: Density plots for volcano, earthquake, flood, and fire datasets; test-set samples in red, comparing GFM and RFM.

$SO(3)$ and Hyperbolic Manifolds

On $SO(3)$, GFM matches or exceeds RFM in MMD and NLL, demonstrating robustness to non-trivial manifold topology. On the hyperbolic Poincaré disk, GFM variants outperform RFM in MMD, especially in the few-step regime.

Figure 5: MMD on the hyperbolic dataset as a function of NFE, highlighting GFM's sample quality.

Practical and Theoretical Implications

GFM provides a principled and efficient approach for generative modelling on manifolds, enabling rapid inference with few steps and high sample fidelity. The framework unifies and extends Euclidean few-step models (consistency models, shortcut models, meanflows) to the Riemannian setting, and is compatible with arbitrary manifold structures, including Lie groups and spaces with non-trivial curvature.

The empirical results indicate that the Lagrangian self-distillation objective is most effective for high-quality sample generation, a phenomenon warranting further theoretical investigation. The implicit flow within the flow map can yield better likelihoods than direct flow matching, suggesting new directions for understanding the interplay between flow map parametrisation and generative performance.

Future Directions

Potential avenues for future research include:

  • Theoretical analysis of the superiority of the Lagrangian objective in the manifold setting
  • Extension to conditional generative modelling and structured data (e.g., graphs, molecules)
  • Integration with adaptive and parallel sampling schemes for further acceleration
  • Exploration of equivariant parametrisations for Lie groups and other symmetric spaces
  • Application to scientific domains requiring geometric inductive bias (e.g., protein design, material science)

Conclusion

Generalised Flow Maps constitute a versatile and efficient class of generative models for geometric data, supporting few-step inference on arbitrary Riemannian manifolds. The framework is theoretically grounded, empirically validated, and opens new possibilities for scalable generative modelling in domains where manifold structure is intrinsic. The modularity of the self-distillation objectives and compatibility with modern autodiff tools make GFM readily applicable to a wide range of problems in geometric deep learning and scientific machine learning.
