
Learning Riemannian Metrics

Updated 20 December 2025
  • Learning Riemannian metrics is the process of defining and optimizing smoothly varying positive-definite inner products on manifolds to compute distances via geodesics.
  • Approaches include constant, kernel-based, neural network, and pullback methods that adapt metrics to local data geometry, boosting representation accuracy.
  • Applications span manifold learning, generative modeling, optimal transport, and domain adaptation, leading to improved interpolation, clustering, and classification.

A Riemannian metric is a smoothly varying positive-definite inner product on each tangent space of a differentiable manifold, inducing a notion of length, area, volume, and most crucially—distance—via geodesic curve minimization. Learning Riemannian metrics refers to the data-driven or model-driven process of defining, estimating, or optimizing these metrics to more faithfully capture the intrinsic geometry of datasets or latent representations, particularly in scenarios where traditional Euclidean or Mahalanobis metrics are fundamentally inadequate. The field encompasses explicit matrix parameterizations, neural and kernel-based metric fields, pullback via generative mappings, and optimization through geometric principles on spaces of positive-definite matrices. Learned metrics yield improved fidelity in representation learning, optimal transport, generative modeling, trajectory inference, and a wide range of machine learning tasks sensitive to the underlying geometry.

1. Foundations of Riemannian Metric Learning

A Riemannian metric associates to each point $p$ of a manifold $M$ a symmetric positive-definite bilinear form $g_p: T_pM \times T_pM \to \mathbb{R}$. The geodesic distance induced by $g$ is computed by solving

$$d_g(x, y) = \inf_{\gamma(0)=x,\,\gamma(1)=y} \int_0^1 \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))}\,\mathrm{d}t,$$

where $\gamma$ ranges over piecewise $C^1$ curves on $M$, and the infimum is attained by a geodesic satisfying the geodesic equation $\nabla_{\dot\gamma}\dot\gamma = 0$ of the Levi-Civita connection (Gruffaz et al., 7 Mar 2025).

Metric learning in the Riemannian setting fundamentally differs from classical approaches by allowing the metric gg to vary over MM, enabling local adaptivity to data structure. Typical motivations include compensating for manifold curvature, density heterogeneity, nonlinear embedding geometry, and enforcing invariance properties.

Key tools include exponential and logarithm maps for moving between tangent spaces and the manifold, parallel transport for comparing tangent vectors, and the metric volume form $\mathrm{dvol}_g = \sqrt{\det g_x}\,\mathrm{d}x$ for density modeling (Lebanon, 2012).
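
To make the variational definition above concrete, the following minimal sketch approximates $d_g(x, y)$ by discretizing a curve and relaxing its interior points with gradient descent. It is not taken from the cited papers: the `metric` callable, the toy metric field, and all hyperparameters are illustrative assumptions.

```python
import torch

def curve_energy(points, metric):
    """Discrete energy of a curve given as a (T+1, d) tensor of points,
    under a metric field metric(x) -> (d, d) SPD matrix."""
    deltas = points[1:] - points[:-1]                    # (T, d) segment vectors
    midpoints = 0.5 * (points[1:] + points[:-1])         # evaluate g at segment midpoints
    G = torch.stack([metric(m) for m in midpoints])      # (T, d, d)
    return torch.einsum('ti,tij,tj->t', deltas, G, deltas).sum()

def geodesic_distance(x, y, metric, T=20, steps=500, lr=1e-2):
    """Approximate d_g(x, y) by relaxing the interior points of a discretized curve."""
    t = torch.linspace(0, 1, T + 1).unsqueeze(1)
    interior = (x + t * (y - x))[1:-1].clone().requires_grad_(True)  # straight-line init
    opt = torch.optim.Adam([interior], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pts = torch.cat([x.unsqueeze(0), interior, y.unsqueeze(0)], dim=0)
        curve_energy(pts, metric).backward()             # energy minimizers are geodesics
        opt.step()
    with torch.no_grad():
        pts = torch.cat([x.unsqueeze(0), interior, y.unsqueeze(0)], dim=0)
        deltas, mids = pts[1:] - pts[:-1], 0.5 * (pts[1:] + pts[:-1])
        G = torch.stack([metric(m) for m in mids])
        return torch.einsum('ti,tij,tj->t', deltas, G, deltas).sqrt().sum()

# Toy metric field that inflates lengths away from the origin (purely illustrative).
metric = lambda x: (1.0 + x @ x) * torch.eye(2)
d = geodesic_distance(torch.zeros(2), torch.ones(2), metric)
```

Minimizing the discrete energy rather than the length avoids the reparameterization invariance of the length functional; the final line then reports the accumulated segment lengths under the learned metric.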

2. Parametrization Strategies and Model Classes

The landscape of Riemannian metric learning models is broad, with the following principal parametric strategies (Gruffaz et al., 7 Mar 2025, Scarvelis et al., 2022):

| Approach | Typical parameterization | Notes |
|---|---|---|
| Constant global metric | $g_x \equiv G$, with $G \succ 0$ | Mahalanobis; convex in $G$ |
| Piecewise-constant | $g_x = \sum_\ell G_\ell\, \mathbb{I}_{V_\ell}(x)$ | Cluster-wise metric blocks |
| Kernel-based metric field | $g_x = \sum_{\ell=1}^L G_\ell\, k(x, c_\ell)$ | Smooth interpolation by kernels |
| Neural network field | $g_x = A_\theta(x)^\top A_\theta(x) + \varepsilon I$ | Fully learnable; ensures $g_x \succ 0$ |
| Pullback via map $f$ | $g_x(v, w) = g^N_{f(x)}\bigl(Df(x)\,v,\, Df(x)\,w\bigr)$ | Latent-variable / deep generative modeling |

Pullback metrics are especially prominent in latent-variable models and manifold learning, where the decoder or embedding map $f$ implicitly induces a metric on the latent space or embedded manifold by pulling back the ambient metric of the output space. This approach generalizes kernel Mahalanobis learning, autoencoder representations, and Gaussian Process Latent Variable Models (Tosi et al., 2014, Rozo et al., 7 Mar 2025).
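
The "neural network field" row of the table can be realized directly in code. The sketch below is a minimal, assumed implementation (class and parameter names are not from the cited works): a small network outputs $A_\theta(x)$, and $A_\theta(x)^\top A_\theta(x) + \varepsilon I$ guarantees a strictly positive-definite metric at every point.

```python
import torch
import torch.nn as nn

class NeuralMetricField(nn.Module):
    """Spatially varying metric g_x = A_theta(x)^T A_theta(x) + eps * I (always SPD)."""
    def __init__(self, dim, hidden=64, eps=1e-3):
        super().__init__()
        self.dim, self.eps = dim, eps
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(),
            nn.Linear(hidden, dim * dim),          # entries of A_theta(x)
        )

    def forward(self, x):                          # x: (batch, dim)
        A = self.net(x).view(-1, self.dim, self.dim)
        G = A.transpose(1, 2) @ A                  # A^T A is positive semi-definite
        return G + self.eps * torch.eye(self.dim, device=x.device)  # shift to strictly PD

    def squared_norm(self, x, v):
        """Squared length g_x(v, v) of tangent vectors v at points x."""
        return torch.einsum('bi,bij,bj->b', v, self.forward(x), v)

g = NeuralMetricField(dim=2)
x, v = torch.randn(8, 2), torch.randn(8, 2)
print(g.squared_norm(x, v))                        # strictly positive for nonzero v
```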

3. Optimization Principles and Metric Estimation Algorithms

Metric learning objectives generalize standard contrastive and triplet losses to geodesic distances $d_g$, enabling supervised, semi-supervised, or unsupervised fitting of $g$ (Gruffaz et al., 7 Mar 2025).
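
As a small hedged sketch of this generalization, a triplet objective can be written over any differentiable geodesic distance; `dist_g` is an assumed routine (for instance the energy-minimization sketch in Section 1), not an API from the cited works.

```python
import torch

def geodesic_triplet_loss(anchor, positive, negative, dist_g, margin=1.0):
    """Triplet loss with an arbitrary differentiable geodesic distance dist_g(x, y);
    gradients flow through dist_g into the parameters of the learned metric."""
    d_pos = dist_g(anchor, positive)   # distance to the point labeled "similar"
    d_neg = dist_g(anchor, negative)   # distance to the point labeled "dissimilar"
    return torch.relu(d_pos - d_neg + margin)
```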

Explicit Metric Learning

  • On SPD matrix manifolds (e.g., covariance features), the log-Euclidean and affine-invariant metrics provide closed-form geodesic distances. One learns a Mahalanobis metric $W$ in the log-domain via ITML subject to similarity/dissimilarity constraints and regularization (Vemulapalli et al., 2015); the optimization employs Bregman projections and is convex in $W$. A minimal distance computation in this style is sketched after this list.
  • Adaptive Log-Euclidean Metrics (ALEMs) further generalize by parameterizing the base of logarithms for each eigenvalue; this introduces learnable parameters $\theta$ that adapt to data, yielding a family of flat metrics and enabling integration and optimization in deep networks (Chen et al., 2023).
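
As referenced above, the following minimal sketch computes a log-Euclidean Mahalanobis distance between SPD matrices. The matrix `W` is a placeholder (identity); the ITML optimization that would learn it from constraints is not shown.

```python
import numpy as np
from scipy.linalg import logm

def log_euclidean_mahalanobis(X, Y, W):
    """Distance between SPD matrices X, Y: Mahalanobis distance (PSD matrix W)
    between their matrix logarithms, flattened to vectors."""
    d = np.real(logm(X)).ravel() - np.real(logm(Y)).ravel()
    return float(np.sqrt(d @ W @ d))

# Toy SPD inputs and an (untrained) identity Mahalanobis matrix W.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); X = A @ A.T + 3 * np.eye(3)
B = rng.standard_normal((3, 3)); Y = B @ B.T + 3 * np.eye(3)
W = np.eye(9)   # ITML would learn W from similarity/dissimilarity constraints
print(log_euclidean_mahalanobis(X, Y, W))
```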

Volume-based and Density-driven Learning

Volume contraction methods maximize the inverse Riemannian volume at observed data points, formulating a log-likelihood objective under an "inverse volume" density:

$$p(x; \theta) = \frac{1}{\sqrt{\det g_\theta(x)}} \Big/ Z(\theta), \quad Z(\theta) = \int_M \frac{1}{\sqrt{\det g_\theta(x)}}\,\mathrm{dvol}_{g_\theta}(x).$$

This approach yields strictly concave loss functions and unique global optimizers for parametric metric families in settings such as the multinomial simplex (Lebanon, 2012).
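
A small sketch of the data-dependent part of this objective, assuming a batched `metric` callable like the neural-field sketch above; the normalizer $Z(\theta)$ is deliberately omitted and must be handled separately (analytically or by numerical integration over $M$).

```python
import torch

def inverse_volume_log_likelihood(points, metric):
    """Unnormalized inverse-volume log-likelihood: sum_i -0.5 * log det g_theta(x_i).
    The -n * log Z(theta) term is omitted here and must be added to obtain the
    full objective."""
    G = metric(points)                  # (n, d, d) batch of SPD metric tensors
    return -0.5 * torch.logdet(G).sum()
```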

Metric Learning via Optimal Transport

Optimal transport-based models learn the ground metric (defining transport costs) and the transport plan jointly by alternating optimization. The metric tensor $g_\theta(x)$ may be neurally parameterized (e.g., with inverse $Q_\theta(x)^\top Q_\theta(x) + \varepsilon I$). The optimization leverages the dual 1-Wasserstein objective and regularizes the metric network to avoid degenerate metrics. Learned metrics produce nonlinear interpolations and geodesic paths between probability measures, improving trajectory inference (Scarvelis et al., 2022, Jawanpuria et al., 16 Sep 2024).
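
The sketch below shows only the alternating structure, under several assumptions that are not from the cited papers: uniform-weight, equal-size point clouds, an exact assignment solver in place of the dual Wasserstein formulation, a one-segment quadratic surrogate for the ground cost, and an abstract `task_loss` standing in for the paper-specific objective plus metric regularization.

```python
import torch
from scipy.optimize import linear_sum_assignment

def ground_cost(X, Y, metric):
    """Pairwise cost c(x, y) = (y - x)^T g((x + y)/2) (y - x): a one-segment
    quadratic surrogate for squared geodesic length under the metric field."""
    n, m, d = X.shape[0], Y.shape[0], X.shape[1]
    diff = (Y[None, :, :] - X[:, None, :]).reshape(-1, d)        # (n*m, d)
    mid = (0.5 * (Y[None, :, :] + X[:, None, :])).reshape(-1, d)
    G = metric(mid)                                              # (n*m, d, d) SPD tensors
    return torch.einsum('ki,kij,kj->k', diff, G, diff).reshape(n, m)

def alternate_step(X, Y, metric, opt, task_loss):
    # (i) Fix the metric and solve the transport plan (exact assignment
    #     for uniform weights and equal-size clouds).
    with torch.no_grad():
        C = ground_cost(X, Y, metric)
    rows, cols = linear_sum_assignment(C.cpu().numpy())
    rows, cols = torch.as_tensor(rows), torch.as_tensor(cols)
    # (ii) Fix the plan and take a gradient step on the metric parameters.
    opt.zero_grad()
    transport_cost = ground_cost(X, Y, metric)[rows, cols].mean()
    loss = task_loss(transport_cost, metric)    # task objective + metric regularizer
    loss.backward()
    opt.step()
    return float(loss)
```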

4. Metric Estimation in Latent Variable and Manifold Models

When smooth mappings $f: \mathbb{R}^q \to \mathbb{R}^p$ (or into a Riemannian manifold $(M, g)$) are fitted to data (e.g., in GP-LVMs, VAEs, or WGPLVMs), the pullback metric on latent space is

$$g(x) = J(x)^\top J(x),$$

where $J(x) = \partial f / \partial x$ is the Jacobian. In probabilistic latent models where $f$ is random (e.g., Gaussian process prior), the metric tensor $g(x)$ becomes a random variable (Wishart distributed), and one works with its expectation,

$$\mathbb{E}[g(x)] = \mathbb{E}[J(x)]^\top \mathbb{E}[J(x)] + p\,\mathrm{Cov}(J(x), J(x))$$

to define the geometry (Tosi et al., 2014, Rozo et al., 7 Mar 2025).
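
For the deterministic case $g(x) = J(x)^\top J(x)$, the pullback metric can be obtained directly by automatic differentiation; the minimal sketch below uses a purely illustrative toy `decoder` and does not show the expected-metric (Wishart) treatment needed when $f$ is random.

```python
import torch

def pullback_metric(f, x):
    """Pullback metric g(x) = J(x)^T J(x) of a smooth map f: R^q -> R^p
    under the ambient Euclidean metric, with J the Jacobian of f at x."""
    J = torch.autograd.functional.jacobian(f, x)   # shape (p, q)
    return J.T @ J                                  # (q, q), positive semi-definite

# Toy "decoder" from a 2-D latent space into R^3 (illustrative only).
decoder = lambda z: torch.stack([z[0], z[1], torch.sin(z[0]) * torch.cos(z[1])])
G = pullback_metric(decoder, torch.tensor([0.3, -0.8]))
print(G)   # latent curve lengths are measured through the decoder via this tensor
```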

Learning the pullback Riemannian metric is essential for geometry-aware interpolation and geodesic computation in generative modeling and for correct synthesis of data sequences (robot motion, functional connectome trajectories, etc.). Wrapped GP models employ tangent-space Gaussian processes "wrapped" via the exponential map to ensure outputs lie on curved manifolds (spheres, SPD, etc.), and provide closed-form formulas for expected metric tensors (Rozo et al., 7 Mar 2025).

5. Applications and Empirical Outcomes

The successful learning of Riemannian metrics has produced measurable advances across several fields:

  • Dimension reduction and manifold learning: Estimating push-forward metrics enables recovery of true geometric quantities in embedding spaces, correcting for distortions and allowing true distance, angle, area, and volume computations. The Laplacian-based estimator of Perrault-Joncas & Meilă augments manifold learning outputs with estimated Riemannian metrics, guaranteeing geometrically meaningful embeddings (Perraul-Joncas et al., 2013).
  • Probabilistic generative modeling: Geodesic-based latent interpolation guided by the expected pullback metric leads to more realistic data generation, particularly in image and motion synthesis tasks, avoiding high-uncertainty regions and preserving invariants such as limb lengths in motion-capture data (Tosi et al., 2014).
  • Optimal transport and domain adaptation: Learning a data-adaptive ground metric on the SPD matrix manifold improves performance in transfer learning and domain adaptation, with robustness to label skew or domain shifts (Jawanpuria et al., 16 Sep 2024).
  • Causal inference and trajectory fitting: Riemannian metrics provide finer control over longitudinal and causal matching, with metric learning yielding lower bias and improved fit in population-level models (Gruffaz et al., 7 Mar 2025).
  • Deep neural networks on SPD manifolds: Adaptively learned log-Euclidean metrics (via ALEM) are integrated into SPDNet architectures, improving classification accuracy and robustness with minimal computational overhead (Chen et al., 2023).

Empirical results across these domains consistently demonstrate that adapting the metric to the data's geometry yields superior interpolations, reconstructions, clustering, and classification compared to any fixed Euclidean or Mahalanobis metric.

6. Computational and Theoretical Considerations

Metric learning in the Riemannian domain spans a wide range of computational complexity, from convex Bregman projections for log-Euclidean metrics in moderate dimensions (Vemulapalli et al., 2015), to scalable neural network optimization for spatially varying metric fields (Scarvelis et al., 2022), and efficient Laplacian-based estimators for manifold learning (Perraul-Joncas et al., 2013). In SPD manifolds and ground metric learning, affine-invariant geometry confers closed-form geodesic distances and geometric mean solutions (Jawanpuria et al., 16 Sep 2024).
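
To make the closed-form claim concrete, here is a small sketch of the affine-invariant geodesic distance between SPD matrices, a standard formula independent of any particular paper's pipeline:

```python
import numpy as np
from scipy.linalg import sqrtm, logm

def affine_invariant_distance(X, Y):
    """Closed-form geodesic distance on the SPD manifold:
    d(X, Y) = || log(X^{-1/2} Y X^{-1/2}) ||_F."""
    X_inv_sqrt = np.linalg.inv(np.real(sqrtm(X)))
    M = X_inv_sqrt @ Y @ X_inv_sqrt
    return float(np.linalg.norm(np.real(logm(M)), 'fro'))

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)); X = A @ A.T + 4 * np.eye(4)
B = rng.standard_normal((4, 4)); Y = B @ B.T + 4 * np.eye(4)
print(affine_invariant_distance(X, Y))   # invariant under X, Y -> P X P^T, P Y P^T
```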

Regularization schemes (e.g., log-det penalty, Frobenius norm, inverse volume) are vital for statistical identifiability and numerical stability. Implementation requires differentiation through ODE-based geodesic solvers, matrix exponentials/logarithms, and optimization on Riemannian manifolds (via retractions or exponential maps) (Gruffaz et al., 7 Mar 2025).

Theoretical properties—strict concavity of log-likelihood, guarantee of global optima, statistical consistency under sufficient sampling density, and smoothness of learned metrics—have been established for principal models on the simplex, SPD matrices, and pullback metrics.

7. Open Problems and Future Directions

Despite substantial advances, several challenges remain (Gruffaz et al., 7 Mar 2025):

  • Extending statistical learning theory to nonlinear, neural, and kernelized metric families;
  • Developing scalable and provably accurate approximations for geodesic computation in high-dimensional spaces;
  • Integrating curvature (Ricci, sectional) into loss and generalization analysis;
  • Designing geometry-aware second-order optimizers for infinite-dimensional metric manifolds;
  • Creating unified software libraries capable of handling manifold operations, metric learning, and their integration into downstream machine learning workflows.

Riemannian metric learning fundamentally enriches the modeling and analysis toolkit, enabling the encoding of intrinsic data geometry and supporting complex downstream inference tasks previously unattainable with classical metrics.
