Riemannian Metric Learning
- Riemannian metric learning is a family of techniques that optimize local metric tensors on smooth manifolds to handle non-Euclidean data effectively.
- It utilizes advanced optimization methods over SPD matrices and pullback strategies in deep models for enhanced classification, domain adaptation, and generative tasks.
- Empirical studies show significant improvements (e.g., up to 20% in classification and clustering accuracy) while ensuring robustness in noisy or high-dimensional settings.
Riemannian metric learning is the family of methods for estimating, adapting, or optimizing Riemannian metrics in machine learning problems where non-Euclidean data geometry is crucial. Unlike classical (Euclidean) metric learning, which selects a single fixed global positive-definite matrix, Riemannian metric learning operates on smooth manifolds, equipping each point with a symmetric positive-definite tensor encoding local inner products on tangent spaces. This general framework supports a wide range of applications, from manifold-aware classification and generative modeling to optimal transport, representation learning, and domain adaptation (Gruffaz et al., 7 Mar 2025). Key advances include convex and non-convex optimization on matrix manifolds, learned ground metrics for transport, closed-form geometric-mean algorithms, deep neural pullback metrics, and strategies for isometric immersion and robustness. The following sections review foundational concepts, learning objectives, computational procedures, and empirical phenomena across leading subfields.
1. Mathematical Foundations: Riemannian Metrics and Geodesics
A Riemannian metric $g$ on a smooth $d$-dimensional manifold $\mathcal{M}$ is a smooth assignment $x \mapsto g(x)$ of symmetric positive-definite tensors, inducing a point-dependent inner product $\langle u, v \rangle_{g(x)} = u^\top g(x)\, v$ on each tangent space $T_x\mathcal{M}$ (Gruffaz et al., 7 Mar 2025). For a finite-dimensional parametric family $\{g_\theta\}_\theta$, the metric determines key geometric quantities (a discretized numerical sketch follows the list):
- Geodesic length: For a curve $\gamma: [0,1] \to \mathcal{M}$, $L(\gamma) = \int_0^1 \sqrt{\dot\gamma(t)^\top g(\gamma(t))\, \dot\gamma(t)}\; dt$.
- Riemannian distance: $d_g(x, y) = \inf_{\gamma:\, \gamma(0)=x,\ \gamma(1)=y} L(\gamma)$.
- Volume element: $dV_g(x) = \sqrt{\det g(x)}\; dx$.
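As a concrete illustration of these quantities, the following minimal sketch (plain NumPy; `metric` is an assumed user-supplied metric tensor field, and the conformal example metric is purely illustrative) approximates the length of a discretized curve by summing segment lengths under the local metric:

```python
import numpy as np

def curve_length(points, metric):
    """Approximate the Riemannian length of a polyline.

    points : (T, d) array of curve samples gamma(t_0), ..., gamma(t_{T-1}).
    metric : callable x -> (d, d) SPD matrix g(x).
    """
    length = 0.0
    for a, b in zip(points[:-1], points[1:]):
        v = b - a                      # finite-difference displacement (velocity * dt)
        g = metric(0.5 * (a + b))      # metric evaluated at the segment midpoint
        length += np.sqrt(v @ g @ v)   # segment contribution sqrt(v^T g v)
    return length

# Example: a conformal metric g(x) = (1 + ||x||^2) * I on R^2
metric = lambda x: (1.0 + x @ x) * np.eye(2)
t = np.linspace(0.0, 1.0, 200)
straight = np.stack([t, t], axis=1)    # straight segment from (0,0) to (1,1)
print(curve_length(straight, metric))  # > sqrt(2), reflecting the conformal factor
```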
On important manifolds (e.g., the cone of symmetric positive-definite matrices $\mathbb{S}_{++}^d$), canonical metrics include the affine-invariant metric, with distance $d_{AI}(X, Y) = \|\log(X^{-1/2} Y X^{-1/2})\|_F$, and log-Euclidean metrics derived via pullback by the matrix logarithm, with distance $d_{LE}(X, Y) = \|\log X - \log Y\|_F$ (Gruffaz et al., 7 Mar 2025, Chen et al., 2023, Vemulapalli et al., 2015). The matrix geodesic $\gamma(t) = X^{1/2}\big(X^{-1/2} Y X^{-1/2}\big)^{t} X^{1/2}$ describes the shortest path between two SPD matrices $X$ and $Y$ under the affine-invariant geometry (Zadeh et al., 2016).
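The canonical SPD formulas above translate directly into code; the sketch below (NumPy/SciPy; helper names are illustrative rather than taken from any particular library) computes both distances and a point on the affine-invariant geodesic:

```python
import numpy as np
from scipy.linalg import logm, fractional_matrix_power as fmp

def affine_invariant_dist(X, Y):
    """d_AI(X, Y) = || log(X^{-1/2} Y X^{-1/2}) ||_F on SPD matrices."""
    Xinv_sqrt = fmp(X, -0.5)
    return np.linalg.norm(logm(Xinv_sqrt @ Y @ Xinv_sqrt), "fro")

def log_euclidean_dist(X, Y):
    """d_LE(X, Y) = || log X - log Y ||_F (pullback by the matrix logarithm)."""
    return np.linalg.norm(logm(X) - logm(Y), "fro")

def spd_geodesic(X, Y, t):
    """gamma(t) = X^{1/2} (X^{-1/2} Y X^{-1/2})^t X^{1/2}, with gamma(0)=X, gamma(1)=Y."""
    X_sqrt, Xinv_sqrt = fmp(X, 0.5), fmp(X, -0.5)
    return X_sqrt @ fmp(Xinv_sqrt @ Y @ Xinv_sqrt, t) @ X_sqrt

# Two random SPD matrices and their geodesic midpoint (the matrix geometric mean)
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)); X = A @ A.T + np.eye(3)
B = rng.normal(size=(3, 3)); Y = B @ B.T + np.eye(3)
mid = spd_geodesic(X, Y, 0.5)
print(affine_invariant_dist(X, Y), log_euclidean_dist(X, Y))
```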
2. Core Metric Learning Methodologies
Riemannian metric learning comprises supervised, semi-supervised, and unsupervised variants, often uniting classical objectives with manifold-specific geometry. Dominant paradigms include:
a) Covariance and Mahalanobis-based Learning on SPD Manifolds
Given sets of similar ($\mathcal{S}$) and dissimilar ($\mathcal{D}$) data pairs in $\mathbb{R}^d$, classical Mahalanobis learning solves for an SPD matrix $A$ minimizing (resp. maximizing) the pairwise squared distances $d_A^2(x, y) = (x - y)^\top A\, (x - y)$ over similar (resp. dissimilar) pairs, subject to convex or geodesically convex objectives. In Riemannian frameworks, covariances and geometric means play a central role:
- Geometric Mean Metric Learning (GMML): Solves $\min_{A \succ 0}\ \mathrm{tr}(A S) + \mathrm{tr}(A^{-1} D)$, where $S = \sum_{(x_i, x_j) \in \mathcal{S}} (x_i - x_j)(x_i - x_j)^\top$ and $D = \sum_{(x_i, x_j) \in \mathcal{D}} (x_i - x_j)(x_i - x_j)^\top$, and finds the unique SPD solution $A = S^{-1} \#_{1/2}\, D$; this is the geodesic midpoint between $S^{-1}$ and $D$ (Zadeh et al., 2016). A closed-form sketch appears after this list.
- Log-Euclidean Metric Learning parameterizes the distance as $d_M^2(X_i, X_j) = \mathrm{vec}(\log X_i - \log X_j)^\top M\, \mathrm{vec}(\log X_i - \log X_j)$ for SPD inputs $X_i, X_j$ and a learned matrix $M \succeq 0$ on the log-domain, solved via ITML or regularized Bregman projections (Vemulapalli et al., 2015).
- RGML: Adapts to class structure by learning per-class covariance matrices $A_k$ together with their Riemannian barycenter $\bar{A} = \arg\min_{A \succ 0} \sum_k d_{AI}^2(A, A_k)$. This Fréchet-mean objective in the affine-invariant metric is strictly geodesically convex, so the barycenter is unique (Collas et al., 2022).
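A minimal sketch of the GMML closed form referenced above, assuming NumPy/SciPy and pair lists given as index tuples (function and variable names are illustrative):

```python
import numpy as np
from scipy.linalg import fractional_matrix_power as fmp

def scatter(X, pairs):
    """Sum of (x_i - x_j)(x_i - x_j)^T over the given index pairs."""
    diffs = np.array([X[i] - X[j] for i, j in pairs])
    return diffs.T @ diffs

def gmml(X, similar_pairs, dissimilar_pairs, reg=1e-6):
    """Closed-form GMML metric: geometric mean of S^{-1} and D (Zadeh et al., 2016)."""
    d = X.shape[1]
    S = scatter(X, similar_pairs) + reg * np.eye(d)     # similarity scatter
    D = scatter(X, dissimilar_pairs) + reg * np.eye(d)  # dissimilarity scatter
    # Geometric mean  S^{-1} # D = S^{-1/2} (S^{1/2} D S^{1/2})^{1/2} S^{-1/2}
    S_half, S_neg_half = fmp(S, 0.5), fmp(S, -0.5)
    A = S_neg_half @ fmp(S_half @ D @ S_half, 0.5) @ S_neg_half
    return A  # SPD matrix defining d_A^2(x, y) = (x - y)^T A (x - y)

# Toy usage: two noisy clusters in R^5
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
sim = [(i, j) for i in range(20) for j in range(i + 1, 20)][:50]
dis = [(i, j + 20) for i in range(20) for j in range(5)][:50]
A = gmml(X, sim, dis)
```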
b) Adaptive and Pullback Metrics
- Adaptive Log-Euclidean Metrics (ALEMs): Generalize log-Euclidean metrics by learning data-adaptive, eigenbasis-dependent logarithm maps in place of the fixed matrix logarithm, producing closed-form log-Euclidean-style distances with learnable parameters (Chen et al., 2023).
- Deep and Latent Models: In autoencoders and variational models, the Riemannian metric on latent coordinates is obtained by pullback of the observed data's natural metric via the decoder Jacobian, i.e., $G(z) = J_f(z)^\top\, G_X\!\big(f(z)\big)\, J_f(z)$ for a decoder $f$ with Jacobian $J_f$ (Rozo et al., 7 Mar 2025, Arvanitidis et al., 2021); a numerical sketch follows this list. Geometry-aware deep architectures parameterize and learn such pullback (or warped) metrics for complex tasks (Sun et al., 16 Oct 2024, Chen et al., 23 Sep 2024).
- Surrogate Conformal Metrics: Approximate the pullback metric with a conformal surrogate $G(z) = \alpha(z)\, I_d$, where the scalar weight $\alpha(z)$ is learned through energy-based modeling of the latent data density, yielding robust geodesic computations in higher dimensions (Arvanitidis et al., 2021).
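To illustrate the pullback construction numerically, the sketch below (plain NumPy; the decoder is a toy two-layer map rather than any published architecture, and the data-space metric is assumed Euclidean) forms the latent metric from a finite-difference Jacobian of the decoder:

```python
import numpy as np

def decoder(z, W1, W2):
    """Toy decoder f: R^k -> R^D (tanh MLP); stands in for a trained network."""
    return W2 @ np.tanh(W1 @ z)

def pullback_metric(z, f, eps=1e-5):
    """G(z) = J_f(z)^T J_f(z), with the Jacobian estimated by central differences."""
    k = z.shape[0]
    J = np.stack([(f(z + eps * e) - f(z - eps * e)) / (2 * eps)
                  for e in np.eye(k)], axis=1)   # (D, k) Jacobian of the decoder
    return J.T @ J                               # (k, k), SPD whenever J has full rank

rng = np.random.default_rng(2)
W1, W2 = rng.normal(size=(16, 2)), rng.normal(size=(5, 16))
f = lambda z: decoder(z, W1, W2)
G = pullback_metric(np.array([0.3, -0.7]), f)
print(np.linalg.eigvalsh(G))  # positive eigenvalues: a valid local metric tensor
```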
c) Metric Learning for Optimal Transport and Domain Adaptation
- Ground Metric Learning: Approaches for optimal transport directly learn the SPD ground metric $M$ underlying the transport cost $c_M(x, y) = (x - y)^\top M\, (x - y)$. The metric is optimized jointly with OT plans via Riemannian-gradient or alternating schemes (Jawanpuria et al., 16 Sep 2024, Scarvelis et al., 2022), raising classification and adaptation accuracy beyond fixed costs; a Sinkhorn-based sketch follows this list.
- Metric Fields: Generalizations include neural parametrizations of spatially varying metric tensors $x \mapsto A(x) \in \mathbb{S}_{++}^d$, optimized via adversarial or primal-dual frameworks, enabling nonlinear interpolation between probability measures and capturing population-level geometric information (Scarvelis et al., 2022, Wang et al., 2020).
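To make the ground-metric idea concrete, the sketch below (plain NumPy; a self-contained entropic OT solver rather than any specific published algorithm) computes a transport plan under a Mahalanobis ground cost induced by a candidate SPD matrix M; alternating this solve with a Riemannian update of M is the general pattern described above:

```python
import numpy as np

def mahalanobis_cost(X, Y, M):
    """C[i, j] = (x_i - y_j)^T M (x_i - y_j): ground cost induced by the SPD matrix M."""
    diff = X[:, None, :] - Y[None, :, :]          # (n, m, d) pairwise differences
    return np.einsum("nmd,de,nme->nm", diff, M, diff)

def sinkhorn(a, b, C, reg=0.1, n_iter=200):
    """Entropic-regularized OT plan between histograms a and b for cost matrix C."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]            # transport plan

rng = np.random.default_rng(3)
X, Y = rng.normal(0, 1, (30, 4)), rng.normal(1, 1, (40, 4))
a, b = np.full(30, 1 / 30), np.full(40, 1 / 40)
M = np.eye(4)                                     # candidate ground metric (to be learned)
C = mahalanobis_cost(X, Y, M)
plan = sinkhorn(a, b, C / C.max())                # normalize cost for numerical stability
cost = np.sum(plan * C)                           # transport objective, differentiable in M
```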
d) Kernel and Multi-Manifold Approaches
- Riemannian Kernel Embeddings: Representation learning for structured objects (e.g., image sets) leverages kernelized metric learning in reproducing kernel Hilbert spaces using Riemannian distances (e.g., log-Euclidean, projection metrics). Joint Mahalanobis-like objectives learn low-dimensional projections in fused RKHSs (Wang et al., 2019, Huang et al., 2016).
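One common building block in such kernelized approaches is a Gaussian kernel on the log-Euclidean distance between SPD descriptors; a minimal sketch (NumPy/SciPy, illustrative bandwidth) is given below.

```python
import numpy as np
from scipy.linalg import logm

def log_euclidean_kernel(spd_mats, gamma=0.1):
    """Gram matrix K[i, j] = exp(-gamma * ||log X_i - log X_j||_F^2).

    The log-Euclidean Gaussian kernel is positive definite, so K can feed any
    kernel method (SVM, kernel metric learning) over SPD-valued descriptors.
    """
    logs = [logm(X) for X in spd_mats]
    n = len(logs)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = np.exp(-gamma * np.linalg.norm(logs[i] - logs[j], "fro") ** 2)
    return K

# Example: kernel over covariance descriptors of small random feature sets
rng = np.random.default_rng(4)
covs = [np.cov(rng.normal(size=(10, 50))) + 1e-3 * np.eye(10) for _ in range(5)]
K = log_euclidean_kernel(covs)
```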
e) Metric Estimation from Manifold Data
- Non-parametric Metric Recovery: Algorithms estimate the Riemannian metric tensor in embedded coordinate systems by Laplace-Beltrami operator inversion, correcting for embedding distortion and enabling geometric computations (lengths, angles, areas) on learned representations (Perraul-Joncas et al., 2013).
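A rough sketch of this recovery idea follows (plain NumPy; a simple Gaussian-kernel graph Laplacian stands in for the Laplace-Beltrami operator, and kernel normalization details are simplified, so this is an approximation of the published procedure rather than a faithful reimplementation):

```python
import numpy as np

def graph_laplacian(X, eps=0.5):
    """Random-walk graph Laplacian, a crude approximation of Laplace-Beltrami."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-D2 / eps)
    P = W / W.sum(axis=1, keepdims=True)
    return (P - np.eye(len(X))) / eps  # acts on functions sampled at the points

def dual_metric_estimate(L, F):
    """Estimate the inverse (dual) metric at each point from embedding coords F (n, s):
    h_ij = 0.5 * [ L(f_i f_j) - f_i L f_j - f_j L f_i ]."""
    n, s = F.shape
    LF = L @ F
    H = np.empty((n, s, s))
    for i in range(s):
        for j in range(s):
            H[:, i, j] = 0.5 * (L @ (F[:, i] * F[:, j])
                                - F[:, i] * LF[:, j] - F[:, j] * LF[:, i])
    return H  # pseudo-invert per point to obtain the Riemannian metric estimate

X = np.random.default_rng(5).normal(size=(200, 3))   # ambient samples
F = X[:, :2] * np.array([2.0, 0.5])                  # a deliberately distorted 2-D embedding
H = dual_metric_estimate(graph_laplacian(X), F)      # (200, 2, 2) dual-metric field
```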
3. Optimization Algorithms and Geometric Procedures
Riemannian metric learning engages specialized optimization techniques adapted to matrix or function manifolds, with key components:
- Riemannian gradient descent: Standard update leveraging manifold exponential and logarithm maps, supporting both SPD matrices and general manifold settings (Gruffaz et al., 7 Mar 2025, Jawanpuria et al., 16 Sep 2024); a minimal SPD sketch appears after this list.
- Proximal and primal-dual algorithms: For constrained or regularized objectives, proximal mapping and dual ascent steps are carried out intrinsically on the manifold, using closed-form retractions, QR or SVD-based updates, and slack variable projections (Wang et al., 2020, Dutta et al., 2020).
- Geodesic computation: Shortest-path or ellipsoidal structures (e.g., on SPD manifolds) are harnessed for direct calculation of distances and means. Neural ODEs, graph-based Dijkstra, or energy minimization approaches are used for general or learned metrics, including in deep and generative settings (Rozo et al., 7 Mar 2025, Sun et al., 16 Oct 2024).
- Kernel and RKHS optimization: Multi-kernel learning involves alternating updates of projection matrices and gating weights for composite Hilbert space distances, often using trace-ratio criteria (Wang et al., 2019).
- Isometric immersion and isometry loss: Neural network models integrate geometric losses enforcing local isometry between reconstructed data distances and pulled-back latent distances, leading to low-distortion embeddings (Chen et al., 23 Sep 2024).
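To make the exp/log-map machinery concrete, here is a minimal sketch (NumPy/SciPy) of Riemannian gradient descent under the affine-invariant metric, applied to the classical SPD Fréchet-mean (Karcher mean) problem purely as an illustration:

```python
import numpy as np
from scipy.linalg import expm, logm, fractional_matrix_power as fmp

def exp_map(X, V):
    """Exp_X(V) = X^{1/2} expm(X^{-1/2} V X^{-1/2}) X^{1/2} (affine-invariant metric)."""
    Xh, Xnh = fmp(X, 0.5), fmp(X, -0.5)
    return Xh @ expm(Xnh @ V @ Xnh) @ Xh

def log_map(X, Y):
    """Log_X(Y) = X^{1/2} logm(X^{-1/2} Y X^{-1/2}) X^{1/2}, inverse of Exp_X."""
    Xh, Xnh = fmp(X, 0.5), fmp(X, -0.5)
    return Xh @ logm(Xnh @ Y @ Xnh) @ Xh

def frechet_mean(mats, step=1.0, n_iter=50):
    """Riemannian gradient descent for the affine-invariant barycenter of SPD matrices."""
    X = np.mean(mats, axis=0)                                    # Euclidean mean as init
    for _ in range(n_iter):
        grad = -np.mean([log_map(X, A) for A in mats], axis=0)  # Riemannian gradient
        X = exp_map(X, -step * grad)                             # descend along the geodesic
    return X

rng = np.random.default_rng(6)
mats = [(lambda B: B @ B.T + np.eye(4))(rng.normal(size=(4, 4))) for _ in range(5)]
barycenter = frechet_mean(mats)
```

With unit step size this reduces to the standard fixed-point iteration for the Karcher mean; smaller steps trade speed for stability.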
4. Empirical Evidence and Impact
Empirical results across numerous studies demonstrate that Riemannian metric learning:
- Yields statistically significant improvements in classification, clustering, domain adaptation, and trajectory inference over fixed-metric and Euclidean baselines (Vemulapalli et al., 2015, Jawanpuria et al., 16 Sep 2024, Zadeh et al., 2016, Collas et al., 2022).
- Recovers local anisotropic structure and provides interpretable, data-adaptive metrics, e.g., via geometric means or learned pullback tensors (Scarvelis et al., 2022, Perraul-Joncas et al., 2013).
- Achieves top-1 Nearest Neighbor classification accuracy gains of 5–20% and clustering accuracy improvements of 15–30% over standard metrics; specific tasks include face matching (LFW: 69.37% vs. 60.43%), domain adaptation (Caltech-Office: 83.5% vs. 81.9–82.0%), and single-cell RNA population transport (Wasserstein error reduction by 32–46%) (Vemulapalli et al., 2015, Jawanpuria et al., 16 Sep 2024, Sun et al., 16 Oct 2024).
- Enables robust modeling in high dimensions and under label noise; RGML-Tyler and surrogate conformal metrics maintain performance with contaminated or sparse data (Collas et al., 2022, Arvanitidis et al., 2021).
- Sustains performance gains across real and simulated datasets, with minimal computational overhead when matrix-mean or diagonal-adaptive variants are used (Zadeh et al., 2016, Chen et al., 2023).
- Supports isometric neural immersions, achieving an order-of-magnitude reduction in distortion loss compared to prior manifold models (Chen et al., 23 Sep 2024).
5. Theoretical Properties and Interpretability
Riemannian metric learning inherits and extends important theoretical guarantees:
- Geodesic convexity: Many objectives, including Fréchet means on SPD and trace-based costs, are (strictly) geodesically convex, ensuring unique global minima and stable convergence of Riemannian gradient flows (Collas et al., 2022, Zadeh et al., 2016); the definition is recalled after this list.
- Closed-form analytical structure: In the SPD setting, solutions via geometric means, Cholesky decompositions, or parametric diffeomorphisms allow interpretability and analytical tractability (Zadeh et al., 2016, Chen et al., 2023).
- Pullback and pushforward behavior: The geometry of learned representations can be exactly or approximately related to the original data manifold via Jacobian maps, Laplace-Beltrami inversion, and pullback operations; non-parametric estimators repair distortions in unsupervised embeddings (Rozo et al., 7 Mar 2025, Perraul-Joncas et al., 2013).
- Volume-based and inverse-volume criteria: Alternative learning principles select metrics that maximize normalized inverse volumes at the data, yielding connections to classic information-theory metrics and density estimation (Lebanon, 2012).
- Optimization guarantees: Proximal and primal-dual algorithms deliver proven non-asymptotic convergence on matrix manifolds, even under constraints (Wang et al., 2020).
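For reference, the geodesic-convexity guarantee in the first bullet above rests on the standard definition: a function $f$ on $(\mathcal{M}, g)$ is geodesically convex if, for every geodesic $\gamma: [0,1] \to \mathcal{M}$, $f(\gamma(t)) \le (1 - t)\, f(\gamma(0)) + t\, f(\gamma(1))$ for all $t \in [0,1]$; strict inequality for $t \in (0,1)$ (on non-constant geodesics) gives strict geodesic convexity and hence uniqueness of any global minimizer.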
6. Applications and Extensions
Riemannian metric learning underpins advances across broad domains:
- Domain adaptation and optimal transport: Learned ground metrics and metric fields drive improved alignment and transfer between disparate distributions (Jawanpuria et al., 16 Sep 2024, Scarvelis et al., 2022).
- Generative models and autoencoders: Geometry-aware decoders, surrogate conformal latent metrics, and neural pullbacks ensure synthesis conforms to embedded manifolds, supporting interpolation, sampling, and clustering (Rozo et al., 7 Mar 2025, Sun et al., 16 Oct 2024, Arvanitidis et al., 2021).
- Manifold and metric estimation: Graph constructions combined with Laplacian theory recover local metrics, supporting geometric tasks such as area estimation, geodesic pathfinding, and coordinate chart construction (Perraul-Joncas et al., 2013).
- Multi-modal and set-based classification: Metric learning on fused kernels and multi-manifold representations supports video-based recognition, object categorization, and image set analysis (Wang et al., 2019, Huang et al., 2016).
- Robustness and noise tolerance: Formulations that estimate per-class covariances, adapt metric curvature, or use scale-invariant estimators exhibit strong resistance to label and model misspecification (Collas et al., 2022).
7. Future Directions and Open Problems
Despite rapid advances, Riemannian metric learning faces formidable open challenges:
- Parametrizing general metric tensor fields $x \mapsto g(x)$: Trade-offs exist among expressivity, smoothness, and computational tractability; low-rank, spline, and neural parameterizations remain active research areas (Gruffaz et al., 7 Mar 2025).
- Computational efficiency: Exact geodesic computation, log/exp map evaluation, and large-scale optimization on non-Euclidean manifolds cost substantially more than Euclidean analogs, demanding further algorithmic innovation (Gruffaz et al., 7 Mar 2025, Chen et al., 2023).
- Statistical generalization and sample complexity: Theoretical analysis for non-convex, high-dimensional, or function-space settings is limited; bridging to learning theory is ongoing (Gruffaz et al., 7 Mar 2025).
- Scalability and practical integration: Extending metric learning methods to massive datasets, high curvature, cross-modal, or non-Euclidean ambient spaces is an active area of research (Chen et al., 23 Sep 2024, Sun et al., 16 Oct 2024).
- Unification of geometric features: Modern workflows combine metric learning with curvature, connection, and topology estimation, raising further questions about the generalization and interpretability of learned geometric structures.
Riemannian metric learning thus constitutes a fundamental generalization of data-driven geometry, providing the theoretical and algorithmic infrastructure for advanced machine learning on curved, structured, or otherwise non-Euclidean data (Gruffaz et al., 7 Mar 2025).