Riemannian Optimization: A Geometric Approach
- Riemannian optimization is a mathematical framework that extends classical optimization methods to smooth manifolds, replacing linear vector-space operations with geodesics, tangent spaces, and retractions.
- It employs Riemannian counterparts of gradient descent, Newton, and conjugate gradient methods to navigate curved spaces efficiently.
- These frameworks enable practical solutions in machine learning, signal processing, and shape optimization through tailored metrics and preconditioning.
Riemannian optimization frameworks are mathematical and algorithmic structures that exploit the differential geometric properties of smooth manifolds to design, analyze, and implement optimization algorithms in non-Euclidean spaces. These frameworks generalize classical optimization—usually developed in Euclidean vector spaces—by replacing linear structures with geodesics, tangent spaces, and Riemannian metrics. As a result, Riemannian optimization accommodates constraints expressed as manifold conditions (such as orthogonality, fixed rank, or unit norm), and enables the development of both first-order and advanced second-order methods suitable for a broad spectrum of applications in numerical linear algebra, machine learning, signal processing, and optimal design.
1. Mathematical Foundations of Riemannian Optimization
Riemannian optimization relies on the concept of a Riemannian manifold $(\mathcal{M}, g)$, where $\mathcal{M}$ is a smooth manifold and $g$ is a smoothly varying inner product (the Riemannian metric) on each tangent space $T_x\mathcal{M}$. For a smooth function $f : \mathcal{M} \to \mathbb{R}$, the Riemannian gradient $\operatorname{grad} f(x)$ is defined as the unique tangent vector in $T_x\mathcal{M}$ satisfying $g_x(\operatorname{grad} f(x), \xi) = \mathrm{D}f(x)[\xi]$ for all $\xi \in T_x\mathcal{M}$, where $\mathrm{D}f(x)[\xi]$ is the directional derivative. The Riemannian Hessian is the covariant derivative of the gradient, incorporating curvature via the Levi–Civita connection. Updates are performed not by linear addition, but by mapping steps taken in the tangent space back onto the manifold using the exponential map or a retraction operator.
This formalism extends to infinite-dimensional settings, as in shape optimization, where admissible shapes are points on an infinite-dimensional Riemannian manifold, and optimization functionals are defined over this space (Schulz, 2012).
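As a concrete illustration of these ingredients, the following sketch (a minimal NumPy example written for this exposition, not drawn from the cited works) instantiates the tangent-space projection, Riemannian gradient, exponential map, and a normalization retraction on the unit sphere $S^{n-1}$ for the quadratic objective $f(x) = x^\top A x$ with symmetric $A$.

```python
import numpy as np

def tangent_project(x, v):
    """Project an ambient vector v onto the tangent space T_x S^{n-1} at unit vector x."""
    return v - np.dot(x, v) * x

def riemannian_grad(A, x):
    """Riemannian gradient of f(x) = x^T A x (A symmetric) on the unit sphere:
    the tangent-space projection of the Euclidean gradient 2 A x."""
    return tangent_project(x, 2.0 * A @ x)

def exp_map(x, xi):
    """Exponential map on the sphere: follow the geodesic from x in tangent direction xi."""
    norm = np.linalg.norm(xi)
    if norm < 1e-16:
        return x
    return np.cos(norm) * x + np.sin(norm) * xi / norm

def retraction(x, xi):
    """First-order retraction: step in the ambient space, then renormalize."""
    y = x + xi
    return y / np.linalg.norm(y)
```

The normalization retraction agrees with the exponential map to first order and is often preferred in practice because it is cheaper to evaluate.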
2. Algorithmic Frameworks and Core Techniques
Riemannian frameworks generalize classical methods (gradient, Newton, and conjugate gradient) by replacing Euclidean notions with their Riemannian counterparts:
- Steepest Descent: Updates are performed as $x_{k+1} = \operatorname{Exp}_{x_k}\!\big(-\alpha_k \operatorname{grad} f(x_k)\big)$, where $\operatorname{Exp}_{x_k}$ is the exponential map and the step size $\alpha_k$ is obtained by geodesic line search (Smith, 2014).
- Newton's Method: The direction $\eta_k \in T_{x_k}\mathcal{M}$ solves $\operatorname{Hess} f(x_k)[\eta_k] = -\operatorname{grad} f(x_k)$, leading to updates $x_{k+1} = \operatorname{Exp}_{x_k}(\eta_k)$, where $\operatorname{Hess} f(x_k)$ is the Riemannian Hessian. Quadratic or even cubic convergence is possible in certain contexts (e.g., Rayleigh quotient minimization) (Smith, 2014).
- Conjugate Gradient: Directions are updated by $\eta_{k+1} = -\operatorname{grad} f(x_{k+1}) + \beta_k \, \tau_{x_k \to x_{k+1}}(\eta_k)$, where $-\operatorname{grad} f(x_{k+1})$ is the negative gradient, $\tau_{x_k \to x_{k+1}}$ is parallel transport, and $\beta_k$ ensures conjugacy with respect to the Hessian.
Algorithms intrinsically exploit manifold geometry using geodesics, parallel transport, and Riemannian connections, and avoid reliance on ambient Euclidean embeddings. The Riemannian SVRG and related variance-reduced methods further adapt stochastic techniques to manifolds, requiring careful use of parallel transport and curvature-aware step-size rules (Zhang et al., 2016, Demidovich et al., 11 Mar 2024).
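To make the steepest-descent template above concrete, the sketch below minimizes the Rayleigh quotient on the unit sphere using the exponential map. Armijo backtracking stands in for the exact geodesic line search of (Smith, 2014), and all names are illustrative rather than taken from any cited implementation.

```python
import numpy as np

def riemannian_steepest_descent(A, x0, steps=500, alpha0=1.0, beta=0.5, c=1e-4):
    """Riemannian steepest descent for f(x) = x^T A x on the unit sphere,
    stepping along geodesics via the exponential map and choosing the step
    size with Armijo backtracking (a stand-in for exact geodesic line search)."""
    f = lambda x: x @ A @ x
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        egrad = 2.0 * A @ x
        rgrad = egrad - (x @ egrad) * x                 # project onto T_x S^{n-1}
        gnorm = np.linalg.norm(rgrad)
        if gnorm < 1e-10:
            break
        alpha = alpha0
        while True:                                     # backtrack along the geodesic
            xi = -alpha * rgrad
            t = np.linalg.norm(xi)
            x_new = np.cos(t) * x + np.sin(t) * xi / t  # exponential map Exp_x(xi)
            if f(x_new) <= f(x) - c * alpha * gnorm**2 or alpha < 1e-12:
                break
            alpha *= beta
        x = x_new
    return x

# Minimizing the Rayleigh quotient targets an eigenvector of the smallest eigenvalue.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)); A = 0.5 * (A + A.T)
x_star = riemannian_steepest_descent(A, rng.standard_normal(50))
print(x_star @ A @ x_star, np.linalg.eigvalsh(A)[0])   # achieved value vs. lambda_min
```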
3. Preconditioning, Metrics, and Acceleration
Metric choice is essential in Riemannian frameworks, as it determines the geometry, conditioning, and convergence properties of the algorithms (Gao et al., 2023). Preconditioned metrics are formed by incorporating self-adjoint, positive-definite operators $\mathcal{B}_i(x)$ that approximate diagonal blocks of the Riemannian Hessian. For product manifolds $\mathcal{M} = \mathcal{M}_1 \times \cdots \times \mathcal{M}_k$, the metric is constructed as
$$g_x(\xi, \eta) = \sum_{i=1}^{k} \big\langle \xi_i, \mathcal{B}_i(x)[\eta_i] \big\rangle, \qquad \xi = (\xi_1, \dots, \xi_k),\ \eta = (\eta_1, \dots, \eta_k) \in T_x\mathcal{M},$$
allowing custom tailoring via block-diagonal (exact or approximate), left/right, or Gauss–Newton preconditioning. These choices dramatically reduce condition numbers and iteration counts in applications such as canonical correlation analysis and truncated SVD.
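As one concrete, deliberately simplified instance of block preconditioning, the sketch below uses the familiar scaled metric for two-factor low-rank models, in which the Euclidean gradients of the two blocks are rescaled by $(R^\top R)^{-1}$ and $(L^\top L)^{-1}$. The metrics used for CCA and truncated SVD in (Gao et al., 2023) differ in detail, so this is illustrative only.

```python
import numpy as np

def preconditioned_factor_gd(X, r, steps=300, lr=0.5, eps=1e-8):
    """Gradient descent on f(L, R) = 0.5 * ||L R^T - X||_F^2 over the product
    space of factor matrices, under the block-preconditioned metric
    g((xi_L, xi_R), (eta_L, eta_R)) = tr(xi_L^T eta_L R^T R) + tr(xi_R^T eta_R L^T L).
    The resulting directions are the Euclidean gradients scaled by (R^T R)^{-1}
    and (L^T L)^{-1}, which roughly equalizes the conditioning of the two blocks."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    L = rng.standard_normal((m, r)); R = rng.standard_normal((n, r))
    for _ in range(steps):
        E = L @ R.T - X                                            # residual
        GL = E @ R                                                 # Euclidean gradient w.r.t. L
        GR = E.T @ L                                               # Euclidean gradient w.r.t. R
        PL = np.linalg.solve(R.T @ R + eps * np.eye(r), GL.T).T    # GL (R^T R)^{-1}
        PR = np.linalg.solve(L.T @ L + eps * np.eye(r), GR.T).T    # GR (L^T L)^{-1}
        L, R = L - lr * PL, R - lr * PR
    return L, R

# Example: factor a rank-5 matrix and report the relative residual.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5)) @ rng.standard_normal((5, 80))
L, R = preconditioned_factor_gd(X, r=5)
print(np.linalg.norm(L @ R.T - X) / np.linalg.norm(X))
```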
Acceleration techniques incorporate variational principles and time-adaptive symplectic integrators. By formulating optimization as the discretization of Bregman Lagrangian or Hamiltonian flows on manifolds, variational integrators can preserve the accelerated convergence rates of the continuous-time flows and yield explicit methods with improved numerical stability and computational efficiency (Duruisseaux et al., 2021, Duruisseaux et al., 2021, Duruisseaux et al., 2022).
4. Stochastic, Adaptive, and Privacy-Preserving Extensions
Riemannian optimization frameworks support rich stochastic and adaptive variants:
- Variance-Reduced Methods: RSVRG and loopless methods (R-LSVRG, R-PAGE) reduce variance by refreshing full gradients probabilistically and combining tangent vectors at different points via parallel transport, achieving state-of-the-art convergence in nonconvex and distributed settings (see the sketch after this list) (Zhang et al., 2016, Demidovich et al., 11 Mar 2024).
- Adaptive Methods: Extensions of AdaGrad, RMSProp, Adam, and AMSGrad to manifolds employ per-coordinate scaling in the tangent space and construct momentum and scaling terms compatible with the manifold geometry. These methods, combined with mini-batching, admit rigorous convergence guarantees and improved practical performance for large-scale problems (Sakai et al., 1 Sep 2024).
- Differential Privacy: Intrinsic Gaussian noise is added to tangent space gradients, achieving strong privacy guarantees with sensitivity bounds and utility analytically dependent on the manifold's curvature and dimension. This is critical in applications such as differentially private PCA and Fréchet mean computation (Han et al., 2022, Utpala et al., 2022).
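The sketch below illustrates the variance-reduction mechanics on the unit sphere: a snapshot full gradient is refreshed with probability p (the loopless trick) and combined with stochastic gradients via a projection-based vector transport. It is a schematic sketch of the loopless R-SVRG-style construction, not a verbatim implementation of any cited algorithm, and the function names are ours.

```python
import numpy as np

def transport_by_projection(y, v):
    """Vector transport on the sphere: project v onto the tangent space at y."""
    return v - np.dot(y, v) * y

def loopless_rsvrg_sphere(grad_i, n, x0, lr=0.05, p=0.1, steps=1000, seed=0):
    """Schematic loopless variance-reduced loop on the unit sphere.
    grad_i(i, x) returns the Euclidean gradient of the i-th component function."""
    rng = np.random.default_rng(seed)
    proj = lambda x, v: v - np.dot(x, v) * x                # tangent projection at x
    full_grad = lambda x: sum(grad_i(i, x) for i in range(n)) / n
    x = x0 / np.linalg.norm(x0)
    x_ref, g_ref = x.copy(), proj(x, full_grad(x))          # snapshot point and gradient
    for _ in range(steps):
        i = rng.integers(n)
        # Variance-reduced direction: stochastic gradient corrected by the snapshot,
        # with snapshot quantities moved to the current tangent space by transport.
        v = proj(x, grad_i(i, x)) \
            - transport_by_projection(x, proj(x_ref, grad_i(i, x_ref))) \
            + transport_by_projection(x, g_ref)
        y = x - lr * v
        x = y / np.linalg.norm(y)                           # retraction back to the sphere
        if rng.random() < p:                                # probabilistic snapshot refresh
            x_ref, g_ref = x.copy(), proj(x, full_grad(x))
    return x

# Demo: minimize the mean of f_i(x) = (a_i^T x)^2 over the sphere.
rng = np.random.default_rng(1)
Adata = rng.standard_normal((200, 20))
gi = lambda i, x: 2.0 * Adata[i] * (Adata[i] @ x)
x_hat = loopless_rsvrg_sphere(gi, n=200, x0=rng.standard_normal(20))
```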
5. Practical Implementations and Algorithmic Toolkits
Several open-source toolkits embody Riemannian optimization frameworks. Geoopt for PyTorch (Kochurov et al., 2020), TensorFlow RiemOpt (Smirnov, 2021), and Rieoptax for JAX (Utpala et al., 2022) support efficient manifold representations, provide manifold interfaces (defining retraction, exponential map, vector transport, and gradient conversions), and integrate seamlessly with standard ML pipelines. These packages accommodate spheres, Stiefel, Grassmann, SPD, Lie groups, hyperbolic models, and product/quotient manifolds, enabling plug-and-play geometry-aware layers and optimizers in deep neural networks.
The toolkits support a wide range of algorithms, from basic RSGD to advanced adaptive and variance-reduced methods, and maintain compatibility with parallel computing backends and automatic differentiation.
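A typical usage pattern looks like the following sketch, which assumes geoopt's documented ManifoldParameter and RiemannianSGD interfaces (Kochurov et al., 2020); the objective is the Rayleigh quotient from the earlier examples.

```python
import torch
import geoopt

# Minimal sketch: optimize a point constrained to the unit sphere with a
# Riemannian optimizer; the optimizer converts Euclidean gradients to
# Riemannian ones and retracts back onto the manifold at each step.
sphere = geoopt.manifolds.Sphere()
A = torch.randn(50, 50); A = 0.5 * (A + A.T)

x0 = torch.nn.functional.normalize(torch.randn(50), dim=0)   # start on the sphere
x = geoopt.ManifoldParameter(x0, manifold=sphere)

opt = geoopt.optim.RiemannianSGD([x], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = x @ A @ x          # Rayleigh quotient; x remains on the sphere throughout
    loss.backward()
    opt.step()
```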
6. Applications, Extensions, and Modeling Paradigms
Riemannian optimization frameworks underpin a diverse set of application domains:
- Eigenproblems and Matrix Decomposition: Efficient recovery of extreme eigenvectors via Rayleigh quotient minimization on spheres, generalized eigenproblems on Stiefel manifolds, Procrustes analysis, and low-rank matrix optimizations (Smith, 2014, Gao et al., 2023).
- Shape Optimization: Shape–Newton methods leveraging a Riemannian shape Hessian yield symmetric operators and quadratic convergence for domain optimization tasks constrained by PDEs and boundary shape (Schulz, 2012).
- Sparse/Low-Rank Modeling: Frameworks for index coding leverage quotient manifold geometry of fixed-rank matrices to jointly promote sparsity and low-rankness, characterizing tradeoffs between storage and transmission rates (Shi et al., 2016); a fixed-rank gradient sketch follows this list.
- Learning and Deep Models: Geometry-aware architectures are developed for LLMs, hyperbolic embeddings, and vision transformers. Manifold constraints enhance representation capacity, optimization stability, and generalization (Kochurov et al., 2020, Utpala et al., 2022, Bogachev et al., 16 Jul 2025).
- Bilevel, Minimax, and Distributed Optimization: Intrinsic hypergradient methods apply to bilevel problems with manifold constraints at both levels, and alternating descent–ascent frameworks solve nonconvex-linear minimax problems, achieving optimal iteration complexities for $\epsilon$-stationary points (Han et al., 6 Feb 2024, Xu et al., 29 Sep 2024).
- Scalable and Data-Driven Riemannian Methods: Randomized submanifold updates (e.g., RSDM) and manifold-free methods based on moving least-squares facilitate scalability and make Riemannian optimization accessible in large or data-defined constraint scenarios (Han et al., 18 May 2025, Shustin et al., 2022).
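For the eigenproblem and low-rank items above, a minimal sketch of the embedded fixed-rank machinery is shown below: project the Euclidean gradient onto the tangent space at $X = U S V^\top$ and retract with a truncated SVD. The quotient-manifold parametrization used in (Shi et al., 2016) differs, so this is only indicative of the general approach.

```python
import numpy as np

def truncated_svd(M, r):
    """Rank-r truncated SVD, used as a metric-projection retraction onto the
    manifold of rank-r matrices."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r]

def fixed_rank_gradient_step(X, egrad, r, lr):
    """One Riemannian gradient step on the manifold of rank-r matrices:
    project the Euclidean gradient onto the tangent space at X = U S V^T,
    then retract with a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, V = U[:, :r], Vt[:r].T
    PU, PV = U @ U.T, V @ V.T
    rgrad = PU @ egrad + egrad @ PV - PU @ egrad @ PV   # tangent-space projection
    return truncated_svd(X - lr * rgrad, r)

# One step from a random rank-3 point, with Euclidean gradient X - Y of 0.5*||X - Y||^2.
rng = np.random.default_rng(0)
Y = rng.standard_normal((40, 30))
X = truncated_svd(rng.standard_normal((40, 30)), 3)
X_next = fixed_rank_gradient_step(X, X - Y, r=3, lr=0.5)
```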
7. Open Challenges and Future Directions
Active areas of extension within Riemannian optimization frameworks include:
- Metric Selection and Preconditioning: Improved analysis and design of curvature-adaptive metrics for further acceleration.
- Higher-order and Composite Optimization: Variational integrators for nonsmooth or composite objectives, and further generalizations to more complex geometries.
- Interplay with Deep Learning: Embedding Riemannian adaptive methods into large-scale neural architectures, and automated manifold discovery from data.
- Privacy, Robustness, and Federated Learning: Enhanced privacy analysis for manifold-valued learning, robust communication in distributed Riemannian optimization, and extensions to non-Euclidean federated learning scenarios.
The evolving landscape of Riemannian optimization frameworks continues to integrate geometric insights, stochastic methods, and scalable computational techniques, providing rigorous foundations and practical tools for modern optimization in structured, curved, or otherwise non-Euclidean spaces.