
Curvature-Adaptive Methods Overview

Updated 23 November 2025
  • Curvature-adaptive methods are techniques that dynamically estimate local curvature to tailor computations in optimization, learning, and simulation.
  • They improve efficiency and accuracy by adapting parameters like step size, depth, and sampling to the intrinsic geometry of the problem.
  • These methods have broad applications across graph neural networks, mesh processing, and deep geometric models, resulting in enhanced performance and stability.

A curvature-adaptive method is any algorithmic approach in which geometric, analytic, or combinatorial curvatures are estimated in situ and then exploited adaptively to drive optimization, learning, simulation, or representation. Curvature-adaptive methods arise in diverse fields, uniting techniques from differential geometry, stochastic optimization, geometric deep learning, numerical PDEs, mesh generation, and data-driven modeling. Across these domains, adaptivity to curvature enables algorithms to localize computation, capture multi-scale structure, improve efficiency, and achieve better alignment with the underlying geometry or information flow of the problem.

1. Curvature-Adaptive Methods in Graph Representation Learning

Explicit curvature-adaptivity in graph neural architectures is realized by jointly estimating and optimizing the curvature parameters of the embedding spaces in which message passing and node representations are performed.

Hyperbolic Curvature Adaptation:

Conventional hyperbolic graph neural networks (HGNNs) embed graphs in negatively curved Lorentzian or Poincaré models with a user-fixed curvature $K$ (or characteristic scale $\zeta$). This is suboptimal for real-world graphs exhibiting heterogeneous or hierarchical topologies.

ACE-HGNN (Fu et al., 2021) operationalizes curvature-adaptivity via a two-agent reinforcement learning protocol (a minimal sketch follows the list):

  • The ACE-Agent proposes per-layer curvatures $\zeta_\ell$ at each epoch, estimating suitable curvature based on embeddings from the prior epoch using sampled geodesic triangles.
  • The HGNN-Agent trains node representations in the corresponding hyperbolic manifold, feeding forward under the proposed $\zeta_\ell$ and backpropagating task loss.
  • Both agents update their strategies via Nash Q-learning, dynamically driving the ensemble toward a Nash equilibrium that maximizes downstream metrics like F1 or ROC-AUC.
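
A minimal sketch of this loop in Python/numpy is below: the ACE-Agent's triangle probing is approximated by the Euclidean parallelogram-law defect on sampled triples, and the Nash Q-learning interaction is reduced to comments. All function names are hypothetical simplifications, not the paper's implementation.

```python
import numpy as np

def parallelogram_defect(a, b, c, m):
    """Defect of the Euclidean parallelogram law for a sampled triangle
    (a; b, c) with m the midpoint of side bc: zero in flat space, negative
    for hyperbolic-like spread, positive for spherical-like spread."""
    d = np.linalg.norm
    return d(a - m) ** 2 - (d(a - b) ** 2 + d(a - c) ** 2) / 2 + d(b - c) ** 2 / 4

def propose_scale(embeddings, rng, n_samples=256):
    """ACE-Agent step: map the mean triangle defect of prior-epoch embeddings
    to a characteristic scale (more negative defect -> smaller zeta)."""
    idx = rng.integers(0, len(embeddings), size=(n_samples, 3))
    defects = [parallelogram_defect(embeddings[i], embeddings[j], embeddings[k],
                                    (embeddings[j] + embeddings[k]) / 2)
               for i, j, k in idx]
    return float(np.exp(np.mean(defects)))

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8))
print(propose_scale(emb, rng))  # approx. 1 in expectation for flat random embeddings

# Alternating loop (the HGNN-Agent, a trainable hyperbolic GNN, is not shown):
# for epoch in range(num_epochs):
#     zeta = propose_scale(prev_embeddings, rng)        # ACE-Agent proposal
#     prev_embeddings, reward = train_one_epoch(zeta)   # HGNN-Agent, task metric
#     update_policies(reward)                           # Nash Q-learning in the paper
```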

Empirically, ACE-HGNN learns geometry-aligned, graph-dependent curvatures: more tree-like (low $\delta$-hyperbolicity) graphs yield more negative $K$ (small $\zeta$), whereas Euclidean-like graphs push $\zeta$ higher. The approach reduces embedding distortion and improves classification and link prediction performance over fixed-curvature and Euclidean GNNs.

Mixed-Curvature Adaptation:

AMCAD (Xu et al., 2022) generalizes curvature-adaptivity to arbitrary mixtures of (negative, zero, positive) constant-curvature spaces. Each node is embedded in a product space $\mathcal{U} = \bigtimes_{m} \mathbb{U}^d_{\kappa_m}$, with curvatures $\kappa_m$ learned end-to-end. Edge-level scoring exploits attentive projections onto edge- and relation-specific curvature subspaces, fusing multiple geometry-specific distances via attention. Curvature adaptivity is essential for capturing the heterogeneity of large-scale industrial heterogeneous graphs, yielding lower distortion and improved ad retrieval metrics.
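
The toy sketch below illustrates the core scoring idea under strong simplifications: three factors with fixed unit curvatures (Euclidean, Poincaré ball at $\kappa = -1$, unit sphere) and hand-set attention logits, whereas AMCAD learns the curvatures $\kappa_m$ and the attention weights end-to-end.

```python
import numpy as np

def dist_euclidean(x, y):
    return np.linalg.norm(x - y)

def dist_poincare(x, y):
    """Geodesic distance in the Poincare ball of curvature -1 (norms < 1)."""
    num = 2 * np.linalg.norm(x - y) ** 2
    den = (1 - np.linalg.norm(x) ** 2) * (1 - np.linalg.norm(y) ** 2)
    return np.arccosh(1 + num / den)

def dist_sphere(x, y):
    """Geodesic distance on the unit sphere (inputs projected onto it)."""
    x, y = x / np.linalg.norm(x), y / np.linalg.norm(y)
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def fused_score(xs, ys, att_logits):
    """Attention-weighted fusion of per-factor distances; in AMCAD the
    logits come from learned edge- and relation-specific projections."""
    d = np.array([dist_euclidean(xs[0], ys[0]),
                  dist_poincare(xs[1], ys[1]),
                  dist_sphere(xs[2], ys[2])])
    w = np.exp(att_logits - att_logits.max())
    return float((w / w.sum()) @ d)          # softmax-weighted distance

def rand_ball(rng, d=4, r=0.5):              # point of norm r, safely inside the ball
    v = rng.normal(size=d)
    return r * v / np.linalg.norm(v)

rng = np.random.default_rng(0)
xs = [rng.normal(size=4), rand_ball(rng), rng.normal(size=4)]
ys = [rng.normal(size=4), rand_ball(rng), rng.normal(size=4)]
print(fused_score(xs, ys, att_logits=np.zeros(3)))
```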

Curvature-Guided Depth Adaptation in GNNs:

Bakry–Émery curvature (Hevapathige et al., 3 Mar 2025) provides a node-level measure reflecting both local topology and diffusion dynamics. The Depth-Adaptive GNN simultaneously learns a per-node estimate $\hat\kappa(x)$ and dynamically determines the number of propagation steps at each vertex using a curvature-ranked stopping rule:

$$T(x) = \min\left\{\, t \in \mathbb{N} : \frac{1}{|V|} \sum_{y \in V} \mathbf{1}\{\hat\kappa(y) \geq \hat\kappa(x)\} \leq \frac{kt}{100} \right\}$$

High-curvature nodes, losing distinctiveness quickly, halt message passing early; low-curvature nodes propagate deeper, thereby preventing oversmoothing and enhancing feature expressiveness. This method achieves substantial gains in node and graph classification, especially in heterophilic graphs.
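
A direct implementation of the stopping rule is straightforward; in the sketch below the learned curvatures $\hat\kappa$ are given as input, $k$ is a percentile-step hyperparameter, and the minimization is replaced by its closed form $T(x) = \lceil 100\,\mathrm{frac}(x)/k \rceil$.

```python
import numpy as np

def stopping_depths(kappa_hat, k=10):
    """For each node x, T(x) = min{ t : frac{y : kappa(y) >= kappa(x)} <= k*t/100 }.
    High-curvature nodes (small fraction of nodes above them) stop early."""
    kappa_hat = np.asarray(kappa_hat, dtype=float)
    # Fraction of nodes whose curvature is >= kappa_hat[x], for every x.
    frac = np.array([(kappa_hat >= kx).mean() for kx in kappa_hat])
    # Smallest integer t with frac <= k*t/100.
    return np.maximum(1, np.ceil(100 * frac / k)).astype(int)

kappa = np.array([2.0, 0.5, -1.0, 0.1])   # toy learned curvatures
print(stopping_depths(kappa, k=25))       # highest-curvature node halts first
```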

2. Curvature-Adaptive Optimization Algorithms

Curvature-adaptive techniques in optimization dynamically estimate local curvature and exploit it to adjust search direction, adapt step size, and modulate batch/sample size for convergence speed and stability.

Negative Curvature and Adaptive Sampling:

Berahas et al. (Berahas et al., 15 Nov 2024) propose a two-step method for noisy nonconvex optimization (a toy sketch follows the list), combining:

  • Negative-curvature steps: If the Hessian has a negative eigenvalue, a step is taken along the corresponding direction, leveraging inexact conjugate gradient (CG) with negative curvature detection and early stopping.
  • Descent steps: Standard gradient-based updates, curvature-corrected by implicitly preconditioning via the low-rank structure.
  • Adaptive sampling: The sample sizes for gradient and Hessian estimation are chosen to control the variance so as to enforce descent and ensure second-order stationarity. Theoretical guarantees yield $O(\epsilon_g^{-2} + \epsilon_H^{-3})$ complexity, both in expectation and deterministically.
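
The toy sketch below shows only the deterministic skeleton of the two steps, using an exact eigendecomposition on a tiny problem in place of inexact CG with negative-curvature detection, and omitting the adaptive sampling entirely.

```python
import numpy as np

def two_step_update(grad, hess, x, eps_H=1e-6, alpha=0.1):
    g, H = grad(x), hess(x)
    lam, V = np.linalg.eigh(H)
    if lam[0] < -eps_H:
        # Negative-curvature step: move along the most negative eigenvector,
        # oriented to be a descent direction and scaled by |lambda_min|.
        d = V[:, 0]
        d = -np.sign(g @ d) * d if abs(g @ d) > 0 else d
        return x + abs(lam[0]) * alpha * d
    # Descent step: plain gradient step (the paper preconditions this).
    return x - alpha * g

# Saddle of f(x) = x0^2 - x1^2 at the origin: the gradient nearly vanishes,
# but the negative-curvature step escapes along the x1 axis.
grad = lambda x: np.array([2 * x[0], -2 * x[1]])
hess = lambda x: np.diag([2.0, -2.0])
x = np.array([0.0, 1e-12])
for _ in range(5):
    x = two_step_update(grad, hess, x)
print(x)
```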

Curvature-Adaptive Proximal Gradient:

Malitsky & Mishchenko (Malitsky et al., 2023) develop a curvature-adaptive proximal gradient algorithm for convex optimization. At each iteration, the local Lipschitz constant is estimated by

$$L_k = \frac{\|\nabla f(x^k) - \nabla f(x^{k-1})\|}{\|x^k - x^{k-1}\|}$$

and the step size $\alpha_k$ is dynamically chosen using a root test to ensure convergence without knowledge of global Lipschitzness. This approach enables larger steps in flatter regions and $O(1/k)$ convergence under local smoothness assumptions, with no extra gradient or prox evaluations per iteration.
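
The sketch below follows the authors' earlier adaptive gradient-descent rule, which combines the estimate $L_k$ with a growth cap on successive steps; the proximal variant adjusts the constants and adds a prox step, so treat this as illustrative only.

```python
import numpy as np

def adaptive_gd(grad, x0, steps=100, a0=1e-7):
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    a_prev, theta = a0, np.inf
    x = x_prev - a_prev * g_prev
    for _ in range(steps):
        g = grad(x)
        dx = np.linalg.norm(x - x_prev)
        if dx == 0:
            break                                  # converged to machine precision
        # Local inverse-curvature estimate L_k from successive gradients.
        L = np.linalg.norm(g - g_prev) / dx
        cand = np.sqrt(1 + theta) * a_prev         # cap on step-size growth
        a = min(cand, 1 / (2 * L)) if L > 0 else cand
        x_prev, g_prev, theta, a_prev = x, g, a / a_prev, a
        x = x - a * g
    return x

# Anisotropic quadratic f(x) = 0.5 x^T A x; no global Lipschitz constant supplied.
A = np.diag([100.0, 1.0])
print(adaptive_gd(lambda x: A @ x, np.array([1.0, 1.0])))
```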

Curvature-Adaptive Stochastic Gradient Methods:

The AdaSecant (Gulcehre et al., 2014, Gulcehre et al., 2017) methodology estimates per-coordinate curvature via secant approximations based on local gradient statistics:

$$\eta_i^k \approx \frac{\sqrt{E_k[(\Delta_i^k)^2]}}{\sqrt{E_k[(\alpha_i^k)^2]} - E_k[\alpha_i^k \Delta_i^k]\,/\,E_k[(\alpha_i^k)^2]}$$

where $\alpha_i^k$ are finite differences of stochastic gradients. Combined with variance-reduced gradients, this yields per-parameter learning rates that automatically adapt to local (potentially highly anisotropic) curvature, outperforming hand-tuned learning-rate schedules on various deep nets and convex formulations.
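
A stripped-down illustration of the secant idea follows: per-coordinate rates track the ratio of smoothed parameter differences to smoothed gradient differences. The cross-term correction in the formula above and the variance-reduced gradients of the full method are omitted here.

```python
import numpy as np

def adasecant_like(grad, x0, steps=200, lr0=0.01, beta=0.9, eps=1e-8):
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - lr0 * g_prev                 # one bootstrap step
    m_d2 = np.full_like(x, eps)               # EMA of squared parameter differences
    m_a2 = np.full_like(x, eps)               # EMA of squared gradient differences
    for _ in range(steps):
        g = grad(x)
        delta, alpha = x - x_prev, g - g_prev
        m_d2 = beta * m_d2 + (1 - beta) * delta ** 2
        m_a2 = beta * m_a2 + (1 - beta) * alpha ** 2
        eta = np.sqrt(m_d2) / (np.sqrt(m_a2) + eps)   # per-coordinate secant rate
        x_prev, g_prev = x, g
        x = x - eta * g
    return x

A = np.diag([50.0, 1.0])                      # anisotropic quadratic
print(adasecant_like(lambda x: A @ x, np.array([1.0, 1.0])))
```

On this quadratic the secant ratio approaches $1/A_{ii}$ per coordinate, i.e. a diagonal Newton scaling, which is the behavior the method exploits.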

Dynamic Sampling SGD:

DS-SGD (Bahamou et al., 2019) adaptively grows mini-batch size and computes directional curvature via Hessian-vector products, using two statistical tests (acute-angle and curvature-variance) to trigger sample-size increase. An analytical step length $t_{k,\epsilon}$ is computed that incorporates curvature estimates, ensuring robust progress even under model or data nonstationarity.

Curvature-Adaptive Optimization via Periodic Low-Rank Hessian Sketching:

CAO (Du, 16 Nov 2025) injects periodic block-Lanczos sketches of the Hessian to build low-rank preconditioners; preconditioning is applied only in the sketched subspace, reducing the cost compared to full-matrix approaches. This yields widened stable stepsize ranges and accelerated convergence, especially in sharp, anisotropic regimes of non-convex objectives.
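
A small-scale illustration under stated simplifications: scipy's single-vector Lanczos (eigsh) stands in for block Lanczos, and the preconditioner applies a Newton-like scaling in the sketched subspace and the identity in its complement; the refresh period and damping floor are placeholders.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def low_rank_precond_gd(grad, hvp, x0, k=2, period=10, steps=60, lr=0.05, mu=1e-3):
    x = np.asarray(x0, dtype=float)
    n, V, lam = len(x), None, None
    for t in range(steps):
        if t % period == 0:
            # Periodic sketch: Lanczos over Hessian-vector products yields
            # the top-k curvature eigenpairs at the current iterate.
            H = LinearOperator((n, n), matvec=lambda v: hvp(x, v))
            lam, V = eigsh(H, k=k, which='LM')
        g = grad(x)
        gs = V.T @ g                                  # component in the sketched subspace
        d = V @ (gs / np.maximum(np.abs(lam), mu))    # Newton-like scaling there
        d += g - V @ gs                               # identity in the complement
        x -= lr * d
    return x

# Sharp anisotropic quadratic: curvatures 400 and 100 would force a tiny
# stable stepsize for plain GD; the sketched preconditioner tames them.
A = np.diag([400.0, 100.0, 1.0, 1.0])
x = low_rank_precond_gd(lambda x: A @ x, lambda x, v: A @ v, np.ones(4), k=2)
print(np.round(x, 4))
```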

3. Curvature Adaptivity in Discrete Geometry and Mesh Processing

Curvature-adaptive methods govern spatial sampling, mesh density, integration step size, or resource allocation, leveraging geometric curvature computed directly on discrete domains.

Curvature-Adaptive Mesh Simplification:

Seemann et al. (Seemann et al., 2016) introduce a scheme for estimating multi-scale mean curvature fields on triangle meshes using ball neighborhoods. Each vertex $v$ is assigned an optimal radius $r_v$ at which the mean curvature $H(v, r_v)$ stabilizes, determined by an automated criterion based on the relative change in curvature. This per-vertex, scale-adaptive field is mapped to a sampling density $\rho(v)$, ensuring high-curvature regions (sharp features, creases) receive higher fidelity during simplification. The approach outperforms uniform-density strategies in feature preservation and sampling efficiency.
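
The sketch below shows one plausible form of the scale-selection and density-mapping logic; the stabilization tolerance and density map are illustrative, not the paper's exact criterion, and mean_curvature_at stands in for the ball-neighborhood estimator on the mesh.

```python
import numpy as np

def optimal_radius(mean_curvature_at, v, radii, tol=0.02):
    """Scan increasing ball radii and return the smallest radius at which the
    mean-curvature estimate stabilizes (relative change below tol)."""
    H_prev = mean_curvature_at(v, radii[0])
    for r_prev, r in zip(radii, radii[1:]):
        H = mean_curvature_at(v, r)
        if abs(H - H_prev) <= tol * max(abs(H_prev), 1e-12):
            return r_prev, H_prev            # curvature has stabilized
        H_prev = H
    return radii[-1], H_prev                 # fall back to the largest scale

def sampling_density(H_v, H_max, rho_min=0.1):
    """Map stabilized curvature to a density in [rho_min, 1]: sharp features
    (large |H|) keep more vertices during simplification."""
    return rho_min + (1 - rho_min) * min(abs(H_v) / H_max, 1.0)

demo_H = lambda v, r: 1.0 + np.exp(-r)       # toy estimator that flattens with r
print(optimal_radius(demo_H, v=0, radii=np.linspace(0.1, 2.0, 20)))
```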

Curvature-Adaptive High-Order Mesh $r$-Adaption:

Aparicio-Estrems et al. (Aparicio-Estrems et al., 2023) perform globally coupled mesh movement by interpolating high-order metric fields (encoding, e.g., solution Hessians, boundary curvature) via log-Euclidean interpolants on curved background meshes. The gradient and Hessian of a combined distortion+geometry loss are computed using differential and eigenstructure calculus, enabling second-order Newton optimization that aligns mesh elements pointwise to local geometry and metric curvature.

Curvature-Guided Time-Step Adaptation in Dynamics:

Curvature-adaptive time integration (Lages et al., 2013) employs the first Frenet curvature of the solution trajectory (displacement history) to select integration time steps. Specifically, the discrete curvature

$$k_{n+1} = \sqrt{\frac{\left(1 + \dot{\mathbf{d}}_{n+1}^\mathsf{T} \dot{\mathbf{d}}_{n+1}\right)\left(\ddot{\mathbf{d}}_{n+1}^\mathsf{T} \ddot{\mathbf{d}}_{n+1}\right) - \left(\dot{\mathbf{d}}_{n+1}^\mathsf{T} \ddot{\mathbf{d}}_{n+1}\right)^2}{\left(1 + \dot{\mathbf{d}}_{n+1}^\mathsf{T} \dot{\mathbf{d}}_{n+1}\right)^3}}$$

is used in an exponential rule to decrease the time step in regions of high kinematic curvature, ensuring dynamic resolution concentrates where needed. This method yields improved solution accuracy for a given computational cost, with negligible overhead.
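
Below, the discrete curvature above is implemented directly, with an illustrative exponential law for the step size; the paper's rule has this shape, but the constants beta, dt_min, dt_max here are placeholders.

```python
import numpy as np

def frenet_curvature(v, a):
    """First Frenet curvature from velocity v and acceleration a of the
    displacement history, following the discrete formula above."""
    s = 1.0 + v @ v
    num = s * (a @ a) - (v @ a) ** 2
    return np.sqrt(max(num, 0.0) / s ** 3)

def adapt_dt(v, a, dt_min=1e-4, dt_max=1e-2, beta=5.0):
    """Shrink the time step exponentially where kinematic curvature is high."""
    k = frenet_curvature(v, a)
    return max(dt_min, dt_max * np.exp(-beta * k))

print(adapt_dt(np.array([0.1, 0.0]), np.array([0.0, 0.1])))   # gentle motion: large dt
print(adapt_dt(np.array([0.1, 0.0]), np.array([0.0, 10.0])))  # sharp turn: dt_min
```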

4. Learning and Leveraging Curvature Adaptation in Deep Geometric Models

Emergent applications span geometry-aware deep learning, where curvature adaptivity is used within architectures to align model expressivity with the non-Euclidean latent structure of the data.

Curvature-Adaptive Transformers:

CAT (Lin et al., 2 Oct 2025) learns per-token routing weights across parallel Euclidean, hyperbolic, and spherical attention branches. Routing is realized via a lightweight, MLP-based softmax gating mechanism producing interpretable curvature-preference distributions per token. Each branch applies geometry-consistent attention operations, with the hyperbolic and spherical branches utilizing the appropriate exponential and logarithmic maps and pairwise distances. The hyperbolic curvature $c$ is treated as a learnable parameter, entering into all relevant manifold operations. The architecture achieves higher mean reciprocal rank (MRR) and Hits@10 on knowledge graph benchmarks compared to any single-geometry baseline, confirming the benefit of mixture-of-geometry attention.
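
A toy version of the per-token gating is sketched below: a one-hidden-layer gate produces softmax weights over three branch outputs, which are abstracted to random placeholders, and the weight matrices W1, W2 are hypothetical stand-ins for the learned gate parameters.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def route_tokens(tokens, W1, W2, branch_outputs):
    """tokens: (n, d); branch_outputs: (3, n, d), the outputs of the
    Euclidean, hyperbolic, and spherical attention branches."""
    h = np.tanh(tokens @ W1)            # lightweight MLP gate
    gate = softmax(h @ W2)              # (n, 3) curvature preferences per token
    mixed = np.einsum('nb,bnd->nd', gate, branch_outputs)
    return mixed, gate

rng = np.random.default_rng(1)
n, d = 4, 8
tokens = rng.normal(size=(n, d))
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, 3))
branches = rng.normal(size=(3, n, d))   # stand-ins for the three attentions
out, gate = route_tokens(tokens, W1, W2, branches)
print(gate.round(2))                    # interpretable per-token distribution
```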

Curvature-Adaptive Deep Point Cloud Upsampling:

CAD-PU (Lin et al., 2020) distributes upsampled points in proportion to local surface curvature, minimizing overall approximation error for a fixed point budget. A trainable feature-expansion module samples and expands features weighted by estimated local curvature, and discrete curvature surrogates are used in a regularizer minimizing average curvature over the upsampled set. Empirically, this yields improved preservation of geometric detail and surface accuracy under both synthetic and real scanning settings.
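
As a minimal illustration of curvature-proportional allocation, the sketch below draws upsampling seeds with probability proportional to a per-point curvature estimate; CAD-PU realizes this end-to-end with a trainable feature-expansion module instead of explicit sampling.

```python
import numpy as np

def curvature_weighted_seeds(curvatures, n_new, rng, floor=1e-3):
    """Draw seed indices for upsampling in proportion to local curvature,
    so detailed regions receive more new points; `floor` keeps flat
    regions sampleable."""
    w = np.abs(curvatures) + floor
    p = w / w.sum()
    return rng.choice(len(curvatures), size=n_new, replace=True, p=p)

rng = np.random.default_rng(0)
kappa = np.array([0.01, 0.02, 2.0, 1.5, 0.05])    # toy per-point curvature
seeds = curvature_weighted_seeds(kappa, n_new=1000, rng=rng)
print(np.bincount(seeds, minlength=len(kappa)))   # counts track curvature
```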

5. Curvature Adaptivity in Markov Processes and Curve Evolution

Adaptive Markov Chains via Curvature:

Pillai & Smith (Pillai et al., 2013), building on Ollivier's discrete Ricci curvature, establish that positive curvature contraction of Markov kernels under Wasserstein metrics leads to finite-sample concentration inequalities for adaptively updated MCMC samplers. Explicit bounds relate the empirical distribution of adaptive chains to their target, with improved mixing and accuracy over classical Metropolis–Hastings or parallel tempering after an initialization phase. This perspective unites ergodicity, adaptation, and finite-sample control under curvature-based contractivity.

Curvature-Adjusted Curve Evolution:

The method of Ševčovič–Yazaki (Sevcovic et al., 2010) introduces a curvature-adaptive tangential velocity to control the redistribution of discrete points along evolving curves. A parabolic flow for the logarithm of the “relative local length” weighted by a shape function of curvature is coupled with the classical normal-velocity evolution PDEs. The resulting scheme clusters points in high curvature regions and sparsifies in flatter regions, achieving optimal approximation of geometric quantities (length, area) for a given number of discretization points.

Curvature-Prior Geodesic Path Computation:

A curvature-prior elastica model (Chen et al., 2023) introduces a data-driven, spatially varying curvature field $\omega(x,\theta)$ into the Euler–Mumford energy. The resulting Hamilton–Jacobi–Bellman PDE is discretized using adaptive finite differences, with stencils explicitly constructed to align with $\omega$. The viscosity solution is computed via a generalized Fast Marching method, yielding globally optimal geodesic paths that faithfully track high-curvature structures in images or other signals.

6. Synthesis: Principles and Impact

Curvature-adaptive methods share common algorithmic motifs:

  • Local estimation or learning of curvature (or analogous second-order geometric signals).
  • Adaptive control of computation (step size, depth, sampling, mesh density, operator selection) ruled by curvature.
  • Coupling of theoretical or empirical performance guarantees to the correctness of curvature adaptation, often leading to improved efficiency, stability, or representational fidelity.

Across deep learning, optimization, simulation, and geometric computing, curvature-adaptive methodologies provide a unified paradigm for exploiting the local-to-global structure of the problem, enabling models and algorithms to adapt their complexity and capacity to the intrinsic geometry of the underlying data or task.

Table: Highlights of Curvature-Adaptive Methods across Domains

| Domain | Main Method | Curvature Type / Use |
| --- | --- | --- |
| Hyperbolic GNNs | ACE-HGNN | RL-driven, layerwise negative curvature tuning |
| Mixed-geometry GNNs | AMCAD | Learnable mixture of constant-curvature spaces |
| GNN depth control | Depth-Adaptive GNN | Learned Bakry–Émery curvature for depth |
| Convex optimization | AdProxGD | Local Lipschitz/curvature step size |
| Nonconvex optimization | CAO, DS-SGD, NCAS | Hessian-based, low-rank or samplewise curvature |
| Deep learning | AdaSecant | Per-parameter secant curvature for learning rates |
| Mesh simplification | Multi-scale curvature fields | Per-vertex mean curvature, adaptive radius |
| Mesh $r$-adaption | Log-Euclidean high-order metrics | Metric curvature for node movement |
| Time integration | Frenet curvature estimator | Trajectory curvature for time-step control |
| Markov chains / MCMC | Ricci, Bakry–Émery curvature | Curvature contraction for concentration |
| Curve evolution | PDE point redistribution | Tangential velocity coupled with curvature |
| Deep geometric models | CAT, CAD-PU | Routing/expansion directed by curvature |
| Image analysis | Curvature-prior elastica | Curvature field prior in HJB geodesics |

This corpus demonstrates that curvature adaptivity is a versatile instrument, unifying geometric localization and data-dependent control for performance gains across a spectrum of modern computational fields.
