
Riemannian NML: Model Selection on Manifolds

Updated 6 April 2026
  • Riemannian NML is a framework that extends the NML coding scheme to data on Riemannian manifolds, offering a coordinate-invariant measure of stochastic complexity.
  • It leverages the invariant properties of the Riemannian volume element and Fisher information to ensure consistency under smooth coordinate transformations.
  • The framework reduces to standard NML in Euclidean spaces, enabling practical applications in hyperbolic Gaussian modelling for network and hierarchical data.

Riemannian Normalized Maximum Likelihood (Rm-NML) is an extension of the Normalized Maximum Likelihood (NML) universal coding and model selection framework to statistical models where the data space is a Riemannian manifold. It provides a coordinate-invariant, geometrically consistent notion of stochastic complexity and regret minimization in non-Euclidean settings. Rm-NML recovers the conventional NML code-length in Euclidean spaces and enables model selection and information-theoretic analysis for data distributed on general manifolds such as hyperbolic spaces, which are of growing interest in graph and hierarchical data modeling (Fukuzawa et al., 29 Aug 2025).

1. Formal Definition and Distribution Construction

Let $(\mathcal{M}, g)$ be a $D$-dimensional Riemannian manifold with metric $g$ and induced volume element $d\operatorname{vol}(x) = \sqrt{\det g(x)}\,dx$. Consider a parametric family of model densities $p_\operatorname{vol}(x|\theta)$, defined with respect to $d\operatorname{vol}(x)$, for $x \in \mathcal{M}$, $\theta \in \Theta$. The maximum-likelihood estimator is

$\hat\theta(x) = \arg\max_\theta p_\operatorname{vol}(x|\theta).$

The Rm-NML distribution is then

$p_{\rm Rm\text{-}NML}(x) = \frac{ p_\operatorname{vol}(x|\hat\theta(x)) }{ \displaystyle \int_\mathcal{M} p_\operatorname{vol}(y|\hat\theta(y))\,d\operatorname{vol}(y) }.$

The associated code-length is

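$L_{\rm Rm\text{-}NML}(x) = -\log p_{\rm Rm\text{-}NML}(x) = -\log p_\operatorname{vol}(x|\hat\theta(x)) + \log \int_\mathcal{M} p_\operatorname{vol}(y|\hat\theta(y))\,d\operatorname{vol}(y).$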

This framework naturally generalizes the Shtarkov NML code to any manifold equipped with a Riemannian measure.
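As a minimal worked sketch of the definition, consider the unit circle with a von Mises location model at fixed concentration. This model, the concentration value `kappa`, and the function names are illustrative assumptions, not taken from the source. For a single observation the location MLE is the observation itself, so the maximized likelihood is constant and the normalizer can be checked against $e^{\kappa}/I_0(\kappa)$.

```python
import numpy as np

def von_mises_density(phi, mu, kappa):
    """von Mises density on the unit circle w.r.t. arc length d(phi)."""
    return np.exp(kappa * np.cos(phi - mu)) / (2.0 * np.pi * np.i0(kappa))

def rm_nml_normalizer(kappa, num=20000):
    """Integrate the maximized likelihood p(y | mu_hat(y)) over the circle."""
    phi = np.linspace(0.0, 2.0 * np.pi, num, endpoint=False)
    mu_hat = phi  # single observation: the location MLE is the point itself
    maximized = von_mises_density(phi, mu_hat, kappa)
    # In the angle chart the Riemannian volume element of the unit circle is d(phi).
    return float(maximized.sum() * (2.0 * np.pi / num))

def rm_nml_code_length(x, kappa):
    """Code length: -log p(x | mu_hat(x)) + log normalizer."""
    fitted = von_mises_density(x, x, kappa)
    return float(-np.log(fitted) + np.log(rm_nml_normalizer(kappa)))

kappa = 2.0  # illustrative concentration
normalizer = rm_nml_normalizer(kappa)
length = rm_nml_code_length(0.3, kappa)
```

In this toy case the Rm-NML distribution is uniform on the circle, so the code length equals $\log 2\pi$ for every observation.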

2. Coordinate Invariance and Role of the Fisher Information

The Rm-NML code-length is invariant under smooth coordinate transformations due to two geometric properties:

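  • The volume element $d\operatorname{vol}(x) = \sqrt{\det g(x)}\,dx$ is invariant under changes of chart (the Jacobian factors of $\sqrt{\det g(x)}$ and $dx$ cancel), ensuring the scalar nature of $p_\operatorname{vol}(x|\theta)$.
  • In asymptotic normalizing constant calculations, the Fisher information metric $I(\theta)$ on $\Theta$ yields the Jeffreys prior $\sqrt{\det I(\theta)}\,d\theta$, which is itself invariant under reparameterization.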

Explicitly, if $\theta$ and $\phi$ are smooth coordinate charts on $\Theta$: $\sqrt{\det I(\theta)}\,d\theta = \sqrt{\det I(\phi)}\,d\phi,$ reflecting coordinate invariance in the parameter space. This guarantees that Rm-NML is well-defined and interpretable regardless of coordinate representation.
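The invariance of the volume element can be checked numerically. The sketch below is an illustration under assumptions not in the source: it takes the hyperbolic plane of curvature $-1$ and computes the area of a geodesic disk of radius `R` in two charts, geodesic polar coordinates (where $\sqrt{\det g} = \sinh r$) and the Poincaré disk in Euclidean polar coordinates (where $\sqrt{\det g} = 4s/(1-s^2)^2$). Function names are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoid rule on a 1-D grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def area_geodesic_polar(R, num=200001):
    """Disk area in geodesic polar coordinates (r, theta): sqrt(det g) = sinh(r)."""
    r = np.linspace(0.0, R, num)
    return 2.0 * np.pi * trapezoid(np.sinh(r), r)

def area_poincare_disk(R, num=200001):
    """Same disk in the Poincare chart, where it has Euclidean radius tanh(R/2)."""
    s = np.linspace(0.0, np.tanh(R / 2.0), num)
    return 2.0 * np.pi * trapezoid(4.0 * s / (1.0 - s**2) ** 2, s)

R = 2.0  # illustrative geodesic radius
area_polar = area_geodesic_polar(R)
area_poincare = area_poincare_disk(R)
area_exact = 2.0 * np.pi * (np.cosh(R) - 1.0)
```

Both charts should agree with the closed form $2\pi(\cosh R - 1)$ up to quadrature error, which is the chart independence that the code-length inherits.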

3. Reduction to Ordinary NML in Euclidean Spaces

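If $\mathcal{M} = \mathbb{R}^D$ with the standard Euclidean metric, $\sqrt{\det g(x)} = 1$ and $d\operatorname{vol}(x) = dx$, so the Rm-NML reduces exactly to conventional NML: $-\log p_{\rm Rm\text{-}NML}(x) = -\log p(x|\hat\theta(x)) + \log \int_{\mathbb{R}^D} p(y|\hat\theta(y))\,dy,$ matching the original Shtarkov code-length and ensuring compatibility with classical minimum description length (MDL) theory.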

4. Asymptotic and Computational Properties

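For sample size $n \to \infty$ and under standard regularity conditions, the normalizing constant $C_n$ can be approximated (via saddle-point asymptotics) as: $\log C_n = \frac{k}{2}\log\frac{n}{2\pi} + \log \int_\Theta \sqrt{\det I(\theta)}\,d\theta + o(1),$ where $k = \dim \Theta$ and $I(\theta)$ is the Fisher information: $I(\theta) = \int_\mathcal{M} p_\operatorname{vol}(x|\theta)\,\nabla_\theta \log p_\operatorname{vol}(x|\theta)\,\nabla_\theta \log p_\operatorname{vol}(x|\theta)^\top\,d\operatorname{vol}(x).$ This result replaces Lebesgue measure with the manifold volume and incorporates explicit chart transformations, mirroring the derivation of Rissanen (1996) for stochastic complexity.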
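To give a feel for how accurate an approximation of the Rissanen type $\frac{k}{2}\log\frac{n}{2\pi} + \log\int_\Theta\sqrt{\det I(\theta)}\,d\theta$ can be, the sketch below compares it with the exact NML normalizer in the classical Bernoulli model. The Bernoulli model is an illustrative choice, not an example from the source; there $k = 1$ and the Jeffreys integral equals $\pi$.

```python
import math

def exact_log_nml_normalizer(n):
    """log of sum_k C(n,k) (k/n)^k ((n-k)/n)^(n-k) for the Bernoulli model."""
    total = 0.0
    for k in range(n + 1):
        log_term = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        if k > 0:
            log_term += k * math.log(k / n)
        if k < n:
            log_term += (n - k) * math.log((n - k) / n)
        total += math.exp(log_term)
    return math.log(total)

def asymptotic_log_nml_normalizer(n):
    """(k/2) log(n / 2 pi) + log of the Jeffreys integral, with k = 1."""
    k = 1
    jeffreys_integral = math.pi  # integral of 1/sqrt(t(1-t)) over (0, 1)
    return 0.5 * k * math.log(n / (2.0 * math.pi)) + math.log(jeffreys_integral)

def gap(n):
    """Exact minus asymptotic log-normalizer."""
    return exact_log_nml_normalizer(n) - asymptotic_log_nml_normalizer(n)

gap_100 = gap(100)
gap_1000 = gap(1000)
```

The gap shrinks as `n` grows, consistent with the $o(1)$ remainder; on a manifold the same comparison would require integrating against $d\operatorname{vol}$ rather than summing over counts.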

5. Riemannian Symmetric Spaces and Simplifications

When $\mathcal{M}$ is a Riemannian symmetric space and the density $p_\operatorname{vol}(x|\mu,\sigma)$ depends on $(x,\mu)$ only via their geodesic distance $d(x,\mu)$ plus Euclidean nuisance parameters $\sigma$, the Fisher information matrix exhibits a block-diagonal structure: $I(\mu,\sigma) = \begin{pmatrix} I_\mu(\sigma) & 0 \\ 0 & I_\sigma(\sigma) \end{pmatrix},$ where $I_\mu(\sigma)$ (expressed in an orthonormal frame at $\mu$) and $I_\sigma(\sigma)$ are independent of $\mu$ by manifold homogeneity. Consequently,

$\int \sqrt{\det I(\mu,\sigma)}\,d\operatorname{vol}(\mu)\,d\sigma = \operatorname{vol}(\Theta_\mu) \int \sqrt{\det I_\mu(\sigma)\,\det I_\sigma(\sigma)}\,d\sigma,$

where $\Theta_\mu \subseteq \mathcal{M}$ is the set of admissible location parameters.

In this setting, integration over the data manifold is replaced by the volume of the parameter manifold, simplifying computation substantially.

6. Explicit Hyperbolic Gaussian Case

For $D$-dimensional hyperbolic space $\mathbb{H}^D$ (curvature $-1$), the Riemannian Gaussian (R-GD) density is defined as: $p_\operatorname{vol}(x|\mu,\sigma) = \frac{1}{Z(\sigma)}\exp\!\left(-\frac{d(x,\mu)^2}{2\sigma^2}\right),$ with the normalizing factor

$Z(\sigma) = \operatorname{vol}(S^{D-1}) \int_0^\infty \exp\!\left(-\frac{r^2}{2\sigma^2}\right) \sinh^{D-1}(r)\,dr.$
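A normalizing factor of this kind is a one-dimensional radial integral and can be evaluated by quadrature. The sketch below assumes the standard form on curvature $-1$ hyperbolic space, $Z(\sigma) = \operatorname{vol}(S^{D-1})\int_0^\infty e^{-r^2/(2\sigma^2)}\sinh^{D-1}(r)\,dr$ with $D \ge 2$. The function names, grid size, and truncation radius are illustrative choices. The integrand is handled in log space to avoid overflow of $\sinh^{D-1}$.

```python
import math
import numpy as np

def sphere_area(D):
    """Surface area of the unit sphere S^{D-1}."""
    return 2.0 * math.pi ** (D / 2.0) / math.gamma(D / 2.0)

def log_Z(sigma, D, num=200001):
    """log Z(sigma) by the trapezoid rule on a truncated radial grid (D >= 2)."""
    # The integrand peaks near r = (D - 1) sigma^2 with width about sigma.
    r_max = (D - 1) * sigma**2 + 12.0 * sigma
    r = np.linspace(0.0, r_max, num)[1:]  # drop r = 0, where the integrand is 0
    log_sinh = r + np.log1p(-np.exp(-2.0 * r)) - math.log(2.0)
    log_f = -(r**2) / (2.0 * sigma**2) + (D - 1) * log_sinh
    m = float(log_f.max())
    f = np.exp(log_f - m)
    h = r_max / (num - 1)
    integral = h * (float(f.sum()) - 0.5 * float(f[-1]))  # trapezoid with f(0) = 0
    return math.log(sphere_area(D)) + m + math.log(integral)

def log_Z_closed_form_2d(sigma):
    """Closed form for D = 2: 2 pi * sigma * sqrt(pi/2) * exp(sigma^2/2) * erf(sigma/sqrt 2)."""
    return math.log(2.0 * math.pi * sigma * math.sqrt(math.pi / 2.0)
                    * math.exp(sigma**2 / 2.0) * math.erf(sigma / math.sqrt(2.0)))

log_z_numeric = log_Z(1.0, 2)
log_z_exact = log_Z_closed_form_2d(1.0)
```

For $D = 2$ the quadrature can be validated against the closed form above; for higher dimensions the same routine applies without a simple closed form to compare against.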

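For data $x^n = (x_1, \dots, x_n)$, the MLEs are the Riemannian–Fréchet mean $\hat\mu = \arg\min_{\mu} \sum_{i=1}^n d(x_i,\mu)^2$ and variance estimator $\hat\sigma$. The fitted log-likelihood is

$\log p_\operatorname{vol}(x^n|\hat\mu,\hat\sigma) = -n \log Z(\hat\sigma) - \frac{1}{2\hat\sigma^2} \sum_{i=1}^n d(x_i,\hat\mu)^2,$

with $Z(\sigma)$ the R-GD normalizing factor.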

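Applying the symmetric-space formula, the Rm-NML normalizing constant (Corollary 6.1 (Fukuzawa et al., 29 Aug 2025)) is

$\log C_n \approx \frac{D+1}{2}\log\frac{n}{2\pi} + \log \operatorname{vol}(B_R) + \log \int \sqrt{\det I_\mu(\sigma)\,I_\sigma(\sigma)}\,d\sigma,$

where $I_\mu(\sigma)$ and $I_\sigma(\sigma)$ are the location and scale blocks of the Fisher information, $\operatorname{vol}(B_R)$ is a geodesic ball volume (for restricted $\mu$), and

$\operatorname{vol}(B_R) = \operatorname{vol}(S^{D-1}) \int_0^R \sinh^{D-1}(r)\,dr.$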

The full Rm-NML code-length for hyperbolic Gaussian models is then

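$L_{\rm Rm\text{-}NML}(x^n) = n \log Z(\hat\sigma) + \frac{1}{2\hat\sigma^2} \sum_{i=1}^n d(x_i,\hat\mu)^2 + \log C_n,$

with $Z(\hat\sigma)$ the fitted normalizing factor and $C_n$ the Rm-NML normalizing constant.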

7. Practical Computation and Applications

The remaining integrals in the normalizing constant, over Euclidean parameters or geodesic ball volumes, are amenable to numerical quadrature, Monte Carlo, or, in the Euclidean limit, Fourier methods. Optimization for the Fréchet mean on $\mathbb{H}^D$ is addressed by Riemannian gradient descent (Bonnabel 2013), while $\sigma$ has a closed-form estimator.
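The Fréchet mean step can be sketched as plain Riemannian gradient descent. The code below is an illustration rather than the authors' implementation: it assumes the hyperboloid (Lorentz) model of curvature $-1$ hyperbolic space, a fixed step size, and hypothetical function names. The descent direction is the average of the logarithm maps of the data at the current estimate, which is the negative Riemannian gradient of $\frac{1}{2n}\sum_i d(x_i,\mu)^2$.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product -x0*y0 + sum_i xi*yi (broadcasts over rows)."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def exp_map(x, v):
    """Exponential map at x applied to a tangent vector v."""
    norm_v = np.sqrt(max(float(lorentz_inner(v, v)), 0.0))
    if norm_v < 1e-12:
        return x
    return np.cosh(norm_v) * x + np.sinh(norm_v) * v / norm_v

def log_map(x, ys):
    """Logarithm map at x for each row of ys, returned as rows of tangent vectors."""
    c = np.maximum(-lorentz_inner(x, ys), 1.0)   # cosh of the geodesic distance
    d = np.arccosh(c)
    sinh_d = np.sqrt(c**2 - 1.0)
    scale = np.where(sinh_d > 1e-12, d / np.maximum(sinh_d, 1e-12), 1.0)
    return scale[:, None] * (ys - c[:, None] * x)

def frechet_mean(points, step=0.5, tol=1e-10, max_iter=1000):
    """Riemannian gradient descent for the minimizer of the mean squared distance."""
    mu = points[0].copy()
    for _ in range(max_iter):
        direction = log_map(mu, points).mean(axis=0)
        if np.sqrt(max(float(lorentz_inner(direction, direction)), 0.0)) < tol:
            break
        mu = exp_map(mu, step * direction)
        mu = mu / np.sqrt(-lorentz_inner(mu, mu))  # re-project against numerical drift
    return mu

# Four points placed symmetrically around the origin (1, 0, 0) of the hyperboloid.
points = np.array([
    [np.cosh(1.0),  np.sinh(1.0), 0.0],
    [np.cosh(1.0), -np.sinh(1.0), 0.0],
    [np.cosh(0.5), 0.0,  np.sinh(0.5)],
    [np.cosh(0.5), 0.0, -np.sinh(0.5)],
])
mu_hat = frechet_mean(points)
```

By symmetry the mean of this configuration is the origin of the hyperboloid, which gives a simple correctness check. A fixed step of 0.5 is a conservative choice for moderately spread data; widely dispersed data may call for a smaller step or a line search.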

Once implemented, the Rm-NML framework enables fully coordinate-invariant model selection, regret minimization, and MDL-based coding on manifold-valued data. Notably:

  • For hierarchical data or graph embeddings in hyperbolic space, hyperbolic-Gaussian Rm-NML enables selection of both embedding dimension $D$ and curvature (via the geodesic radius $R$), fully respecting the underlying data geometry.
  • The framework generalizes to any manifold admitting a Riemannian structure, supporting applications where geometric structure is intrinsic to the data.

A plausible implication is that Rm-NML facilitates rigorous model selection and information-theoretic analyses for emerging applications in geometric deep learning, network analysis, and manifold-based statistical inference, particularly where non-Euclidean geometries are required (Fukuzawa et al., 29 Aug 2025).
