
Fisher–Rao Path Length Overview

Updated 22 December 2025
  • Fisher–Rao Path Length is defined as the intrinsic Riemannian distance between probability distributions, quantifying the minimal statistical path using the Fisher information metric.
  • It offers both closed-form and numerical methods for computing geodesics and distances across various models including discrete, exponential, and Gaussian families.
  • The metric underpins applications in information geometry and statistical mechanics, aiding in entropy minimization and efficient state transport.

The Fisher–Rao path length is the intrinsic Riemannian length between probability measures or statistical models, induced by the Fisher information metric on the space of densities, probability vectors, or parametric distributions. This length quantifies the minimal “statistical distance” required to traverse a smooth path between two distributions and underpins statistical geometry, information theory, and related fields. The Fisher–Rao metric gives rise to a rich geometric structure on the space of probability measures, with explicit analytic, variational, and dynamical representations, supporting both closed-form and numerical evaluation in various settings.

1. Fisher–Rao Metric on Probability Spaces

The Fisher–Rao metric is defined as the Riemannian metric on the manifold of probability measures or densities. For a strictly positive density $\rho$ on a (possibly infinite-dimensional) domain $M$,

$$\langle \zeta_1, \zeta_2 \rangle_{\mathrm{FR}, \rho} = \int_M \frac{\zeta_1(x)\, \zeta_2(x)}{\rho(x)} \, dx$$

where $\zeta_1, \zeta_2$ are tangent vectors, typically (for the statistical manifold of densities) functions $a(x)$ satisfying $\int_M a(x)\, dx = 0$ (Chizat et al., 2015, Bauer et al., 2023, Diósi, 5 Oct 2024).

On the probability simplex $\Delta_m$ of discrete distributions $p = (p_0, \ldots, p_m)$, the Fisher–Rao metric has ambient form

$$ds^2 = \sum_{i=0}^m \frac{(dp_i)^2}{p_i}$$

with the constraint $\sum_i p_i = 1$ enforced on the tangent space (Zhang, 6 Aug 2025).

For parametric statistical models $\{p_\theta(x)\}$, the Fisher information matrix defines the Fisher–Rao metric:

$$g_{ij}(\theta) = \mathbb{E}_\theta \left[ \partial_{\theta^i} \log p_\theta(x) \, \partial_{\theta^j} \log p_\theta(x) \right]$$

(Mielke, 2 Oct 2025, Nielsen, 15 Mar 2024).
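As a concrete check of the definition of $g_{ij}(\theta)$, the following minimal sketch approximates the Fisher information matrix of a univariate Gaussian by numerical quadrature. The parameterization $\theta = (\mu, \sigma)$, the function name, and the grid settings are assumptions made for illustration; the expected result $\mathrm{diag}(1/\sigma^2, 2/\sigma^2)$ is the standard value for this family.

```python
import numpy as np

def gaussian_fisher_matrix(mu, sigma, half_width=12.0, n=200_001):
    """Approximate g_ij = E[d_i log p * d_j log p] for N(mu, sigma^2), theta = (mu, sigma)."""
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, n)
    dx = x[1] - x[0]
    z = (x - mu) / sigma
    pdf = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    # Score functions: partial derivatives of log p with respect to mu and sigma.
    score = np.stack([z / sigma, (z**2 - 1.0) / sigma])
    # Riemann-sum approximation of the expected outer product of the scores.
    return (score[:, None, :] * score[None, :, :] * pdf).sum(axis=-1) * dx

G = gaussian_fisher_matrix(mu=0.3, sigma=1.5)
print(G)  # close to diag(1/sigma^2, 2/sigma^2)
```

The off-diagonal entries vanish up to quadrature error, which is why the $(\mu, \sigma)$ geometry reduces to a rescaled hyperbolic half-plane.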

2. Path Length and Geodesic Formulation

Given a curve of distributions $t \mapsto \rho_t$, the Fisher–Rao path length over $[0,1]$ is

$$L_{\mathrm{FR}}[\rho_{\cdot}] = \int_0^1 \left( \int_M \frac{[\partial_t \rho_t(x)]^2}{\rho_t(x)} \, dx \right)^{1/2} dt$$

(Chizat et al., 2015, Bauer et al., 2023). For parametric models, the length functional becomes

$$L[\theta(\cdot)] = \int_0^1 \sqrt{ \dot{\theta}^i(t) \, g_{ij}(\theta(t))\, \dot{\theta}^j(t) }\, dt$$

and geodesics are the solutions of the Euler–Lagrange equations

$$\ddot{\theta}^k + \Gamma^k_{ij}(\theta)\, \dot{\theta}^i\dot{\theta}^j = 0$$

where $\Gamma^k_{ij}$ are the Christoffel symbols of the Fisher–Rao metric (Mielke, 2 Oct 2025, Bauer et al., 2023, Nielsen, 15 Mar 2024).
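The parametric length functional can be evaluated numerically for any sampled curve. The sketch below uses a midpoint rule; the helper names `path_length` and `bernoulli_metric` are hypothetical, and the Bernoulli family with $g(\theta) = 1/(\theta(1-\theta))$ is chosen only because its length has the elementary closed form $2(\arcsin\sqrt{b} - \arcsin\sqrt{a})$ to compare against.

```python
import numpy as np

def path_length(theta, metric):
    """Midpoint-rule approximation of L = int sqrt(dtheta^i g_ij dtheta^j) for a sampled curve.

    theta  : array of shape (T, d), points along the curve
    metric : callable returning the (d, d) Fisher matrix at a point
    """
    theta = np.asarray(theta, dtype=float)
    total = 0.0
    for a, b in zip(theta[:-1], theta[1:]):
        step = b - a
        g = metric(0.5 * (a + b))  # metric evaluated at the segment midpoint
        total += np.sqrt(step @ g @ step)
    return total

def bernoulli_metric(theta):
    """Fisher information of a Bernoulli(theta) model as a 1x1 matrix."""
    return np.array([[1.0 / (theta[0] * (1.0 - theta[0]))]])

curve = np.linspace(0.1, 0.9, 2001).reshape(-1, 1)
numeric = path_length(curve, bernoulli_metric)
exact = 2.0 * (np.arcsin(np.sqrt(0.9)) - np.arcsin(np.sqrt(0.1)))
print(numeric, exact)
```

In one dimension every monotone curve is a geodesic up to reparameterization, so here the path length equals the distance; in higher dimensions the same routine only gives the length of the particular curve supplied.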

On the simplex and other spherical models, geodesics correspond to constant-speed great-circle arcs on the sphere; on location-scale and other hyperbolic models, geodesics correspond to constant-speed geodesics of the Poincaré half-plane or related symmetric spaces (Miyamoto et al., 2023).

3. Closed-Form Geodesics and Distance Formulae

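In several prominent cases, the Fisher–Rao geodesic and its length can be given in closed form. On the space of smooth probability densities on a manifold $M$, the square-root mapping $\rho \mapsto 2\sqrt{\rho}$ embeds the space isometrically into the sphere of radius 2 in $L^2(M)$. With the metric normalization of Section 1, the distance is therefore

$$d_{\mathrm{FR}}(\rho_0, \rho_1) = 2\arccos \left( \int_M \sqrt{\rho_0(x) \, \rho_1(x)} \, dx \right)$$

(some references rescale the metric by $1/4$, which removes the factor 2). The constant-speed geodesic is the great-circle arc in square-root coordinates,

$$\sqrt{\rho_t(x)} = \frac{\sin((1-t)\theta)\, \sqrt{\rho_0(x)} + \sin(t\theta)\, \sqrt{\rho_1(x)}}{\sin\theta}, \qquad \theta = \arccos \left( \int_M \sqrt{\rho_0(x)\, \rho_1(x)} \, dx \right)$$

(Bauer et al., 2023, Chizat et al., 2015). The linear interpolation $[(1-t) \sqrt{\rho_0(x)} + t \sqrt{\rho_1(x)}]^2$ traces the same curve once renormalized, but it is not itself normalized and does not have constant speed. The analogous formula on the probability simplex is

$$d_{\mathrm{FR}}(p, q) = 2 \arccos \left( \sum_{i=0}^m \sqrt{p_i q_i} \right)$$

(Miyamoto et al., 2023, Zhang, 6 Aug 2025).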
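The simplex formula is short enough to implement directly. The sketch below computes the distance from the Bhattacharyya coefficient and, as an assumption of this example rather than something taken from the cited papers, builds intermediate points by spherical interpolation in square-root coordinates so that they stay normalized. The function names are illustrative.

```python
import numpy as np

def fisher_rao_distance(p, q):
    """2 * arccos of the Bhattacharyya coefficient between two probability vectors."""
    bc = np.sum(np.sqrt(np.asarray(p, float) * np.asarray(q, float)))
    return 2.0 * np.arccos(np.clip(bc, -1.0, 1.0))

def fisher_rao_geodesic(p, q, t):
    """Point at time t on the great-circle arc between sqrt(p) and sqrt(q), squared back."""
    u, v = np.sqrt(p), np.sqrt(q)
    angle = np.arccos(np.clip(u @ v, -1.0, 1.0))
    if angle < 1e-12:  # p and q coincide, nothing to interpolate
        return np.array(p, dtype=float)
    w = (np.sin((1.0 - t) * angle) * u + np.sin(t * angle) * v) / np.sin(angle)
    return w**2

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.3, 0.6])
mid = fisher_rao_geodesic(p, q, 0.5)
print(fisher_rao_distance(p, q))
print(mid, mid.sum())
```

The midpoint sums to one and splits the distance into two equal halves, which is a quick sanity check that the curve is traversed at constant speed.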

For location-scale models and other families with hyperbolic geometry, whose Fisher metric takes the form $ds^2 = (a_h\, d\mu^2 + b_h\, d\sigma^2)/\sigma^2$, the Fisher–Rao path length reduces to a rescaled hyperbolic distance:

$$d_{\mathrm{FR}}(\mu_1, \sigma_1; \mu_2, \sigma_2) = 2\sqrt{b_h} \, \operatorname{arctanh} \left( \sqrt{ \frac{a_h (\mu_2-\mu_1)^2 + b_h (\sigma_2-\sigma_1)^2 }{ a_h (\mu_2-\mu_1)^2 + b_h (\sigma_2+\sigma_1)^2 } } \right)$$

for Fisher metric coefficients $a_h, b_h$, which specialize for normal, Laplace, logistic, and Student-$t$ families (Miyamoto et al., 2023, Mielke, 2 Oct 2025).
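The hyperbolic formula above can be evaluated directly. A minimal sketch follows; the function name is illustrative, and the default coefficients $a_h = 1$, $b_h = 2$ are the normal-family values implied by the univariate Gaussian entry in the table of Section 7. Other families would need their own coefficients.

```python
import numpy as np

def location_scale_fr_distance(mu1, s1, mu2, s2, a=1.0, b=2.0):
    """Closed-form Fisher-Rao distance for a metric (a dmu^2 + b dsigma^2) / sigma^2."""
    num = a * (mu2 - mu1) ** 2 + b * (s2 - s1) ** 2
    den = a * (mu2 - mu1) ** 2 + b * (s2 + s1) ** 2
    return 2.0 * np.sqrt(b) * np.arctanh(np.sqrt(num / den))

# Pure scale change: the formula collapses to sqrt(2) * |log(s2 / s1)|.
d_scale = location_scale_fr_distance(0.0, 1.0, 0.0, np.e)
# Tiny mean shift at fixed sigma: the distance is close to |dmu| / sigma.
d_shift = location_scale_fr_distance(0.0, 1.0, 1e-4, 1.0)
print(d_scale, d_shift)
```

Both limiting cases agree with the local form of the metric, which is a useful check when plugging in coefficients for another family.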

On the manifold of zero-mean Gaussian covariances $\Sigma \in \mathrm{Sym}^+(p)$, the geodesic and distance are

$$\Sigma(t) = \Sigma_1^{1/2} \left( \Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2} \right)^t \Sigma_1^{1/2}$$

$$d_{\mathrm{FR}}(\Sigma_1, \Sigma_2) = \sqrt{ \frac{1}{2} \mathrm{tr} \left[ \log^2\left(\Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2}\right) \right] }$$

(Wells et al., 2020), see also (Bauer et al., 2023, Nielsen, 2023, Nielsen, 2023).
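Both covariance formulas reduce to eigendecompositions of symmetric matrices. The following sketch uses only `numpy.linalg.eigh` and `numpy.linalg.eigvalsh`; the helper names and the example matrices are assumptions made for illustration.

```python
import numpy as np

def _sym_power(S, power):
    """Matrix power of a symmetric positive-definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**power) @ V.T

def spd_geodesic(S1, S2, t):
    """Sigma(t) = S1^{1/2} (S1^{-1/2} S2 S1^{-1/2})^t S1^{1/2}."""
    r, r_inv = _sym_power(S1, 0.5), _sym_power(S1, -0.5)
    return r @ _sym_power(r_inv @ S2 @ r_inv, t) @ r

def spd_fr_distance(S1, S2):
    """sqrt(0.5 * sum(log(lambda_i)^2)) with lambda_i the eigenvalues of S1^{-1/2} S2 S1^{-1/2}."""
    r_inv = _sym_power(S1, -0.5)
    lam = np.linalg.eigvalsh(r_inv @ S2 @ r_inv)
    return np.sqrt(0.5 * np.sum(np.log(lam) ** 2))

S1 = np.array([[2.0, 0.5], [0.5, 1.0]])
S2 = np.array([[1.0, -0.3], [-0.3, 3.0]])
mid = spd_geodesic(S1, S2, 0.5)
print(spd_fr_distance(S1, S2), spd_fr_distance(S1, mid))
```

The whitened matrix $\Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2}$ has the same eigenvalues as $\Sigma_1^{-1}\Sigma_2$ but is symmetric, which keeps the eigenvalue computation numerically stable.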

4. Dynamic and Variational Characterizations

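The path length in Fisher–Rao geometry admits variational and dynamic formulations distinct from those of optimal transport metrics, notably through the “growth equation”

$$\partial_t \mu_t = \xi_t \mu_t$$

with $\mu_0, \mu_1$ prescribed and $\xi_t$ a scalar growth rate. The length is given by

$$L[\mu] = \frac{\sigma}{2} \int_0^1 \|\xi_t\|_{L^2(\mu_t)} \, dt$$

Here $\sigma$ acts as a scaling constant: since $(\partial_t \mu_t)^2 / \mu_t = \xi_t^2 \mu_t$, taking $\sigma = 2$ recovers the length functional $L_{\mathrm{FR}}$ of Section 2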

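(Mielke, 2 Oct 2025). For curves between densities $\rho_0, \rho_1$, a frequently used interpolation is the geometric mixture

$$\rho_t(x) \propto \rho_0(x)^{1-t} \rho_1(x)^t$$

(Maurais et al., 8 Jan 2024). This curve is the exponential-connection geodesic of information geometry, not the Riemannian (Levi-Civita) geodesic of the Fisher–Rao metric. Its Fisher–Rao length is therefore in general only an upper bound on the closed-form distance of Section 3, which is proportional to $\arccos\left(\int \sqrt{\rho_0 \rho_1}\, dx\right)$ and is attained by the great-circle arc in square-root coordinates.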

5. Computational Methods and Approximations

For nontrivial models without analytic geodesics, the Fisher–Rao path length must be computed numerically. Techniques include piecewise approximation using small-step geodesic distances (locally approximated by the square root of the Jeffreys divergence), shortest-path “Fisher–Manhattan” upper bounds, Calvo–Oller isometric embeddings into higher-dimensional symmetric positive-definite cones, and adaptive subdivisions for multiplicative error guarantees (Nielsen, 2023, Nielsen, 2023, Nielsen, 15 Mar 2024).
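The piecewise approximation mentioned above can be sketched for univariate Gaussians, where the Jeffreys (symmetrized Kullback–Leibler) divergence is available in closed form. The function names and the test curve are assumptions for illustration: the curve is a pure scale change from $\sigma = 1$ to $\sigma = e$ at fixed mean, whose exact Fisher–Rao length is $\sqrt{2}\,\log(e/1) = \sqrt{2}$.

```python
import numpy as np

def jeffreys_gaussian(mu1, s1, mu2, s2):
    """Symmetrized KL divergence between N(mu1, s1^2) and N(mu2, s2^2)."""
    dmu2 = (mu1 - mu2) ** 2
    return (s1**2 + dmu2) / (2 * s2**2) + (s2**2 + dmu2) / (2 * s1**2) - 1.0

def approx_length(mus, sigmas):
    """Sum of sqrt(Jeffreys) over consecutive points of a finely sampled curve."""
    total = 0.0
    for i in range(len(mus) - 1):
        j = jeffreys_gaussian(mus[i], sigmas[i], mus[i + 1], sigmas[i + 1])
        total += np.sqrt(max(j, 0.0))  # guard against tiny negative rounding errors
    return total

n = 100
sigmas = np.exp(np.linspace(0.0, 1.0, n + 1))  # sigma goes from 1 to e
mus = np.zeros(n + 1)
approx = approx_length(mus, sigmas)
print(approx, np.sqrt(2.0))
```

Each step slightly overestimates the local distance, and the error shrinks quadratically in the step size, so refining the subdivision tightens the estimate.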

In Gaussian and elliptical families, precise lower and upper bounds are derived from SPD-geometry and Birkhoff/Hilbert projective distances, enabling efficient and certified computation even in moderate dimensions (Nielsen, 15 Mar 2024, Nielsen, 2023). The adaptive midpoint-refinement algorithm achieves $(1+\epsilon)$-factor approximation in $O(d^3 \log(1/\epsilon))$ time for Gaussians (Nielsen, 2023).

6. Applications and Operational Interpretations

The Fisher–Rao path length quantifies minimal entropy production in near-reversible state transport: in both quantum and classical statistical mechanics, the geodesic length yields the sharp lower bound on total entropy generated when moving a system by infinitesimal sequential equilibrations (Diósi, 5 Oct 2024). The Bhattacharyya (Hellinger) fidelity between initial and final states encodes this irreversibility cost, with the statistical length given by

$$d_{\mathrm{FR}}(p, q) = 2 \arccos F(p, q), \quad F(p,q) = \sum_k \sqrt{p_k q_k}$$

where $F$ is the Bhattacharyya coefficient (Diósi, 5 Oct 2024, Zhang, 6 Aug 2025).

In masked discrete diffusion models, the Fisher–Rao geodesic coincides with the “cosine schedule”: the time-dependent masking parameter $\alpha_t = \cos^2(\frac{\pi}{2} t)$ traverses the probability path at constant FR speed, minimizing path length in the metric (Zhang, 6 Aug 2025).
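The constant-speed property can be checked numerically under a simplifying assumption that is not spelled out in the text above: each token is modeled as a two-point distribution $(\alpha_t, 1-\alpha_t)$ over unmasked and masked states, whose Fisher–Rao speed is $|\dot{\alpha}_t| / \sqrt{\alpha_t (1-\alpha_t)}$. The sketch compares the cosine schedule with a linear schedule, which is included only as a contrast.

```python
import numpy as np

def bernoulli_fr_speed(alpha, t, h=1e-6):
    """FR speed |d alpha / dt| / sqrt(alpha (1 - alpha)) of the curve (alpha_t, 1 - alpha_t)."""
    a = alpha(t)
    da = (alpha(t + h) - alpha(t - h)) / (2.0 * h)  # central finite difference
    return np.abs(da) / np.sqrt(a * (1.0 - a))

def cosine_schedule(t):
    return np.cos(0.5 * np.pi * t) ** 2

def linear_schedule(t):
    return 1.0 - t

ts = np.linspace(0.05, 0.95, 19)
cos_speed = bernoulli_fr_speed(cosine_schedule, ts)
lin_speed = bernoulli_fr_speed(linear_schedule, ts)
print(cos_speed.min(), cos_speed.max())  # both close to pi
print(lin_speed.min(), lin_speed.max())  # varies along the path
```

For the cosine schedule the speed equals $\pi$ at every time, so the total length over $[0,1]$ is $\pi$, which matches $2\arccos(0)$ for the two endpoint distributions.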

Applications include statistical signal discrimination, information geometry in clustering and model selection, Wasserstein–Fisher–Rao interpolations in computational imaging, and sample-efficient gradient flows in variational inference and particle methods (Florin, 20 May 2025, Chizat et al., 2015, Marti et al., 2016, Maurais et al., 8 Jan 2024).

7. Notable Families and Unification

Closed-form Fisher–Rao path lengths are available for a wide class of models, summarized in the following table (see (Miyamoto et al., 2023, Mielke, 2 Oct 2025)):

| Model Family | Fisher–Rao Distance Formula | Reference |
|---|---|---|
| Discrete simplex, $p, q$ | $2\arccos\left( \sum_{i=1}^n \sqrt{p_i q_i} \right)$ | (Miyamoto et al., 2023) |
| Exponential, $\alpha, \beta$ | $\lvert \log \alpha - \log \beta \rvert$ | (Mielke, 2 Oct 2025) |
| Poisson, $\lambda_1, \lambda_2$ | $2 \lvert \sqrt{\lambda_2} - \sqrt{\lambda_1} \rvert$ | (Miyamoto et al., 2023) |
| Gaussian, univariate, $\mu_j, \sigma_j$ | $2\sqrt{2} \operatorname{arctanh} \sqrt{ \frac{ (\mu_2-\mu_1)^2 + 2 (\sigma_2-\sigma_1)^2 }{ (\mu_2-\mu_1)^2 + 2 (\sigma_2+\sigma_1)^2 } }$ | (Miyamoto et al., 2023) |
| Zero-mean MVN, $\Sigma_1, \Sigma_2$ | $\sqrt{ \frac{1}{2} \sum_{i=1}^d [\log \lambda_i]^2 }$, with $\lambda_i$ the eigenvalues of $\Sigma_1^{-1} \Sigma_2$ | (Wells et al., 2020) |
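The one-parameter rows of the table can be verified by integrating $\sqrt{g(\theta)}$ numerically, since in one dimension the geodesic distance is just that integral. The sketch assumes the standard Fisher information values $g(\lambda) = 1/\lambda$ for the Poisson family and $g(\lambda) = 1/\lambda^2$ for the exponential family in its rate parameter; the function name is illustrative.

```python
import numpy as np

def one_param_distance(sqrt_g, a, b, n=200_000):
    """Midpoint-rule integral of sqrt(g(theta)) between a and b."""
    lo, hi = min(a, b), max(a, b)
    edges = np.linspace(lo, hi, n + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return np.sum(sqrt_g(mids)) * (hi - lo) / n

# Poisson: compare with 2 * |sqrt(lambda_2) - sqrt(lambda_1)|.
poisson_numeric = one_param_distance(lambda lam: 1.0 / np.sqrt(lam), 1.0, 4.0)
poisson_closed = 2.0 * abs(np.sqrt(4.0) - np.sqrt(1.0))

# Exponential (rate parameter): compare with |log(alpha) - log(beta)|.
expo_numeric = one_param_distance(lambda lam: 1.0 / lam, 0.5, 3.0)
expo_closed = abs(np.log(0.5) - np.log(3.0))

print(poisson_numeric, poisson_closed)
print(expo_numeric, expo_closed)
```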

This unification across models highlights the pivotal role of the Fisher–Rao statistical length as the Riemannian geodesic distance in information geometry. Complex cases may require numerical approximation, but for one- and two-parameter models, closed forms are prevalent (Miyamoto et al., 2023, Nielsen, 15 Mar 2024).


In summary, the Fisher–Rao path length provides a canonical notion of statistical distance with deep geometric, variational, and operational significance, with tractable and explicit models at the core of statistical manifold theory (Chizat et al., 2015, Bauer et al., 2023, Mielke, 2 Oct 2025, Diósi, 5 Oct 2024, Miyamoto et al., 2023, Nielsen, 15 Mar 2024).
