Fisher–Rao Geodesic Optimal Schedule

Updated 3 March 2026

The Fisher–Rao–Geodesic Optimal Schedule is a method for interpolating probability densities at constant speed under the Fisher–Rao metric.
It unifies statistical manifold geometry, optimal transport (WF-δ), and generative model interpolation through explicit closed-form solutions.
The approach minimizes the total information length via optimal time reparametrization and offers efficient numerical implementations in imaging and diffusion models.

The Fisher–Rao–Geodesic Optimal Schedule designates any interpolation schedule, time reparametrization, or discretization scheme for probability densities or general nonnegative measures that realizes constant-speed motion (“geodesic”) under the Fisher–Rao metric or related metrics on the space of densities, measures, or measure-valued data. This paradigm unifies the Fisher–Rao geometry of statistical manifolds, nonlinear interpolation on density spaces, geodesic scheduling in generative models such as masked diffusion, the transport–source (“WF-δ”) interpolation between optimal transport and Fisher–Rao, and related structures for matrix-valued measures and varifold metamorphosis. The optimal schedule is defined as the unique parametrization for which the Fisher–Rao metric length is traversed at constant rate, minimizing the total “information length” or squared Fisher–Rao energy, and in all principal geometries of densities admits a closed-form solution.

1. Dynamical Formulation and Geodesic Equations

The Fisher–Rao metric provides a canonical Riemannian structure on the space of positive densities or measures. For general measures $\rho_0,\rho_1$ on a convex domain $\Omega\subset\mathbb{R}^d$ , the interpolation between optimal transport and Fisher–Rao is given by the “WF-δ” functional (Chizat et al., 2015): $\mathrm{WF}_\delta^2(\rho_0,\rho_1) = \inf_{(\rho, v, g)} \frac{1}{2} \int_0^1 \int_\Omega \left[|v(t,x)|^2 \rho(t,x) + \delta^2 |g(t,x)|^2 \rho(t,x)\right]\, dx dt$ subject to the augmented continuity equation

$\partial_t \rho + \nabla_x \cdot (\rho v) = g \rho,\qquad \rho(0)=\rho_0,\, \rho(1)=\rho_1,$

with $v$ the velocity field and $g$ the source (mass generation) term. Writing $M = \rho v$ for the momentum and $Z = \rho g$ for the source density, Euler–Lagrange analysis yields the optimality conditions $M = \rho \nabla\varphi,\qquad Z = \frac{\rho \varphi}{\delta^2},\qquad \partial_t\varphi + \frac{1}{2}\left(|\nabla\varphi|^2 + \frac{\varphi^2}{\delta^2}\right) = 0,$ where $\varphi$ is the dual potential. Conservation of the Hamiltonian implies that the total energy, kinetic plus source, is expended at a constant rate in time. In the pure Fisher–Rao regime ( $v\equiv0$ ), the explicit geodesic schedule is

$\rho(t, x) = \left((1-t)\sqrt{\rho_0(x)} + t \sqrt{\rho_1(x)}\right)^2,$

which is the globally minimizing Fisher–Rao geodesic (Chizat et al., 2015).
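
A minimal numerical sketch of this interpolant, assuming densities sampled on a uniform 1D grid. The grid, the two Gaussian-shaped endpoint measures, and the helper names `fisher_rao_geodesic` and `fisher_rao_speed_sq` are illustrative choices, not taken from the cited work. It checks that the squared Fisher–Rao speed $\int (\partial_t\rho)^2/\rho\,dx$ takes the same value at every $t$:

```python
import numpy as np

def fisher_rao_geodesic(rho0, rho1, t):
    """Square-root interpolation ((1 - t) sqrt(rho0) + t sqrt(rho1))^2."""
    f0, f1 = np.sqrt(rho0), np.sqrt(rho1)
    return ((1.0 - t) * f0 + t * f1) ** 2

def fisher_rao_speed_sq(rho0, rho1, t, dx, h=1e-5):
    """Central-difference estimate of int (d_t rho)^2 / rho dx at time t."""
    rho = fisher_rao_geodesic(rho0, rho1, t)
    drho = (fisher_rao_geodesic(rho0, rho1, t + h)
            - fisher_rao_geodesic(rho0, rho1, t - h)) / (2.0 * h)
    return float(np.sum(drho ** 2 / rho) * dx)

# Illustrative endpoints (assumed): a unit-mass Gaussian and a mass-2 Gaussian.
x = np.linspace(-5.0, 5.0, 2001)
dx = x[1] - x[0]
rho0 = np.exp(-0.5 * (x + 1.0) ** 2) / np.sqrt(2.0 * np.pi)
rho1 = 2.0 * np.exp(-(x - 1.0) ** 2) / np.sqrt(np.pi)

speeds = [fisher_rao_speed_sq(rho0, rho1, t, dx) for t in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(speeds)
```

Because $\sqrt{\rho(t)}$ is affine in $t$, the integrand reduces to $4(\sqrt{\rho_1}-\sqrt{\rho_0})^2$, so the printed values should agree up to rounding, even though the total mass changes along the path.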

2. Fisher–Rao Geodesics and Optimal Parametrization

The classical Fisher–Rao metric on densities $\mu$ on a compact manifold $M$ is

$G^{\rm FR}_\mu(\alpha, \beta) = \int_M \frac{\alpha}{\mu} \frac{\beta}{\mu} \mu,$

where $\alpha, \beta$ are tangent vectors (signed measures with zero total mass for probability densities) (Bruveris et al., 2016). In square-root coordinates $f = \sqrt{\mu}$ , the metric becomes four times the flat $L^2$ metric, so on the space of all positive densities geodesics are straight lines in $L^2$ : $f(t) = (1-t) f_0 + t f_1, \qquad \mu(t) = [f(t)]^2.$ With the normalization above, the Fisher–Rao distance is $2\| f_1 - f_0 \|_{L^2}$ , and the optimal schedule is the identity $s = t$ , since the speed in the Fisher–Rao metric is already constant along this path. For probability densities, the constraint $\|f\|_{L^2} = 1$ restricts $f$ to a sphere, and geodesics are great-circle arcs instead. For non-geodesic families $\nu(t)$ , the arc-length reparametrization

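$s(t) = \frac{1}{L(\nu)} \int_0^t \left\| \dot{\nu}(\tau) \right\|_{G^{\rm FR}} d\tau, \qquad L(\nu) = \int_0^1 \left\| \dot{\nu}(\tau) \right\|_{G^{\rm FR}} d\tau,$

yields a schedule with constant Fisher–Rao speed (Bruveris et al., 2016): $s\mapsto \nu(t(s))$ is parameterized proportionally to arc length. Normalizing by the path length $L(\nu)$ ensures $s(1) = 1$ ; $L(\nu)$ equals the geodesic distance $d(\nu(0),\nu(1))$ only when $\nu$ is itself a geodesic.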
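
The reparametrization can be sketched numerically as follows, assuming a discrete probability family and trapezoidal quadrature. The example path (a linear mixture of two three-point distributions, which is not a Fisher–Rao geodesic) and the names `fr_speed` and `constant_speed_schedule` are hypothetical, chosen only for illustration:

```python
import numpy as np

def fr_speed(nu, dnu):
    """Fisher-Rao speed sqrt(sum_x dnu(x)^2 / nu(x)) for each row of nu."""
    return np.sqrt(np.sum(dnu ** 2 / nu, axis=-1))

def constant_speed_schedule(path, dpath, n_grid=2001):
    """Return t(s), the inverse of the normalized cumulative length, and the path length."""
    t = np.linspace(0.0, 1.0, n_grid)
    speed = fr_speed(path(t), dpath(t))
    # Cumulative trapezoidal integral of the speed.
    increments = 0.5 * (speed[1:] + speed[:-1]) * np.diff(t)
    cum = np.concatenate([[0.0], np.cumsum(increments)])
    total = cum[-1]
    s = cum / total  # normalized so that s(0) = 0 and s(1) = 1

    def t_of_s(s_query):
        # s is strictly increasing because the speed is positive.
        return np.interp(s_query, s, t)

    return t_of_s, total

# Illustrative non-geodesic family (assumed): a linear mixture of p0 and p1.
p0 = np.array([0.7, 0.2, 0.1])
p1 = np.array([0.1, 0.3, 0.6])

def mixture(t):
    t = np.atleast_1d(t)[:, None]
    return (1.0 - t) * p0 + t * p1

def mixture_dot(t):
    t = np.atleast_1d(t)
    return np.broadcast_to(p1 - p0, (t.size, p0.size))

t_of_s, total = constant_speed_schedule(mixture, mixture_dot)
t_sched = t_of_s(np.linspace(0.0, 1.0, 11))
pts = mixture(t_sched)
# Chord lengths in the coordinates 2*sqrt(nu) approximate the per-step arc lengths.
steps = 2.0 * np.linalg.norm(np.sqrt(pts[1:]) - np.sqrt(pts[:-1]), axis=1)
print(t_sched, steps)
```

The rescheduled times `t_sched` are unevenly spaced in $t$, while the per-step lengths in `steps` are nearly equal, which is the defining property of the constant-speed schedule.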

3. Fisher–Rao Optimal Schedules in Discretized and Statistical Models

In discretized and information-theoretic settings, the Fisher–Rao–geodesic schedule governs the reparametrization of time for processes on the statistical manifold of probability distributions. For a one-parameter family $q_t(x)$ of discrete distributions, as in masked diffusion models, the Fisher–Rao line element is

$ds^2 = \sum_x \frac{[\partial_t q_t(x)]^2}{q_t(x)} dt^2 = I(t) dt^2,$

where $I(t)$ is the Fisher information in $t$ (Zhang, 6 Aug 2025). The unique Fisher–Rao geodesic schedule $\varphi(s)$ solves

$\sqrt{I(\varphi(s))} \varphi'(s) = C,$

with solution, for $I(t) = N \dot\alpha_t^2 / [\alpha_t (1-\alpha_t)]$ and $\alpha_t$ decreasing monotonically from $\alpha_0 = 1$ to $\alpha_1 = 0$ , given by the closed-form “cosine schedule”

$\alpha_{\varphi(s)} = \cos^2(s\pi / 2),$

with the stepwise discretization $t_i = \varphi(i/T)$ for $i=0, \dotsc, T$ , ensuring each step has equal Fisher–Rao length (Zhang, 6 Aug 2025).
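
The cosine form follows from integrating the line element. A short derivation, assuming $\alpha_t$ decreases from $\alpha_0 = 1$ to $\alpha_1 = 0$ and writing $\Lambda(t)$ for the cumulative Fisher–Rao length (the boundary values and this notation are assumptions made here for concreteness):

```latex
\begin{aligned}
\Lambda(t) &= \int_0^t \sqrt{I(\tau)}\,d\tau
  = \sqrt{N}\int_{\alpha_t}^{1} \frac{d\alpha}{\sqrt{\alpha(1-\alpha)}}
  = 2\sqrt{N}\,\arccos\sqrt{\alpha_t}, \\
\Lambda(1) &= \pi\sqrt{N}, \\
\Lambda(\varphi(s)) &= s\,\Lambda(1)
  \;\Longrightarrow\; \arccos\sqrt{\alpha_{\varphi(s)}} = \frac{s\pi}{2}
  \;\Longrightarrow\; \alpha_{\varphi(s)} = \cos^2\!\left(\frac{s\pi}{2}\right).
\end{aligned}
```

Under these assumptions the constant in the schedule equation is $C = \Lambda(1) = \pi\sqrt{N}$ , and each of the $T$ steps has length $\pi\sqrt{N}/T$ .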

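Alternative heuristics (linear or quadratic schedules in $\alpha$ or $t$ ) are strictly suboptimal. They trace the same path at non-constant Fisher–Rao speed, so by the Cauchy–Schwarz inequality their energy $\int_0^1 I(t)\,dt$ strictly exceeds the squared length attained at constant speed, and their discrete steps have unequal lengths; this appears as an explicit $L > \Lambda(1)$ gap relative to the geodesic length. The Fisher–Rao schedule equalizes information-geometric change per step, yielding the optimal discrete approximation to the continuous Fisher–Rao geodesic.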
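
A small sketch comparing the cosine schedule with a linear-in-$\alpha$ schedule, assuming $N = 1$ and $T = 20$ steps (both values, the discrete energy $T\sum_i d_i^2$ used as the comparison measure, and the helper name `step_lengths` are illustrative assumptions). The per-step length uses the closed form $2\sqrt{N}\,|\arcsin\sqrt{\alpha_i} - \arcsin\sqrt{\alpha_{i+1}}|$ obtained by integrating $\sqrt{I}$ :

```python
import numpy as np

def step_lengths(alpha, N=1.0):
    """Fisher-Rao length of each step between consecutive alpha values."""
    theta = np.arcsin(np.sqrt(alpha))
    return 2.0 * np.sqrt(N) * np.abs(np.diff(theta))

T = 20
s = np.linspace(0.0, 1.0, T + 1)
alpha_cosine = np.cos(0.5 * np.pi * s) ** 2   # constant-speed schedule
alpha_linear = 1.0 - s                        # heuristic schedule, linear in alpha

len_cos = step_lengths(alpha_cosine)
len_lin = step_lengths(alpha_linear)

# Discrete energy T * sum(d_i^2); it equals (total length)^2 only for equal steps.
energy_cos = T * np.sum(len_cos ** 2)
energy_lin = T * np.sum(len_lin ** 2)

print("total length:", len_cos.sum(), len_lin.sum())
print("discrete energy:", energy_cos, energy_lin)
print("largest/smallest linear step:", len_lin.max() / len_lin.min())
```

Both schedules cover the same total length, since reparametrization does not change the path, but only the cosine schedule splits it into equal steps; the linear schedule concentrates length in the first and last steps and has a larger discrete energy.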

4. Generalizations: $L^p$ Fisher–Rao Schedules and Non-commutative Extensions

Extensions to $L^p$ Fisher–Rao geometries use the $p$ -root map to globally linearize geodesic flow on the Fréchet Lie group $\mathrm{Diff}^{-\infty}(\mathbb{R})$ : $\Psi_p(\varphi) = p\,(\varphi'^{1/p} - 1),$ yielding constant-speed geodesics $f(t) = (1-t)f_0 + t f_1$ in $L^p$ , with density interpolant

$\rho_t(x) = (\varphi_t'(x)) \rho_0(\varphi_t(x)).$

For $p=2$ , this recovers the square-root Fisher–Rao geodesic (Lam, 7 Feb 2026). In the non-commutative matrix-valued setting, Fisher–Rao geodesics $G_t$ evolve as

$G_t = G_0^{1/2} (G_0^{-1/2} G_1 G_0^{-1/2})^t G_0^{1/2},$

which has constant speed in the Fisher–Rao metric and is linear in the logarithmic coordinates $\log\left(G_0^{-1/2} G_t G_0^{-1/2}\right)$ . The optimal $\varepsilon$ -entropic schedule in Schrödinger-type problems adds a Fisher–Rao heat flow near the endpoints, proportional to $h(t) = \varepsilon\, \min\{t, 1-t\}$ (Monsaingeon et al., 2020).
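
The matrix geodesic formula can be sketched for symmetric positive-definite matrices as follows. The two $2\times 2$ endpoints, the helper names, and the use of the affine-invariant distance $\|\log(A^{-1/2} B A^{-1/2})\|_F$ as the length measure (assumed here to match the matrix Fisher–Rao distance up to a constant factor) are illustrative assumptions:

```python
import numpy as np

def spd_power(A, p):
    """Real power of a symmetric positive-definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w ** p) @ V.T

def spd_geodesic(G0, G1, t):
    """G_t = G0^{1/2} (G0^{-1/2} G1 G0^{-1/2})^t G0^{1/2}."""
    G0_half = spd_power(G0, 0.5)
    G0_inv_half = spd_power(G0, -0.5)
    middle = G0_inv_half @ G1 @ G0_inv_half
    middle = 0.5 * (middle + middle.T)  # remove rounding asymmetry
    return G0_half @ spd_power(middle, t) @ G0_half

def affine_invariant_distance(A, B):
    """Frobenius norm of log(A^{-1/2} B A^{-1/2})."""
    A_inv_half = spd_power(A, -0.5)
    M = A_inv_half @ B @ A_inv_half
    w = np.linalg.eigvalsh(0.5 * (M + M.T))
    return float(np.sqrt(np.sum(np.log(w) ** 2)))

# Illustrative endpoints (assumed), both symmetric positive definite.
G0 = np.array([[2.0, 0.5], [0.5, 1.0]])
G1 = np.array([[1.0, -0.3], [-0.3, 3.0]])

ts = np.linspace(0.0, 1.0, 6)
pts = [spd_geodesic(G0, G1, t) for t in ts]
steps = [affine_invariant_distance(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
print(steps, affine_invariant_distance(G0, G1))
```

Equal increments in $t$ give equal distances between consecutive matrices, so the identity schedule is already constant-speed along this path.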

5. Schedules in Metamorphosis and Varifold Metrics

The Fisher–Rao–geodesic schedule arises in the LDDMM–Fisher–Rao framework for varifold metamorphosis, where point masses (Diracs) or geometric measures interpolate via weight scaling and drift: $r(t) = \left((1-t)\sqrt{r_0} + t\sqrt{r_1}\right)^2, \quad x(t) = x_0 + (x_1-x_0)t.$ Geodesic schedules are explicitly computable in this framework for both spatial and weight evolution, and the optimal schedule achieves constant-speed motion in the composite Riemannian metric (Hsieh et al., 2021).

6. Numerical Schemes and Implementation

Efficient computation of Fisher–Rao–geodesic optimal schedules relies on proximal-splitting or shooting methods adapted to the metric’s structure. For the WF-δ metric, a staggered-grid discretization enforces the coupled transport–source continuity equation using Douglas–Rachford or least-squares projections, while local proximal operators update the state variables by solving decoupled cell-wise cubic equations (Chizat et al., 2015). Similarly, geodesic shooting in LDDMM-FR metamorphosis integrates the state ODEs forward and the adjoint ODEs backward, then applies gradient-based optimization constrained to constant-speed schedules; practical implementations rely on kernel acceleration or automatic differentiation (Hsieh et al., 2021).

7. Limiting Behavior and Structural Properties

The WF-δ metric interpolates between two limits. As $\delta \to 0$ the source term becomes cheap and, after rescaling by $1/\delta$ , the metric recovers Fisher–Rao (pure source, no transport); as $\delta \to \infty$ mass creation is suppressed and, for measures of equal mass, it recovers quadratic Wasserstein (pure transport, no creation). Via the square-root map, the Fisher–Rao space of probability densities is isometric to the positive orthant of a sphere in $L^2$ ; geodesics are globally unique great-circle arcs, and the geodesic schedule traverses them at constant speed (Bruveris et al., 2016).

These structures ensure stability and completeness of geodesic schedules for optimal information-theoretic transport and for geometric applications in medical imaging, probabilistic modeling, and statistical inference. The closed-form geodesic schedule, in each geometric setting, provides both practical numerical advantages and conceptual clarity for optimal schedule design in generative and statistical models.