Reference-Log-Linear Distances

Updated 11 May 2026
  • Reference-log-linear distances are metrics that locally linearize Riemannian geodesic distances using logarithmic and exponential maps from a fixed reference measure.
  • They enable efficient embedding of complex objects like probability measures and SPD matrices into tangent Hilbert spaces, facilitating scalable computational comparisons.
  • The approach retains first-order geometric information and recovers the metric exactly along geodesics emanating from the reference, making it valuable for optimal transport and data analysis.

Reference-log-linear distances are metrics or pseudo-metrics that leverage the local linearization of a geodesic distance (often Riemannian) around a chosen reference point in a metric space. This construction enables embedding complex objects—such as probability measures or positive-definite matrices—into (pre-)Hilbert spaces, facilitating computationally efficient comparisons while preserving substantial geometric information from the underlying metric. The concept arises prominently in the context of optimal transport-based geometries, especially for the Hellinger–Kantorovich (HK) metric, and is a particular example of tangent space embedding via explicit logarithmic and exponential maps (Cai et al., 2021).

1. Riemannian Metric Structure and Local Linearization

Reference-log-linear distances rely on exploiting the Riemannian metric structure of a space of interest. For the space of non-negative Radon measures on a domain $\Omega \subset \mathbb{R}^d$, the HK metric defines a Riemannian structure via a dynamic (Benamou–Brenier-type) formulation:

$$\mathrm{HK}^2(\mu_0, \mu_1) = \inf_{(\rho_t, \omega_t, \zeta_t)} \int_0^1 \int_\Omega \left[ \|\omega_t/\rho_t\|^2 + \tfrac{1}{4} (\zeta_t/\rho_t)^2 \right] d\rho_t\, dt,$$

subject to the conservation law $\partial_t \rho_t + \operatorname{div}\omega_t = \zeta_t$, with $\rho_0 = \mu_0$, $\rho_1 = \mu_1$.

At any reference measure $\mu_0 \ll$ Lebesgue, the tangent space can be identified with triples $\xi = (v, \alpha, \sqrt{\nu})$, where $v$ is a vector field, $\alpha$ a scalar function, and $\nu$ a singular measure. The inner product is defined as

$$\langle (v, \alpha, \sqrt{\nu}), (w, \beta, \sqrt{\lambda}) \rangle_{\mu_0} = \int_\Omega \left[ \langle v, w \rangle + \tfrac{1}{4}\, \alpha \beta \right] d\mu_0 + \int_\Omega \sqrt{\frac{d\nu}{d\gamma}\,\frac{d\lambda}{d\gamma}}\; d\gamma,$$

where $(w, \beta, \sqrt{\lambda})$ is a second tangent vector and $\gamma$ is any dominating reference for the singular parts (Cai et al., 2021).

This formalism enables the exact evaluation of the squared length of tangent vectors and local linearization of the HK metric around $\mu_0$.
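As an illustration of how such an inner product might be evaluated on discrete data, the following is a minimal sketch. It assumes the weighting suggested by the dynamic formulation (unit weight on the velocity part, 1/4 on the scalar part) and a Hellinger-type pairing of the singular parts with the counting measure on a shared grid as dominating reference. The function name `tangent_inner` and the array layout are hypothetical and not taken from Cai et al. (2021).

```python
import numpy as np

def tangent_inner(v, alpha, nu, w, beta, lam, mu0):
    """Discretized inner product of (v, alpha, sqrt(nu)) and (w, beta, sqrt(lam)).

    v, w        : (n, d) velocity samples on the support of the reference
    alpha, beta : (n,) scalar samples on the support of the reference
    nu, lam     : (m,) non-negative masses of the singular parts on a shared grid
    mu0         : (n,) masses of the reference measure
    """
    transport = np.sum(mu0 * np.sum(v * w, axis=1))   # integral of <v, w> d(mu0)
    growth = 0.25 * np.sum(mu0 * alpha * beta)        # 1/4 integral of alpha*beta d(mu0)
    singular = np.sum(np.sqrt(nu * lam))              # pairing of the singular parts
    return transport + growth + singular
```

Calling `tangent_inner(v, alpha, nu, v, alpha, nu, mu0)` returns the squared length of a single tangent vector under these assumptions.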

2. Logarithmic and Exponential Maps in Measure Spaces

The logarithmic map $\mathrm{Log}_{\mu_0}(\mu_1)$ provides a vector in the tangent space at $\mu_0$ representing the direction and speed of the geodesic from $\mu_0$ to $\mu_1$. Given a fixed reference measure $\mu_0$, for any sample $\mu_1$ the Log map is computed via:

  • Solution of a static soft-marginal Kantorovich problem with cost

$$c(x, y) = \begin{cases} -2 \log \cos(|x - y|) & \text{if } |x - y| < \pi/2, \\ +\infty & \text{otherwise,} \end{cases}$$

  • Derivation of the Monge-form minimizer, that is, an optimal plan induced by a transport map $T$,
  • Decomposition of $\mu_0$, $\mu_1$ with respect to their marginal densities, and
  • Assembly of the tangent fields $(v, \alpha, \sqrt{\nu})$ (see Cai et al., 2021, Proposition 3.9, for details).
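The first step above needs a cost matrix between the supports of the two measures. The sketch below evaluates the HK transport cost $-2\log\cos(|x-y|)$, truncated to $+\infty$ at distance $\pi/2$, which is the form matching the 1/4 weighting in the dynamic formulation. The function name `hk_cost` and the point-cloud layout are illustrative assumptions.

```python
import numpy as np

def hk_cost(x, y):
    """Cost c(x, y) = -2 log cos(|x - y|) if |x - y| < pi/2, +inf otherwise.

    x : (n, d) support points of the reference measure
    y : (m, d) support points of the sample measure
    """
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    cost = np.full(dist.shape, np.inf)      # transport beyond pi/2 is forbidden
    near = dist < np.pi / 2
    cost[near] = -2.0 * np.log(np.cos(dist[near]))
    return cost
```

Entries equal to `np.inf` mark pairs of points between which no mass is transported; that mass is instead handled by the soft marginal terms.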

The exponential map $\mathrm{Exp}_{\mu_0}$ provides an explicit inversion, reconstructing measures from tangent vectors. For geodesics starting at $\mu_0$, the HK distance to the endpoint is given exactly by the norm of this tangent vector.

3. Construction and Properties of the Reference-Log-Linear Distance

The reference-log-linear distance, defined for measures $\mu_1, \mu_2$ relative to a fixed reference $\mu_0$, is:

$$d_{\mu_0}(\mu_1, \mu_2) = \left\| \mathrm{Log}_{\mu_0}(\mu_1) - \mathrm{Log}_{\mu_0}(\mu_2) \right\|_{T_{\mu_0}},$$

where the norm is the one induced by the tangent-space inner product at $\mu_0$. Notably, for $\mu_1, \mu_2$ close to $\mu_0$, this linearized distance provides a first-order approximation to the full metric, $\mathrm{HK}(\mu_1, \mu_2) \approx d_{\mu_0}(\mu_1, \mu_2)$, while for the special case of geodesics from $\mu_0$ it provides the exact metric value: $\mathrm{HK}(\mu_0, \mu) = \|\mathrm{Log}_{\mu_0}(\mu)\|_{T_{\mu_0}} = d_{\mu_0}(\mu_0, \mu)$, since $\mathrm{Log}_{\mu_0}(\mu_0) = 0$.

This construction admits an efficient computational recipe: for $N$ samples, each is embedded through $N$ one-to-reference OT computations, and pairwise comparisons reduce to Euclidean operations in the tangent Hilbert space, substantially reducing computational cost compared to $O(N^2)$ pairwise metric evaluations (Cai et al., 2021).
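Once every sample has been embedded, the pairwise step is ordinary linear algebra. The sketch below assumes the tangent vectors are already available as arrays on the support of the reference, uses weight 1 on velocities and 1/4 on the scalar part, and omits the singular component for brevity. `flatten_tangent` and `pairwise_distances` are hypothetical helper names.

```python
import numpy as np

def flatten_tangent(v, alpha, mu0):
    """Rescale (v, alpha) so the plain Euclidean norm equals the weighted norm.

    v : (n, d) velocities, alpha : (n,) scalar field, mu0 : (n,) reference masses.
    """
    root = np.sqrt(mu0)
    return np.concatenate([(root[:, None] * v).ravel(), 0.5 * root * alpha])

def pairwise_distances(E):
    """E : (N, D) array with one flattened tangent vector per row."""
    sq = np.sum(E * E, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (E @ E.T)
    return np.sqrt(np.maximum(d2, 0.0))   # clip tiny negatives from round-off
```

With the embeddings stacked as the rows of `E`, a single call to `pairwise_distances(E)` yields all linearized distances from one matrix product, without further transport solves.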

4. Algorithmic Implementation and Discrete Setting

For discretely supported measures, the procedure involves:

  • Solving $N$ entropic-regularized unbalanced optimal transport problems (using soft marginals),
  • For each, barycentric projection of transport plans to construct the tangent fields $(v_i, \alpha_i)$,
  • Assembly of the tangent space embeddings,
  • Pairwise distances computed by the Hilbert norm; in practice, this involves summing $\|v_i - v_j\|_{L^2(\mu_0)}^2 + \tfrac{1}{4}\|\alpha_i - \alpha_j\|_{L^2(\mu_0)}^2$ plus the Hellinger term for unmatched mass.
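The first two steps of this procedure can be sketched with the standard scaling iterations for entropic transport with KL-relaxed marginals, followed by a barycentric projection of the resulting plan. This is a generic sketch under stated assumptions, not the implementation of Cai et al. (2021): the parameter names `eps` and `lam`, the fixed iteration count, and both function names are illustrative. Converting the barycentric map into the HK tangent fields requires the explicit formulas of that paper, which are not reproduced here.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.1, lam=1.0, n_iter=500):
    """Scaling iterations for entropic OT with KL-relaxed (soft) marginals.

    a, b : (n,), (m,) positive masses; C : (n, m) cost matrix, np.inf allowed.
    eps  : entropic regularization; lam : weight of the KL marginal penalties.
    """
    K = np.exp(-C / eps)                  # Gibbs kernel; infinite cost gives 0
    power = lam / (lam + eps)
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iter):
        Kv = K @ v
        u = np.where(Kv > 0, a / np.maximum(Kv, 1e-300), 0.0) ** power
        Ktu = K.T @ u
        v = np.where(Ktu > 0, b / np.maximum(Ktu, 1e-300), 0.0) ** power
    return u[:, None] * K * v[None, :]    # transport plan

def barycentric_map(plan, x, y):
    """Average target location of the mass leaving each reference point x_i."""
    row_mass = plan.sum(axis=1)
    T = np.array(x, dtype=float)          # unmatched points keep their position
    matched = row_mass > 0
    T[matched] = (plan[matched] @ y) / row_mass[matched][:, None]
    return T, row_mass
```

The returned `row_mass` is the first marginal of the plan; comparing it with `a` indicates how much mass was created or destroyed rather than transported.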

The embedding enables subsequent data analysis—such as PCA, clustering, or SVM—in the tangent Hilbert space at the reference. This approach offers an efficient, scalable surrogate to the HK geometry for applications requiring the analysis of large numbers of empirical measures.
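For instance, PCA in the tangent space reduces to a singular value decomposition of the centered embedding matrix. The sketch below assumes each row of `E` is a flattened tangent vector already rescaled so that the Euclidean inner product matches the tangent-space inner product. The function name `tangent_pca` is hypothetical.

```python
import numpy as np

def tangent_pca(E, k):
    """PCA of tangent-space embeddings.

    E : (N, D) array of embedded samples; k : number of components to keep.
    Returns the mean, the top-k principal directions, the per-sample scores,
    and the variance explained by each direction.
    """
    mean = E.mean(axis=0)
    U, S, Vt = np.linalg.svd(E - mean, full_matrices=False)
    scores = U[:, :k] * S[:k]
    explained = S[:k] ** 2 / (len(E) - 1)
    return mean, Vt[:k], scores, explained
```

A principal direction can be visualized by mapping `mean + t * component` back to a measure with the exponential map, for a range of values of `t`.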

5. Analytical and Practical Significance

The reference-log-linear approach preserves sensitivity to the unbalanced optimal transport geometry while providing access to the computational and analytical toolkit of linear spaces. The method is exact on geodesics from the reference and provides a first-order approximation near the reference, making it particularly suitable for tasks where most measures are expected to be close in HK distance to the reference.

By reducing metric structure to inner products, it enables the use of a broad suite of machine learning and data analysis methods traditionally restricted to Euclidean settings, while maintaining an interpretable connection to the original HK metric (Cai et al., 2021). This approach also avoids a quadratic number of full metric evaluations: only one transport problem per sample is solved, and the remaining pairwise comparisons are inexpensive Hilbert-space operations.

6. Connections to Other Linearization and Spectral Approaches

Reference-log-linear embeddings are part of a broader class of linearization techniques for geometrically complex distance spaces. Similar spectral or tangent-space techniques appear, for instance, in the log-Euclidean signatures framework for SPD matrices. There, the log-Euclidean (LE) metric is linearized by considering differences in log-spectra, and a dataset is embedded via distances to a fixed collection of reference matrices (Shnitzer et al., 2022). In both settings, the critical insight is that carefully chosen reference-based embeddings preserve local geometry and metric information to first order while enabling highly efficient downstream analysis.
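To make the SPD analogy concrete, the following sketch computes the standard log-Euclidean distance, $\|\log A - \log B\|_F$, via an eigendecomposition. It illustrates only the underlying LE metric, not the signature construction of Shnitzer et al. (2022). The function names are illustrative.

```python
import numpy as np

def spd_log(A):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, Q = np.linalg.eigh(A)          # A = Q diag(w) Q^T with w > 0
    return (Q * np.log(w)) @ Q.T      # Q diag(log w) Q^T

def log_euclidean_distance(A, B):
    """Frobenius distance between matrix logarithms."""
    return np.linalg.norm(spd_log(A) - spd_log(B), ord="fro")
```

Here the matrix logarithm plays the role of the Log map at the identity matrix, so comparing SPD matrices again reduces to a Euclidean computation on their images.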

A plausible implication is that such log-linearization schemes can be systematically generalized to other Riemannian metric spaces of interest in data science and geometry, wherever explicit log and exp maps and inner products in tangent spaces are accessible.
