
Signature-Induced Distances

Updated 22 October 2025
  • Signature-induced distances are metrics defined from algebraic, combinatorial, and geometric signatures, enabling invariant comparison of paths, graphs, and datasets.
  • They employ constructions like rough path signatures, DTM-signatures, and spectral signatures to capture intrinsic and extrinsic geometric information with rigorous stability and error bounds.
  • Applications span manifold learning, graph theory, statistical testing, and quantum geometry, offering both theoretical insight and efficient computational schemes.

Signature-induced distances encompass a rich class of metrics and pseudo-metrics defined via algebraic, combinatorial, and geometric "signatures" associated with mathematical objects such as paths, graphs, metric-measure spaces, and manifolds. These distances arise from abstract invariants—most prominently iterated integrals (the rough path signature), spectral signatures of matrices, and nonlinear functionals—providing deep connections between algebraic encoding and metric geometry. Signature-induced distances have powerful implications for data science, manifold learning, spectral graph theory, quantum geometry, and statistical inference, offering both theoretical guarantees and efficient computational schemes.

1. Fundamental Definition and Mathematical Construction

Signature-induced distances refer to distances (or pseudo-distances) between objects (paths, graphs, spaces, datasets) computed from their algebraic or spectral signatures, rather than directly from their raw elements or embeddings. The rough path signature S(γ) of a path γ:[0,T]→ℝᵈ, for example, is the sequence

S(\gamma) = \left(1,\, S^1(\gamma),\, S^2(\gamma),\, S^3(\gamma),\, \ldots\right)

with

S^k(\gamma) = \int_{0 < t_1 < \cdots < t_k < T} d\gamma_{t_1} \otimes \cdots \otimes d\gamma_{t_k}

forming the basic algebraic building block. Distance between paths may be defined by comparing signatures—e.g., ρ_{Sig}(x, y) = ‖Sig(x) − Sig(y)‖—which, under suitable augmentations, defines a true metric up to tree-like equivalence.
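
As a concrete illustration, the sketch below computes a level-2 truncation of the signature for piecewise-linear paths and the induced distance ρ_Sig. It is a minimal numpy approximation, not the full tensor-algebra construction; dedicated packages such as esig or iisignature compute deeper truncations.

```python
import numpy as np

def truncated_signature(path):
    """Levels 1 and 2 of the path signature of a piecewise-linear path.

    path : (N, d) array of sample points gamma(t_0), ..., gamma(t_{N-1}).
    Level 1 is the total increment; level 2 collects the second-order
    iterated integrals S^2_{ij} = int (gamma_i - gamma_i(t_0)) dgamma_j,
    approximated here by a left-point Riemann sum.
    """
    inc = np.diff(path, axis=0)              # step increments, shape (N-1, d)
    s1 = inc.sum(axis=0)                     # level 1: total increment
    s2 = (path[:-1] - path[0]).T @ inc       # level 2: (d, d) iterated integrals
    return np.concatenate([s1, s2.ravel()])

def signature_distance(path_x, path_y):
    """rho_Sig(x, y) = || Sig(x) - Sig(y) || on the level-2 truncation."""
    return np.linalg.norm(truncated_signature(path_x) - truncated_signature(path_y))

# toy usage: a loop and a noisy copy of it
t = np.linspace(0.0, 1.0, 200)
x = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)], axis=1)
y = x + 0.05 * np.random.default_rng(0).normal(size=x.shape)
print(signature_distance(x, y))
```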

In metric-measure spaces, the DTM-signature (Brécheteau, 2017) is constructed from the "distance-to-the-measure" function φ_{μ,m}(x) and its law, yielding an isometry-invariant probability measure. The L₁-Wasserstein distance between DTM-signatures of two spaces,

W_1\left(\mu_{DTM,m},\, \nu_{DTM,m}\right)

serves as a pseudo-metric, upper-bounded by the Gromov-Wasserstein distance.
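
A rough empirical sketch of this construction follows. It is not the exact estimator of Brécheteau (2017): the mass parameter, sample sizes, and the nearest-neighbour DTM estimator below are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import wasserstein_distance

def dtm_signature(points, m=0.05):
    """Empirical distance-to-measure values phi_{mu,m} evaluated at the sample points.

    points : (n, d) array; m : mass parameter in (0, 1].
    phi(x) is the root mean square of the distances from x to its
    k = ceil(m * n) nearest sample points.
    """
    n = len(points)
    k = max(1, int(np.ceil(m * n)))
    dists, _ = cKDTree(points).query(points, k=k)        # distances to the k nearest neighbours
    dists = np.asarray(dists).reshape(n, -1)
    return np.sqrt((dists ** 2).mean(axis=1))

# pseudo-metric: L1-Wasserstein distance between two DTM-signatures (1-D empirical laws)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
Y = 1.5 * rng.normal(size=(500, 3))                      # dilated point cloud
print(wasserstein_distance(dtm_signature(X), dtm_signature(Y)))
```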

Graph-theoretic signatures include spectra of signed distance matrices, distance Laplacians, or combinatorial invariants—encoding both geometric and sign structure—e.g., as in (K et al., 2020, Roy et al., 2020, Li et al., 2022).

For datasets, intrinsic geometry is captured by spectral signatures (eigenvalues) of diffusion operators. The log-Euclidean signature (LES) distance (Shnitzer et al., 2022) compares datasets via Euclidean distance between ordered log-eigenvalues of SPD matrices,

d_{LES}^2 = \sum_{i=1}^K \left[\log\left(\hat{\lambda}^{(1)}_i + \gamma\right) - \log\left(\hat{\lambda}^{(2)}_i + \gamma\right)\right]^2

with regularization γ and truncation K.
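
The computation reduces to three steps: build an SPD kernel/diffusion matrix per dataset, extract its leading eigenvalues, and take the Euclidean distance between their regularized logarithms. The bandwidth heuristic, K, and γ below are illustrative choices, not those of Shnitzer et al. (2022).

```python
import numpy as np
from scipy.spatial.distance import cdist

def diffusion_eigenvalues(X, K=20, eps=None):
    """Leading K eigenvalues of a symmetrically normalized Gaussian-kernel matrix built on X."""
    D2 = cdist(X, X, metric="sqeuclidean")
    if eps is None:
        eps = np.median(D2)                     # simple bandwidth heuristic
    W = np.exp(-D2 / eps)                       # SPD Gaussian kernel matrix
    d = W.sum(axis=1)
    A = W / np.sqrt(np.outer(d, d))             # diffusion-operator analogue
    return np.linalg.eigvalsh(A)[::-1][:K]      # eigenvalues in decreasing order

def les_distance(X1, X2, K=20, gamma=1e-8):
    """Log-Euclidean signature distance between two (possibly unaligned) datasets."""
    l1, l2 = diffusion_eigenvalues(X1, K), diffusion_eigenvalues(X2, K)
    return np.sqrt(np.sum((np.log(l1 + gamma) - np.log(l2 + gamma)) ** 2))

rng = np.random.default_rng(1)
A = rng.normal(size=(300, 5))
B = rng.normal(size=(400, 7))                   # different sample size and feature count
print(les_distance(A, B))
```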

2. Stability, Quantitative Bounds, and Error Estimates

Signature-induced distances admit rigorous stability analyses and quantitative error estimates that elucidate the fidelity of reconstruction and robustness of the metric relation:

  • Path reconstruction error (Lyons et al., 2014): For γ and γ̃, parametrized at constant speed on the unit interval,

\left|\dot{\gamma}(u) - \tilde{L}\,\theta_j\right| < C\,\eta_k \quad \text{for } u \text{ in the } j\text{-th segment},

with

\eta_k = \delta(3\epsilon_k) + \frac{L}{\sqrt{k}}, \qquad \epsilon_k = \sqrt{2}\left(\sqrt{\delta(1/k)/L} + \frac{1}{\sqrt{k}}\right)

where δ is the modulus of continuity of the derivative of γ and L is the total path length. Higher signature levels (larger k) tighten the error bound (a numerical illustration appears at the end of this section).

  • DTM-signature lower bounds (Brécheteau, 2017): Under dilatation, for (Y, γ, ν) the λ-scaling of (X, δ, μ),

W_1\left(\mu_{DTM,m},\, \nu_{DTM,m}\right) = |1 - \lambda| \cdot \mathbb{E}_{\mu}\left[\phi_{\mu,m}(X)\right]

  • Least-eigenvalue bounds for signed distance matrices (Li et al., 2022):

\max\left\{ \lambda_n\left(D^{\max}(\Sigma)\right),\, \lambda_n\left(D^{\min}(\Sigma)\right) \right\} \leq -2 \quad (d = 2)

  • Power guarantees for DTM-based two-sample tests (Brécheteau, 2017):

\mathrm{Power} \geq 1 - 4\exp\left( -\frac{W_1(DTM_P, DTM_Q)^2}{c\, n}\right)

These results show that signature-induced distances not only encode structural information but also quantitatively guarantee metric proximity under suitable regularity and truncation assumptions, relating error levels directly to signature depth or sample size.
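
As a quick numerical illustration of the reconstruction bound above, the following evaluates ε_k and η_k for increasing truncation levels k. The Lipschitz modulus δ(ε) = Mε and the constants M and L are made up for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical smoothness data: delta(eps) = M * eps (Lipschitz derivative), total length L.
M, L = 2.0, 1.0
delta = lambda eps: M * eps

for k in (10, 100, 1_000, 10_000):
    eps_k = np.sqrt(2) * (np.sqrt(delta(1.0 / k) / L) + 1.0 / np.sqrt(k))
    eta_k = delta(3.0 * eps_k) + L / np.sqrt(k)
    print(f"k = {k:6d}   eps_k = {eps_k:.4f}   eta_k = {eta_k:.4f}")
```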

3. Symmetrization, Algebraic Structure, and Theoretical Justifications

Many approaches rely on symmetrization and combinatorial averaging to extract robust, scale-invariant, and sign-invariant metrics from potentially oscillatory or noncommutative signature data:

  • Symmetrization in path inversion (Lyons et al., 2014): High-level signature coefficients are symmetrized over all words with fixed coordinate counts, yielding sums,

S^n(\ell) = n! \sum_{w : |w| = n,\ |w|_x = \ell} C(w) = \binom{n}{\ell} (\Delta x)^{\ell} (\Delta y)^{n - \ell}

As n increases, such symmetrization "averages out" small-scale fluctuations and isolates large-scale geometric directions, allowing robust metric reconstruction (a numerical check of the identity above follows this list).

  • Injectivity and invariance in DTM-signature (Brécheteau, 2017): The map μ ↦ Law(φ_{μ,m}(X)) is injective under geometric conditions, making the induced pseudo-metric discriminative and stable.
  • Balance and spectral criteria in signed graphs (K et al., 2020, Roy et al., 2020): Spectral properties of signature-induced (distance Laplacian) matrices characterize balance and compatibility, with zero determinant or cospectrality to underlying unsigned matrices encoding equilibrium or neutrality in sign structure.
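
The binomial identity above can be checked numerically. The sketch below approximates signature coefficients C(w) of a smooth 2-D path by nested left-point Riemann sums (the test path, grid size, and chosen n, ℓ are illustrative) and compares the symmetrized sum against the closed form.

```python
import numpy as np
from itertools import product
from math import comb, factorial

def sig_coeff(path, word):
    """Signature coefficient C(w): the iterated integral indexed by `word`,
    approximated by nested left-point Riemann sums along the sampled path."""
    inc = np.diff(path, axis=0)                        # step increments, shape (N-1, d)
    f = np.ones(path.shape[0])                         # running integral of the empty word
    for letter in word:
        f = np.concatenate(([0.0], np.cumsum(f[:-1] * inc[:, letter])))
    return f[-1]

def symmetrized_sum(path, n, ell):
    """n! * sum of C(w) over all length-n words with exactly `ell` letters equal to 0 (the x-axis)."""
    total = sum(sig_coeff(path, w) for w in product((0, 1), repeat=n) if w.count(0) == ell)
    return factorial(n) * total

t = np.linspace(0.0, 1.0, 2001)
path = np.stack([np.sin(3 * t) + t, np.cos(2 * t) + t ** 2], axis=1)   # smooth 2-D test path
dx, dy = path[-1] - path[0]

n, ell = 4, 2
print(symmetrized_sum(path, n, ell))                  # approximately equal to the closed form below
print(comb(n, ell) * dx ** ell * dy ** (n - ell))
```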

4. Applications in Data Science, Geometry, and Quantum Theory

Signature-induced distances play a crucial role in real-world regression/classification for path-valued data, geometric shape comparison, spectral graph classification, and quantum geometric analysis:

  • Functional Nadaraya–Watson regression on path spaces (Bayer et al., 19 Oct 2025): The signature transform induces a pseudo-metric that makes classical kernel regression efficient and scalable for infinite-dimensional time series and stochastic differential equation (SDE) learning. Robust signature normalizations address outlier-induced instability (a minimal sketch of such a regressor follows this list).
  • Dataset comparison via LES (Shnitzer et al., 2022): LES reliably distinguishes unaligned datasets of varying size, feature number, and modality by comparing spectral signatures of diffusion operators. Applications range from RNA sequencing temporal trajectory analysis to neural network architecture comparison.
  • Statistical testing and shape comparison (Brécheteau, 2017): DTM-signatures enable invariant, nonparametric, and asymptotically valid statistical tests for equality of metric-measure spaces, with rigorous power guarantees and bootstrap-based critical value estimation.
  • Quantum geometry and promptness (Piazza et al., 2022, Brahma et al., 2021): In quantum superpositions of spacetime, signature-induced distances via average squared geodesics become non-additive, with the bi-local quantity C(x,y) diagnosing sub-/superadditivity, thus altering causal structure and "promptness" of signals relative to classical expectations.
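
To illustrate the first bullet above, a Nadaraya–Watson regressor over the signature-induced pseudo-metric can be sketched as follows. This is a minimal illustration, not the construction of Bayer et al.: the level-2 truncation, Gaussian kernel, bandwidth, and toy data are all assumptions.

```python
import numpy as np

def truncated_signature(path):
    """Levels 1-2 of the path signature (left-point Riemann approximation), flattened."""
    inc = np.diff(path, axis=0)
    return np.concatenate([inc.sum(axis=0), ((path[:-1] - path[0]).T @ inc).ravel()])

def nw_signature_regression(train_paths, train_y, query_path, bandwidth=0.2):
    """Nadaraya-Watson estimate with a Gaussian kernel on the signature pseudo-metric rho_Sig."""
    sig_q = truncated_signature(query_path)
    dists = np.array([np.linalg.norm(truncated_signature(p) - sig_q) for p in train_paths])
    weights = np.exp(-(dists / bandwidth) ** 2)
    return float(weights @ np.asarray(train_y) / weights.sum())

# toy example: recover the amplitude of a noisy half-sine path from its signature
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 100)
make_path = lambda a: np.stack([t, a * np.sin(np.pi * t) + 0.02 * rng.normal(size=t.size)], axis=1)

amps = rng.uniform(0.5, 2.0, size=200)
paths = [make_path(a) for a in amps]
print(nw_signature_regression(paths, amps, make_path(1.3)))   # close to the query amplitude 1.3
```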

5. Spectral Signatures, Product Structures, and Network Analysis

Spectral analysis and algebraic manipulation of signature-induced distance matrices offer strong tools for graph/network characterization:

  • Distance spectrum and extremal bounds (Li et al., 2022): Sharp bounds on least eigenvalues of signed distance matrices help identify extremal graph structures and inform community detection and robustness criteria in network theory (an illustrative spectral computation follows this list).
  • Product graph compatibility and Kronecker formulas (Shijin et al., 2020): Compatibility of signature-induced distances is preserved under Cartesian, lexicographic, and tensor products of signed graphs only under specific conditions; explicit formulas for resulting distance matrices via Kronecker products enable scalable spectral analysis for large network models.
  • Block matrix structure and design theory (Ma, 2016): The block decomposition of incidence matrices in design theory provides explicit eigenvalue-counting formulas for signature-induced distances in combinatorial incidence graphs, linking symmetry properties of designs to spectral and metric structure.
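
For a flavor of such spectral computations, the sketch below assembles the ordinary distance matrix and distance Laplacian of a small unsigned graph and reports their spectra. The signed variants D^max(Σ) and D^min(Σ) of the cited papers additionally carry edge-sign information, which this simplified example does not reproduce.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

# Small unsigned example: a 6-cycle. Signed distance matrices would additionally
# attach edge-sign information to the entries (not reproduced here).
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1          # adjacency of the cycle C_6

D = shortest_path(A, method="D", unweighted=True)       # all-pairs graph distances
DL = np.diag(D.sum(axis=1)) - D                          # distance Laplacian: Tr(G) - D

print("least distance eigenvalue:", np.linalg.eigvalsh(D)[0])
print("distance Laplacian spectrum:", np.round(np.linalg.eigvalsh(DL), 3))
```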

6. Geometric Inference, Curvature, and Manifold Learning

Signature-induced distances encode not only metric but also curvature information in geometric and stochastic contexts:

  • Expected signature on Riemannian manifolds (Geng et al., 18 Jul 2024): High-level asymptotics of the normalized expected signature of a Brownian bridge yield explicit reconstruction of the Riemannian distance function,

\lim_{n \to \infty} \left( n!\, \pi_n \psi(t_n, x, y) \right)^{1/n} = d(x, y)

while the fourth-level expansion and contraction,

C\psi_4(t, x) = \Theta_x\, t + \Xi_x\, t^2 + O(t^{5/2})

yield a direct measurement of the metric tensor (intrinsic) and of curvature (extrinsic, via the Ricci tensor and second fundamental form) at x.
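
The flavor of the first limit can be seen in a toy Euclidean check. This replaces the Brownian-bridge expected signature of the theorem with the signature of the straight-line path from x to y, for which π_n S = (y − x)^⊗n / n!; it is only an elementary sanity computation, not the manifold result.

```python
import numpy as np
from math import factorial

x = np.array([0.0, 0.0, 0.0])
y = np.array([1.0, 2.0, 2.0])
v = y - x                                              # Euclidean distance d(x, y) = |v| = 3

T = np.array(1.0)
for n in range(1, 8):
    T = np.multiply.outer(T, v)                        # v^{tensor n}, shape (3,) * n
    level = T / factorial(n)                           # pi_n S of the straight-line path
    print(n, (factorial(n) * np.linalg.norm(level)) ** (1.0 / n))   # recovers |y - x| = 3.0
```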

7. Summary and Synthesis

Signature-induced distances unify algebraic, combinatorial, and geometric perspectives on metric structure, providing both isometry-invariant and scale-invariant metrics suitable for path, graph, geometric, and dataset comparison. Combining injectivity, stability, spectral sharpness, and statistical convergence properties, these distances serve as foundational tools across rough path theory, optimal transport, manifold learning, statistical hypothesis testing, and quantum geometry. Their properties enable both rigorous theoretical insights and scalable computational algorithms, with broad implications for data science, physical geometry, and network analysis.
