Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds
Published 30 Mar 2026 in cs.LG, cs.AI, math.DG, and q-bio.NC | (2603.28764v1)
Abstract: Similarity measures are widely used to interpret the representational geometries used by neural networks to solve tasks. Yet, because existing methods compare the extrinsic geometry of representations in state space, rather than their intrinsic geometry, they may fail to capture subtle yet crucial distinctions between fundamentally different neural network solutions. Here, we introduce metric similarity analysis (MSA), a novel method which leverages tools from Riemannian geometry to compare the intrinsic geometry of neural representations under the manifold hypothesis. We show that MSA can be used to i) disentangle features of neural computations in deep networks with different learning regimes, ii) compare nonlinear dynamics, and iii) investigate diffusion models. Hence, we introduce a mathematically grounded and broadly applicable framework to understand the mechanisms behind neural computations by comparing their intrinsic geometries.
The paper introduces Metric Similarity Analysis (MSA) as a novel method to compare intrinsic neural geometries using Riemannian metrics derived from network Jacobians.
It employs the spectral ratio metric to differentiate between rich and lazy learning regimes, effectively capturing functional disparities beyond extrinsic similarities.
MSA extends to dynamical systems and statistical manifolds, offering a unified framework for analyzing RNNs, SSMs, and diffusion models.
Introduction
This paper introduces Metric Similarity Analysis (MSA), a framework leveraging Riemannian geometry to compare intrinsic neural representations under the manifold hypothesis. Existing similarity metrics, such as CKA, CCA, RSA, and Procrustes, operate on extrinsic geometry—measuring how representations are embedded in state space. This approach fails to distinguish neural network solutions that are functionally distinct yet share similar embeddings, or that are intrinsically similar but extrinsically dissimilar. MSA resolves this limitation by quantifying the similarity of intrinsic geometries—specifically, the Riemannian metrics induced on the input manifold by neural network transformations.
Figure 1: Demonstration of how intrinsic geometry can remain invariant under extrinsic transformations, motivating geometry-aware analysis.
Mathematical Framework
MSA considers a neural network as a map from an input manifold M—an intrinsically low-dimensional structure postulated by the manifold hypothesis—into a high-dimensional activation space. The intrinsic geometry of representations is captured via the pullback metric, defined through the Jacobian of the mapping. At each point p∈M, the pullback metric G(p)=J(p)⊤J(p) (with J(p) the Jacobian at p) is a symmetric positive definite (SPD) matrix. This matrix encodes the local geometry of the representation manifold.
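The pullback metric is straightforward to compute with automatic differentiation. Below is a minimal sketch, assuming a toy two-layer network and a simple cylinder chart as illustrative stand-ins for the paper's actual architectures and manifolds:

```python
import torch

# Minimal sketch: pullback metric G(p) = J(p)^T J(p), with the Jacobian taken
# w.r.t. intrinsic coordinates p on the input manifold. The network `phi` and
# the chart `embed` are illustrative assumptions, not the paper's setup.

torch.manual_seed(0)
phi = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, 16)
)

def embed(p: torch.Tensor) -> torch.Tensor:
    """Chart for a 2D manifold in R^3 (here: a cylinder patch)."""
    return torch.stack([torch.cos(p[0]), torch.sin(p[0]), p[1]])

def pullback_metric(p: torch.Tensor) -> torch.Tensor:
    """G(p) = J(p)^T J(p), with J = d(phi . embed)/dp, an SPD m x m matrix."""
    J = torch.autograd.functional.jacobian(lambda q: phi(embed(q)), p)
    return J.T @ J

p = torch.tensor([0.3, -1.2])
G = pullback_metric(p)
print(G.shape)  # (2, 2): one SPD matrix per point on the 2D input manifold
```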
Comparing intrinsic geometries between two networks φ1,φ2 requires a distance on the space of SPD matrices. The spectral ratio (SR) metric is proposed for this purpose:
Figure 2: Visualization of the pullback metric, which encodes how local geometry is shaped by the neural representation.
The SR is defined in terms of generalized eigenvalues and is a bounded pseudo-distance:
dSR(G, G′) = 1 − λm/λ1

where λ1 ≥ ⋯ ≥ λm > 0 are the ordered generalized eigenvalues of (G, G′). The SR is invariant to invertible transformations and satisfies symmetry and the triangle inequality, establishing it as a meaningful pseudo-metric on SPD matrices.
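A short sketch of this computation, assuming the reconstruction dSR = 1 − λm/λ1 above and using SciPy's generalized eigenvalue solver:

```python
import numpy as np
from scipy.linalg import eigh

# Sketch of the spectral ratio d_SR(G, G') = 1 - lambda_m / lambda_1, computed
# from the generalized eigenvalues of (G, G'). The exact closed form is
# reconstructed from the text, so treat it as an assumption.

def spectral_ratio(G: np.ndarray, G2: np.ndarray) -> float:
    # eigh(G, G2, ...) solves G v = lambda G2 v; eigenvalues come back ascending.
    lam = eigh(G, G2, eigvals_only=True)
    return 1.0 - lam[0] / lam[-1]  # 1 - lambda_min / lambda_max, in [0, 1)

# Sanity checks on random SPD matrices:
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); G = A @ A.T + 1e-3 * np.eye(4)
B = rng.standard_normal((4, 4)); H = B @ B.T + 1e-3 * np.eye(4)
assert abs(spectral_ratio(G, G)) < 1e-8            # d(G, G) = 0
assert abs(spectral_ratio(G, 3.0 * G)) < 1e-8      # pseudo-metric: scale-invariant
assert abs(spectral_ratio(G, H) - spectral_ratio(H, G)) < 1e-8  # symmetry
```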
Figure 3: The spectral ratio defines a geometry-aware similarity between SPD matrices, illustrated on 2×2 SPD cones.
MSA is then defined as the average spectral ratio integrated over the input manifold:
dMSA(φ1, φ2) = (1 / Vol(M)) ∫M dSR(G1(p), G2(p)) dVol(p)

where Gi(p) is the pullback metric associated with φi at p ∈ M. This definition yields a distance on the space of Riemannian metrics, reflecting functional similarity between representations independent of embedding.
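In practice the integral can be approximated by Monte-Carlo sampling over the manifold. A sketch, where `metric1`, `metric2`, and the point sampler are user-supplied stand-ins (e.g. the `pullback_metric` helper sketched above):

```python
import numpy as np
from scipy.linalg import eigh

# Sketch: Monte-Carlo approximation of the MSA distance, averaging the
# spectral ratio over sampled manifold points. `metric1`/`metric2` map a
# coordinate vector p to an SPD pullback-metric matrix; `sample_point` is a
# hypothetical sampler, e.g. a uniform draw from the chart's domain.

def msa_distance(metric1, metric2, sample_point, n_samples=256):
    vals = []
    for _ in range(n_samples):
        p = sample_point()
        lam = eigh(metric1(p), metric2(p), eigvals_only=True)  # ascending
        vals.append(1.0 - lam[0] / lam[-1])                    # d_SR at p
    return float(np.mean(vals))

# Toy example: two metric fields on a 2D manifold differing in anisotropy.
A = np.diag([1.0, 1.0]); B = np.diag([1.0, 4.0])
d = msa_distance(lambda p: A, lambda p: B,
                 lambda: np.random.uniform(-1, 1, size=2))
print(d)  # 0.75: the two fields disagree identically at every point
```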
Figure 4: MSA applied to two networks mapping the same manifold, yet producing distinct intrinsic geometries.
Empirical Results
Disentangling Rich and Lazy Learning Regimes
MSA distinguishes between 'rich' and 'lazy' learning regimes, distinct computational solutions that conventional extrinsic metrics fail to differentiate. In one-hidden-layer networks trained for classification on a 2D manifold, PCA visualizations suggest high embedding similarity. However, MSA detects major intrinsic differences: Procrustes analysis yields near-maximal similarity between models with qualitatively distinct inductive biases, while MSA reports substantially lower similarity (Fig. 5d), in line with theoretical expectations about representational diversity.
Figure 5: Contrasting the learned intrinsic geometries for rich and lazy networks, MSA clearly discriminates between them.
Hierarchically varying tasks, initializations, and architectures shows that MSA’s similarity assignments robustly track task-induced functional structure, a level of discriminability that metrics such as RSA and CCA fail to achieve.
Figure 6: MSA captures functionally meaningful hierarchical structure across a spectrum of models, outperforming classical alternatives.
Analysis of Dynamics: Recurrent and State Space Models
MSA naturally extends to nonlinear dynamical systems, allowing joint assessment of geometry and temporal evolution. Analysis of RNNs and structured state-space models (SSMs) trained on a memory task shows MSA clustering models by architecture and learning regime, capturing both geometric and dynamical distinctions. Moreover, MSA similarity dynamics across time can track transitions in representational geometry—information inaccessible to geometry-only (RSA) or dynamics-only (DSA) approaches.
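The paper's exact temporal construction is not detailed here; one plausible reading, sketched below, tracks the pullback metric of the flow map from the initial hidden state to the state at each time step via accumulated step Jacobians. The `torch.nn.RNNCell`, the dimensions, and this particular construction are illustrative assumptions:

```python
import torch

# Hedged sketch: per-timestep pullback metrics along an RNN trajectory,
# accumulating step Jacobians so that J_t = d h_t / d h_0 and G_t = J_t^T J_t.
# This temporal construction is an illustrative reading, not the paper's spec.

torch.manual_seed(0)
cell = torch.nn.RNNCell(input_size=4, hidden_size=8)
xs = torch.randn(10, 4)                    # a length-10 input sequence
h = torch.zeros(8)

J = torch.eye(8)                           # accumulated Jacobian d h_t / d h_0
metrics = []
for x in xs:
    step = lambda hh: cell(x.unsqueeze(0), hh.unsqueeze(0)).squeeze(0)
    Jt = torch.autograd.functional.jacobian(step, h)   # one-step Jacobian
    J = Jt @ J
    h = step(h)
    metrics.append(J.T @ J)                # metric at time t, an 8 x 8 matrix
```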
Figure 7: MSA partitions recurrent (RNN) and structured (SSM) models by both architecture and training, integrating spatial and temporal metrics.
Diffusion Models and Statistical Manifolds
MSA generalizes to probabilistic latent-space analysis by unifying the pullback and Fisher-Rao metrics on statistical manifolds. In the context of diffusion models (e.g., Stable Diffusion XL), MSA quantifies how conditioning and guidance affect latent geometry across diffusion time. The method reveals that increasing classifier-free guidance modulates the similarity between the statistical manifolds of guided and unguided models, exposing behaviors such as latent collapse and geometric divergence.
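To make the statistical-manifold construction concrete, consider the simplest case of a network parameterizing a Gaussian with fixed isotropic covariance, where the Fisher-Rao metric pulls back through the mean map. The Gaussian family and fixed sigma below are simplifying assumptions, not the paper's diffusion-model setup:

```python
import torch

# Hedged sketch: if a model maps latent coordinates p to N(mu(p), sigma^2 I),
# the Fisher-Rao metric pulls back to G(p) = J_mu(p)^T J_mu(p) / sigma^2.
# `mu_net` and `sigma` are illustrative stand-ins.

torch.manual_seed(0)
mu_net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.SiLU(),
                             torch.nn.Linear(32, 8))
sigma = 0.5

def fisher_rao_pullback(p: torch.Tensor) -> torch.Tensor:
    J = torch.autograd.functional.jacobian(mu_net, p)  # (8, 2) mean-map Jacobian
    return (J.T @ J) / sigma**2                        # (2, 2) SPD matrix

G = fisher_rao_pullback(torch.tensor([0.1, -0.4]))
```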
Figure 8: MSA reveals evolution and variation of the latent statistical manifold in diffusion models conditioned on guidance.
Theoretical Properties
MSA is invariant under orthogonal transformations of the state space, ensuring fair comparison regardless of equivalent reparameterizations. It also possesses invariance to local coordinate choices on the input manifold, as shown through formal propositions. This coordinate and rotation invariance aligns with core Riemannian geometric principles and distinguishes MSA from embedding-based approaches sensitive to arbitrary choices of basis.
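The orthogonal-invariance property follows directly from the definition of the pullback metric, and is easy to verify numerically:

```python
import numpy as np

# Quick numerical check of the orthogonal-invariance claim: composing a map
# with an orthogonal transform Q of state space leaves G = J^T J unchanged,
# since (QJ)^T (QJ) = J^T Q^T Q J = J^T J.

rng = np.random.default_rng(0)
J = rng.standard_normal((16, 3))              # Jacobian of some map at a point
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))  # random orthogonal matrix

G = J.T @ J
G_rot = (Q @ J).T @ (Q @ J)
print(np.allclose(G, G_rot))                  # True: the metric is invariant
```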
Discussion and Implications
MSA provides a mathematically rigorous framework for meaningful, geometry-aware comparison of neural representations. Its ability to resolve differences between functionally divergent networks—beyond what is captured by embedding geometry—clarifies ambiguities inherent in the mechanistic interpretability of neural computations. Practically, MSA offers a principled tool for addressing major challenges in model selection, network auditing, and cross-architecture analysis.
Limitations of MSA include its reliance on an explicit parameterization of the input manifold and the absence of any direct account of downstream function, which makes application to under-sampled or opaque data domains non-trivial. Future directions include integrating MSA with data-driven manifold learning and extending its principles to guide model modification toward specific geometric properties.
Conclusion
By leveraging Riemannian structure, MSA enables the comparison of neural representations at the level of their intrinsic computation, independent of extrinsic embedding. This approach outperforms standard similarity measures in differentiating between representations with distinct functional properties and extends naturally to RNNs, SSMs, and diffusion models, positioning MSA as a foundational tool for mechanistic neural network analysis and informed model comparison.
Reference: "Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds" (2603.28764)