
Normalized Bures Similarity (NBS) Overview

Updated 10 December 2025
  • Normalized Bures Similarity (NBS) is a metric that quantifies neural representation similarity using quantum-information fidelity and kernel summary statistics.
  • It achieves key invariances—including orthogonal rotation, translation, permutation, and scale—by aligning geometric (Riemannian) and statistical (quantum-information) perspectives.
  • Efficient computational methods such as nuclear norm-based procedures and differentiable optimization make NBS practical for comparing both artificial and biological neural systems.

Normalized Bures Similarity (NBS) is a geometric similarity measure for comparing neural representations, based on the quantum-information fidelity between covariance matrices of activation patterns. Unlike methods based on explicit unit-wise mapping, NBS quantifies similarity in kernel summary statistics, conferring invariance to orthogonal rotations, permutations, translational shifts, and scale. Its foundational identity reveals deep connections to both Riemannian geometry and quantum-information theory, allowing unification of mapping-based and kernel-based similarity frameworks. NBS enjoys properties including metric validity, mapping-free computation, and scale-rotation invariance, and further admits efficient computational schemes and differentiable optimization.

1. Mathematical Definition

Given neural activations from two systems over $M$ stimuli, stored in matrices $X \in \mathbb{R}^{M \times N_x}$ and $Y \in \mathbb{R}^{M \times N_y}$, the centered linear kernel (stimulus-by-stimulus covariance) matrices are

K_X = C X X^\top C, \qquad K_Y = C Y Y^\top C,

where $C = I_M - \frac{1}{M}\mathbf{1}\mathbf{1}^\top$ is the $M \times M$ centering matrix. The fidelity between two positive semidefinite (PSD) matrices is

F(K_X, K_Y) = \mathrm{Tr}\left[\left(K_X^{1/2} K_Y K_X^{1/2}\right)^{1/2}\right].

The normalized Bures similarity is then defined by

\mathrm{NBS}(K_X, K_Y) = \frac{F(K_X, K_Y)}{\sqrt{\mathrm{Tr}\,K_X\,\mathrm{Tr}\,K_Y}}, \qquad \mathrm{NBS} \in [0, 1].

Alternatively,

\mathrm{NBS}(X, Y) = \frac{\|X^\top C Y\|_*}{\sqrt{\mathrm{Tr}(X^\top C X)\,\mathrm{Tr}(Y^\top C Y)}},

where $\|\cdot\|_*$ denotes the nuclear norm (sum of singular values). This construction is equivalent to the cosine of the Riemannian shape distance $\theta$ between centered neural configurations, $\mathrm{NBS}(X, Y) = \cos \theta(X, Y)$. The equivalence between NBS as kernel fidelity and as shape alignment is established rigorously (see Harvey et al., 2023).

2. Geometric and Statistical Interpretations

NBS is invariant under permutations and orthogonal rotations of neuron axes; centering removes translational degrees of freedom, and normalization ensures scale invariance. From the geometric perspective, NBS measures the cosine of the Riemannian (geodesic) angle $\theta$ between two centered configurations, corresponding to optimal Procrustes alignment on the shape manifold. Statistically, $K_X$ and $K_Y$ are empirical covariance matrices and $F(K_X, K_Y)$ is their quantum-information fidelity. The associated Bures distance

d_B(K_X, K_Y) = \sqrt{\mathrm{Tr}\,K_X + \mathrm{Tr}\,K_Y - 2\,F(K_X, K_Y)}

is the 2-Wasserstein distance between zero-mean Gaussians with covariances $K_X$ and $K_Y$. By Uhlmann's theorem,

F(K_X, K_Y) = \max_{\substack{X' :\, X' X'^\top = K_X \\ Y' :\, Y' Y'^\top = K_Y}} \mathrm{Tr}\!\left(X'^\top Y'\right),

so NBS represents the maximum normalized Hilbert-Schmidt overlap achievable for neural activations with fixed covariances (Harvey et al., 2023).
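In the data picture, the maximizing alignment is the orthogonal Procrustes rotation obtained from the SVD of the cross-covariance. A NumPy sketch of this max-overlap reading (equal widths $N_x = N_y$ assumed for simplicity):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 40, 5
X = rng.standard_normal((M, N))
Y = rng.standard_normal((M, N))
Xc = X - X.mean(axis=0)          # mean-centering applies C implicitly
Yc = Y - Y.mean(axis=0)

A = Xc.T @ Yc                    # cross-covariance
U, s, Vt = np.linalg.svd(A)
Q_opt = Vt.T @ U.T               # orthogonal Procrustes rotation

overlap_opt = np.trace(A @ Q_opt)           # equals the fidelity ||A||_*
print(np.isclose(overlap_opt, s.sum()))     # True

# Any other rotation yields a smaller Hilbert-Schmidt overlap:
Q_rand, _ = np.linalg.qr(rng.standard_normal((N, N)))
print(np.trace(A @ Q_rand) <= overlap_opt)  # True
```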

3. Computational Procedures

NBS can be computed in two equivalent ways:

  • Kernel Fidelity Method:
  1. Center the data: $\tilde{X} = CX$, $\tilde{Y} = CY$.
  2. Compute $K_X = \tilde{X}\tilde{X}^\top$, $K_Y = \tilde{Y}\tilde{Y}^\top$.
  3. Eigendecompose $K_X$ to obtain $K_X^{1/2}$.
  4. Form $K_X^{1/2} K_Y K_X^{1/2}$ and compute its PSD square root.
  5. Calculate $F(K_X, K_Y)$ as the trace of that square root, and the denominator $\sqrt{\mathrm{Tr}\,K_X\,\mathrm{Tr}\,K_Y}$.
  6. Output NBS.
  • Cross-Covariance/Nuclear Norm Method:
  1. Center $X$ and $Y$ with $C$: $\tilde{X} = CX$, $\tilde{Y} = CY$.
  2. Compute the cross-covariance $A = \tilde{X}^\top \tilde{Y}$.
  3. Obtain the singular values $\{\sigma_i\}$ of $A$ and set $\|A\|_* = \sum_i \sigma_i$.
  4. Compute the traces $\mathrm{Tr}(\tilde{X}^\top \tilde{X})$ and $\mathrm{Tr}(\tilde{Y}^\top \tilde{Y})$.
  5. Output NBS.

When the numbers of units are small relative to the number of stimuli ($N_x, N_y \ll M$), the nuclear norm method is computationally preferable. As code, NBS amounts to mean-centering, computing the cross-covariance, applying an SVD, and normalizing by the geometric mean of marginal nuclear norms (Cloos et al., 2024). Differentiable optimization is enabled by autograd-capable SVD implementations; the nuclear norm gradient is stable for distinct singular values (Cloos et al., 2024).
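Following that recipe, a minimal NumPy implementation of the nuclear-norm route, together with checks of the stated invariances (function name and shapes here are illustrative, not taken from the cited code):

```python
import numpy as np

def nbs(X, Y):
    """Normalized Bures Similarity between activation matrices.

    X: (M, Nx), Y: (M, Ny) responses of two systems to the same M stimuli.
    """
    Xc = X - X.mean(axis=0)     # mean-centering applies C implicitly
    Yc = Y - Y.mean(axis=0)
    fidelity = np.linalg.norm(Xc.T @ Yc, ord="nuc")   # sum of singular values
    denom = np.sqrt(np.trace(Xc.T @ Xc) * np.trace(Yc.T @ Yc))
    return fidelity / denom

# Invariance checks: rotation, permutation, translation, and scale.
rng = np.random.default_rng(2)
X = rng.standard_normal((30, 6))
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))     # random rotation
perm = rng.permutation(6)

print(np.isclose(nbs(X, X @ Q), 1.0))          # orthogonal rotation
print(np.isclose(nbs(X, X[:, perm]), 1.0))     # unit permutation
print(np.isclose(nbs(X, 3.0 * X + 7.0), 1.0))  # scale and translation
```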

4. Sensitivity to Principal Components

NBS exhibits linear sensitivity to principal component (PC) variances. If $X$ has singular value decomposition $X = U \Sigma V^\top$ and a copy $\tilde{X}$ is constructed by scrambling the $k$-th left singular vector (preserving its variance $\lambda_k$), then

\mathrm{NBS}(X, \tilde{X}) \approx 1 - \frac{\lambda_k}{\sum_i \lambda_i},

where $\lambda_i$ are the eigenvalues of $K_X$ (Cloos et al., 2024). Thus the decrease in NBS upon destroying PC $k$ is linear in its variance. By contrast, CKA's dependence is quadratic ($\propto \lambda_k^2$), making NBS more sensitive to mid-range PCs than CKA, but less so than angular Procrustes, which is most strongly sensitive to low-variance directions.
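The linear scaling can be illustrated on synthetic data with known PC variances (the construction and variance profile below are illustrative, not from the cited paper): scrambling one left singular vector by permuting its entries reduces NBS by approximately $\lambda_k / \sum_i \lambda_i$.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 500, 5
lam = np.array([4.0, 2.0, 1.0, 0.5, 0.25])   # illustrative PC variances

# Mean-zero orthonormal left singular vectors (so centering is a no-op).
B = rng.standard_normal((M, N))
U, _ = np.linalg.qr(B - B.mean(axis=0))
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
X = U @ np.diag(np.sqrt(lam)) @ V.T

def nbs(X, Y):
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    f = np.linalg.norm(Xc.T @ Yc, ord="nuc")
    return f / np.sqrt(np.trace(Xc.T @ Xc) * np.trace(Yc.T @ Yc))

k = 1                                        # destroy the second PC
U_s = U.copy()
U_s[:, k] = U[rng.permutation(M), k]         # scramble; variance preserved
X_s = U_s @ np.diag(np.sqrt(lam)) @ V.T

drop = 1.0 - nbs(X, X_s)
print(drop, lam[k] / lam.sum())              # drop is close to lambda_k / sum
```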

5. Metric Properties, Theorems, and Comparison to Other Measures

NBS provides a true metric (satisfying symmetry and the triangle inequality) over stimulus–response geometries via the angular distance $d(X, Y) = \arccos \mathrm{NBS}(X, Y)$. The associated Bures distance $d_B$ is dual to the Procrustes “size-and-shape” distance: $d_B(K_X, K_Y) = \min_{Q} \|CX - CYQ\|_F$, with the minimum taken over orthogonal alignment maps $Q$, valid for neural configurations of unequal widths $N_x \neq N_y$ (Harvey et al., 2023). Asymptotically, as $M \to \infty$ or $N_x, N_y \to \infty$, the normalized Bures distance converges to its limiting form under the law of large numbers. Empirically, NBS values between random and real data start near 0, approach 0.98–1.0 under optimization, and dataset-dependent thresholds exist for meaningful “task-relevant” encoding.
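The duality can be verified numerically for equal-width configurations (the unequal-width case requires semi-orthogonal alignment maps and is omitted from this sketch): the Bures distance computed from kernel summary statistics coincides with the Procrustes-aligned Frobenius distance between centered configurations.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 30, 4
X = rng.standard_normal((M, N))
Y = rng.standard_normal((M, N))
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# Bures distance from the kernel summary statistics.
F = np.linalg.norm(Xc.T @ Yc, ord="nuc")
d_bures = np.sqrt(np.trace(Xc.T @ Xc) + np.trace(Yc.T @ Yc) - 2.0 * F)

# Procrustes size-and-shape distance: min over rotations Q of ||Xc - Yc Q||_F.
U, s, Vt = np.linalg.svd(Yc.T @ Xc)
Q = U @ Vt                                # optimal rotation
d_procrustes = np.linalg.norm(Xc - Yc @ Q, "fro")

print(abs(d_bures - d_procrustes))        # ~0 up to numerical error
```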

Compared to other metrics:

  • RSA compares vectorized representational dissimilarity matrices, does not define a true metric, and ignores global scaling.
  • CKA computes

\mathrm{CKA}(K_X, K_Y) = \frac{\mathrm{Tr}(K_X K_Y)}{\|K_X\|_F \, \|K_Y\|_F}

and is the cosine of the Hilbert–Schmidt angle. Unlike NBS, CKA neither exploits PSD geometry nor satisfies the triangle inequality, and it can be insensitive to alignment of dominant covariance subspaces. Tight bounds relate CKA and NBS, yet empirical discrepancies between the two scores can be two- to three-fold.

  • CCA fits optimal linear mappings to maximize correlation in a shared subspace; it is mapping-based and affine-invariant, requires solving generalized eigenproblems, and returns multiple canonical coefficients rather than a single score. NBS is mapping-free, giving a single overlap scalar.
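The CKA–NBS comparison above can be run side by side (a sketch using linear CKA on centered kernels; data here are random and purely illustrative):

```python
import numpy as np

def centered_kernel(X):
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T

def cka(X, Y):
    """Linear CKA: cosine of the Hilbert-Schmidt angle between kernels."""
    Kx, Ky = centered_kernel(X), centered_kernel(Y)
    return np.trace(Kx @ Ky) / (np.linalg.norm(Kx, "fro") * np.linalg.norm(Ky, "fro"))

def nbs(X, Y):
    """Normalized Bures Similarity via the nuclear-norm identity."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    f = np.linalg.norm(Xc.T @ Yc, ord="nuc")
    return f / np.sqrt(np.trace(Xc.T @ Xc) * np.trace(Yc.T @ Yc))

rng = np.random.default_rng(5)
X = rng.standard_normal((40, 6))
Y = rng.standard_normal((40, 6))
print(cka(X, Y), nbs(X, Y))   # generally different values on the same pair
```

Both scores equal 1 for identical inputs, but on generic pairs they weight the covariance spectrum differently, consistent with the quadratic-versus-linear PC sensitivity discussed above.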

NBS should be preferred when metric validity, PSD-manifold geometry, and mapping-free similarity are desired.

6. Empirical Behavior, Optimization, and Interpretation

Differentiable optimization allows maximizing NBS directly. Under optimization, synthetic data first captures the highest-variance PC of the target dataset, with NBS capturing lower-variance PCs more rapidly than CKA but less rapidly than angular Procrustes. While high NBS scores (approaching 1) can be achieved, they do not guarantee encoding of all task-relevant dimensions; “overfitting” to the highest-variance PCs is possible. No single threshold for “good” NBS exists; the appropriate value depends on dataset structure (e.g., a moderate NBS suffices for high task-decoding accuracy in some prefrontal recordings, while others require substantially higher values).

Empirical scatter plots show that CKA and NBS are correlated, but substantial envelope width remains due to kernel ranks and matrix square-root non-commutativity. Joint optimization experiments indicate that high angular Procrustes score entails high NBS and CKA, but not vice-versa; very high CKA or NBS can still omit lower-variance, task-relevant structure (Cloos et al., 2024).
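The stability claim for the nuclear norm gradient can be sanity-checked: for $A = U S V^\top$ with distinct singular values, the gradient of $\|A\|_*$ with respect to $A$ is $U V^\top$, which a finite-difference comparison confirms (a NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 4))           # distinct singular values a.s.

U, s, Vt = np.linalg.svd(A, full_matrices=False)
grad_analytic = U @ Vt                    # d||A||_* / dA for distinct s_i

# Central finite differences, entry by entry.
eps = 1e-6
grad_fd = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        E = np.zeros_like(A)
        E[i, j] = eps
        plus = np.linalg.norm(A + E, ord="nuc")
        minus = np.linalg.norm(A - E, ord="nuc")
        grad_fd[i, j] = (plus - minus) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_fd)))   # small discrepancy
```

This is the gradient that autograd frameworks propagate through the SVD when NBS is maximized directly.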

7. Applications and Limitations

NBS is applicable for quantifying similarity between neural representations, including artificial and biological systems, without explicit neuron correspondences. It is effective for characterizing neural encoding overlap under PSD-manifold geometry. NBS’s mapping-free nature, scale and rotation invariance, and metric validity offer distinct advantages over ad hoc measures. However, NBS (like kernel-based measures) is susceptible to “overfitting” dominant PCs and may not ensure task-relevant dimensionality recovery. For practitioners, careful interpretation is required, especially when optimizing representations: additional analysis concerning encoding of low-variance dimensions and task variables is necessary for robust assessment.

NBS unifies geometric (shape manifold) and statistical (quantum-information and Wasserstein) perspectives on neural similarity, and its properties are increasingly prominent in comparative studies of neural representations in deep learning and neuroscience (Harvey et al., 2023, Cloos et al., 2024).
