- The paper presents a novel partial soft-matching distance method that selectively aligns corresponding neural units while filtering out noise and inactive elements.
- It leverages a partial optimal transport framework with an L-curve heuristic to optimize matching mass, enhancing interpretability through a correlational reformulation.
- Empirical validations on fMRI data and deep neural network models show that the method outperforms classical techniques in identifying true signal components.
Partial Soft-Matching Distance for Neural Representational Comparison with Partial Unit Correspondence
The alignment of neural representations across systems is central to quantifying convergence in learned or biological function. Canonical approaches—such as CKA, RSA, Procrustes, and CCA—provide rotation-invariant and permutation-invariant metrics, abstracting away correspondence between individual units and the axes on which information is encoded. This precludes interpretable neuron-level analysis and confounds understanding of whether homologous computations have emerged, particularly when representations are highly structured or axis-aligned.
The soft-matching distance, building on discrete optimal transport (OT), offers rotation sensitivity but, by design, forces all units across compared populations into correspondence. This mandatory matching produces spurious alignments in the presence of noisy, inactive, or genuinely non-corresponding units—a common scenario in both empirical neural recordings and deep models with architectural or training-induced specialization. Addressing this, the partial soft-matching distance relaxes the marginal constraints of classic OT, allowing only a fraction of the mass (neurons/voxels) to be matched, thus supporting comparison under partial correspondence with robust separation of well-matched and unmatched units.
Methodological Framework
Partial Optimal Transport Extension
Let $X \in \mathbb{R}^{M \times N_X}$ and $Y \in \mathbb{R}^{M \times N_Y}$ denote two neural populations' tuning curves over $M$ stimuli. The approach casts representational comparison as a partial optimal transport problem, optimizing over couplings $T$ with total mass $s \le 1$:

$$\min_{T \in \mathcal{T}_s(N_X, N_Y)} \langle C, T \rangle_F,$$

where $C_{ij}$ is a unit-pairwise cost (preferably cosine distance), and $\mathcal{T}_s$ denotes the set of admissible partial couplings.
Unlike classical OT, row and column sums may not saturate, with the hyperparameter $s$ controlling the total matched mass. Zeroed rows/columns in the optimal $T^*$ identify unmatched units, facilitating structured treatment of noise or divergence. This is particularly useful for empirical neurophysiology data (e.g., fMRI), where many units (voxels) are inherently unreliable or lack clear correspondence.
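The relaxed-marginal problem can be written as a small linear program. The sketch below is a minimal illustration, not the paper's implementation: the names `partial_ot` and `cosine_cost`, the uniform marginal caps, and the toy data are all assumptions made here for concreteness.

```python
import numpy as np
from scipy.optimize import linprog

def cosine_cost(X, Y):
    """Pairwise cosine distance between the columns (units) of X and Y."""
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=0, keepdims=True)
    return 1.0 - Xn.T @ Yn

def partial_ot(C, s):
    """Minimize <C, T>_F over partial couplings T with uniform marginal
    caps (row sums <= 1/n_x, column sums <= 1/n_y) and total mass s."""
    n_x, n_y = C.shape
    c = C.ravel()  # variables: T flattened row-major
    # Inequality constraints: each row/column sum at most its marginal.
    A_ub = np.zeros((n_x + n_y, n_x * n_y))
    for i in range(n_x):
        A_ub[i, i * n_y:(i + 1) * n_y] = 1.0   # row i of T
    for j in range(n_y):
        A_ub[n_x + j, j::n_y] = 1.0            # column j of T
    b_ub = np.concatenate([np.full(n_x, 1.0 / n_x),
                           np.full(n_y, 1.0 / n_y)])
    # Equality constraint: total transported mass equals s.
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, n_x * n_y)), b_eq=np.array([s]),
                  bounds=(0, None), method="highs")
    return res.x.reshape(n_x, n_y), res.fun

# Toy example (illustrative): three shared units plus one pure-noise
# unit on each side; with s = 3/4 the noise units should go unmatched.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Y = np.column_stack([X[:, :3] + 0.05 * rng.normal(size=(50, 3)),
                     rng.normal(size=50)])
T, cost = partial_ot(cosine_cost(X, Y), s=0.75)
```

With this setup the optimal coupling places all mass on the three true pairs and leaves the noise units with zeroed rows/columns, mirroring the unmatched-unit behavior described above.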
Regularization and L-Curve Heuristic
Selection of s is posed as a data-dependent regularization problem. An L-curve method traces the Pareto frontier between matching cost and unmatched mass, identifying the optimal s at maximal curvature—akin to classical Tikhonov-style regularization. This approach generalizes across settings without prior knowledge of noise magnitude or outlier abundance.
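The knee-finding step can be sketched with a discrete curvature criterion over a precomputed sweep of $s$. This is one plausible reading of the L-curve heuristic, not the paper's exact procedure; the function name and the synthetic trade-off curve are illustrative.

```python
import numpy as np

def l_curve_knee(costs, unmatched):
    """Return the index of maximal discrete curvature along the
    (unmatched mass, matching cost) trade-off curve."""
    x = np.asarray(unmatched, dtype=float)
    y = np.asarray(costs, dtype=float)
    # Rescale both axes to [0, 1] so curvature is scale-free.
    x = (x - x.min()) / (np.ptp(x) + 1e-12)
    y = (y - y.min()) / (np.ptp(y) + 1e-12)
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    kappa = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2 + 1e-12)**1.5
    return int(np.argmax(kappa))

# Synthetic L-shaped trade-off (illustrative): cost falls steeply while
# noise units are being excluded, then plateaus once only signal remains.
unmatched = np.linspace(0.0, 1.0, 21)
costs = np.maximum(1.0 - 5.0 * unmatched, 0.0)
knee = l_curve_knee(costs, unmatched)
```

The returned index lands at the bend of the "L", which corresponds to the data-driven choice of matched mass $s$.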
When tuning curves are normalized, the objective admits a correlational reinterpretation, transforming the transport optimization into maximization of mean-matched Pearson correlations. Consequently, the method directly quantifies alignment quality in familiar statistical terms, which enhances interpretability in neuroscientific and ANN analysis pipelines.
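The reinterpretation rests on the identity that, for mean-centered unit-norm tuning curves, cosine distance equals one minus Pearson correlation. A quick numerical check (assuming that normalization; the data here are illustrative):

```python
import numpy as np

# Two correlated "tuning curves" (illustrative data).
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 0.6 * x + 0.8 * rng.normal(size=200)

def znorm(v):
    """Mean-center and scale to unit norm."""
    v = v - v.mean()
    return v / np.linalg.norm(v)

# Cosine distance on normalized curves ...
cos_dist = 1.0 - znorm(x) @ znorm(y)
# ... equals 1 - Pearson r, so minimizing total transport cost is the
# same as maximizing the mean correlation over matched pairs.
pearson = np.corrcoef(x, y)[0, 1]
```

Because the identity holds entry-wise in the cost matrix, the transport objective becomes an average of matched-pair correlations, which is the statistical quantity neuroscientists already interpret.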
Empirical Validation
Simulations: Noise Robustness and Model Identification
Simulated populations with varying numbers of signal (matched) and noise (outlier) neurons demonstrate that forced matching (classical soft-matching) leads to inflated transport costs and spurious correspondences. In contrast, partial soft-matching robustly matches the true signal components, discarding noise and recovering ground-truth correspondences. In model selection scenarios with embedded noise, only the partial matching score reliably identifies the model with maximal shared signal, with classical methods often preferring mismatched models due to forced assignment of noise units.
fMRI Comparisons: Neural Population Alignment
Applying partial soft-matching to fMRI data from the Natural Scenes Dataset, the approach automatically excludes low-reliability voxels—those with low noise ceilings—when aligning homologous brain regions across subjects. As the matched mass hyperparameter decreases, the mean noise ceiling of retained voxels increases, demonstrating data-driven unit quality selection. Across all tested visual areas, this method yields higher within-area alignment precision compared to thresholding and forced soft-matching.
Further, the partial soft-matching ranking of voxels by alignment quality matches that of computationally expensive brute-force ablation (which exhaustively recomputes alignments per unit), but with much lower computational cost ($O(n^3 \log n)$ versus $O(n^4 \log n)$).
Deep Neural Networks: Aligned and Divergent Computation
Analyses of ResNet-18 convolutional layers show that units ranked as highly matched by partial soft-matching yield nearly identical maximally exciting images (MEIs) across independently trained models, while unmatched units differ qualitatively. This provides concrete evidence that partial soft-matching discovers genuinely aligned features, and can partition convergent from divergent computational subspaces at scale. Correlation-based heuristics, by contrast, fail to recover these interpretable partitions.
Specificity and Privileged Axes
The method improves alignment specificity by correctly matching homologous regions and not forcing alignment of non-overlapping regions, a property lacking in alternative approaches. Analysis of subpopulations defined by the partial soft-matching scores further enables probing hypotheses about privileged axes in representation, demonstrating that alignment degrades under random orthogonal transformations even within the best-matched unit subsets. This supports the existence of axis-aligned—or “privileged”—coordinates in both artificial and biological systems.
Theoretical and Practical Implications
Partial soft-matching distance advances the toolkit for neural representational analysis by supporting meaningful comparison under partial correspondence. It overcomes the limitations of metric-based or full-matching methods in realistic datasets characterized by noise, outliers, and incomplete overlap. This is crucial for cross-system comparison (e.g., brain-to-brain, ANN-to-ANN, brain-to-ANN) where full one-to-one correspondence is the exception rather than the rule.
The method's ability to partition neural populations enables focused analyses of convergence versus divergence, offering new directions for both explainability and mechanistic hypothesis testing in neuroscience and machine learning. Its computational efficiency means it can be applied to modern large-scale neural datasets, a setting where brute-force approaches become rapidly intractable.
Limitations and Future Directions
The approach relies on heuristic selection of the matched mass via the L-curve, which may not be universally robust. Strict metric properties (e.g., the triangle inequality) are not guaranteed, though the use of newly introduced partial Wasserstein variants satisfying these properties presents an avenue for further extension. The cubic scaling in population size remains a bottleneck for extremely large datasets, motivating further work on scalable solvers.
Potential future work includes formalizing the metric properties of the partial matching under alternative cost functions, exploring Bayesian or learned selection of the matched fraction, and integrating partial soft-matching into downstream analytic and generative pipelines such as structure-preserving alignment in model distillation or cross-modal neuroscience analysis.
Conclusion
Partial soft-matching distance provides a rigorous, interpretable, and efficient approach for representational comparison that is robust to noise, outliers, and incomplete correspondence. Its empirical success in both neuroscience and deep learning settings—and its ability to support nuanced analyses of convergence, alignment specificity, and privileged axes—establish it as a principled alternative to existing metric- and matching-based representational similarity methods. Integration with scalable solvers and further formal extensions will render it applicable to ongoing advances in empirical neuroimaging and AI representation research.
(2602.19331)