Correcting Biased Centered Kernel Alignment Measures in Biological and Artificial Neural Networks (2405.01012v1)
Abstract: Centered Kernel Alignment (CKA) has recently emerged as a popular metric for comparing activations from biological and artificial neural networks (ANNs), in order to quantify the alignment between internal representations elicited by stimulus sets (e.g. images, text, video) presented to both systems. In this paper we highlight issues the community should take into account when using CKA as an alignment metric with neural data. Neural data lie in the low-data, high-dimensionality regime, which is one of the cases in which (biased) CKA yields high similarity scores even for pairs of random matrices. Using fMRI and MEG data from the THINGS project, we show that if biased CKA is applied to representations of different sizes in the low-data, high-dimensionality regime, the resulting scores are not directly comparable, because biased CKA is sensitive to differing feature-to-sample ratios rather than to stimulus-driven responses. This situation can arise both when comparing a pre-selected region of interest (ROI) to multiple ANN layers and when determining which ANN layer multiple ROIs / sensor groups of different dimensionality are most similar to. We show that biased CKA can be artificially driven to its maximum value using independent random data with different feature-to-sample ratios. We further show that shuffling sample-feature pairs of real neural data does not drastically alter biased CKA similarity compared to unshuffled data, indicating an undesirable lack of sensitivity to stimulus-driven neural responses. Positive alignment of true stimulus-driven responses is only achieved by using debiased CKA. Lastly, we report findings suggesting that biased CKA is sensitive to the inherent structure of neural data, differing from shuffled data only when debiased CKA detects stimulus-driven alignment.
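To make the biased vs. debiased distinction concrete, below is a minimal NumPy sketch of linear CKA computed once with the standard biased HSIC estimator and once with the unbiased HSIC estimator of Song et al. (2007), following the formulations popularised by Kornblith et al. (2019). The matrix shapes, random data, and variable names are illustrative assumptions rather than the paper's actual experimental setup; they merely reproduce the low-data, high-dimensionality regime described in the abstract, where biased CKA approaches its maximum for independent random matrices while debiased CKA stays near zero.

```python
import numpy as np


def hsic_biased(k, l):
    """Biased HSIC estimator: tr(K H L H) / (n - 1)^2, with H the centering matrix."""
    n = k.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return np.trace(k @ h @ l @ h) / (n - 1) ** 2


def hsic_unbiased(k, l):
    """Unbiased HSIC estimator (Song et al., 2007); diagonals of K and L are zeroed."""
    n = k.shape[0]
    k, l = k.copy(), l.copy()
    np.fill_diagonal(k, 0.0)
    np.fill_diagonal(l, 0.0)
    term1 = (k * l).sum()                                      # tr(K~ L~) for symmetric K~, L~
    term2 = k.sum() * l.sum() / ((n - 1) * (n - 2))
    term3 = 2.0 * (k.sum(axis=0) @ l.sum(axis=0)) / (n - 2)    # 1^T K~ L~ 1
    return (term1 + term2 - term3) / (n * (n - 3))


def linear_cka(x, y, debiased=False):
    """Linear CKA between two (n_samples x n_features) activation matrices."""
    x = x - x.mean(axis=0, keepdims=True)   # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    k, l = x @ x.T, y @ y.T                 # linear Gram matrices over samples
    hsic = hsic_unbiased if debiased else hsic_biased
    return hsic(k, l) / np.sqrt(hsic(k, k) * hsic(l, l))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical low-data, high-dimensionality setting: 20 stimuli,
    # 1000 "voxels" vs. 5000 "units", completely independent random data.
    x = rng.standard_normal((20, 1000))
    y = rng.standard_normal((20, 5000))
    print("biased CKA  :", linear_cka(x, y))                  # spuriously high despite independence
    print("debiased CKA:", linear_cka(x, y, debiased=True))   # near zero, as expected
```

With only a handful of samples and thousands of features, the centered Gram matrices of independent data are both dominated by the same centering structure, which is what pushes biased CKA toward its maximum regardless of any stimulus-driven alignment; the unbiased estimator removes this feature-to-sample-ratio artifact.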
- Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell., 35(8):1798–1828, August 2013. doi: 10.1109/TPAMI.2013.50.
- img2fmri: A Python package for predicting group-level fMRI responses to visual stimuli using deep neural networks. Aperture Neuro, 3, October 2023. doi: 10.52294/001c.87545.
- Algorithms for learning kernels based on centered alignment. J. Mach. Learn. Res., 13(1):795–828, March 2012.
- Aligning model and macaque inferior temporal cortex representations improves model-to-human behavioral alignment and adversarial robustness. In The Eleventh International Conference on Learning Representations, 2023.
- Deceiving the CKA similarity measure in deep learning. In NeurIPS ML Safety Workshop, 2022a.
- On the inadequacy of CKA as a measure of similarity in deep learning. In ICLR 2022 Workshop on Geometrical and Topological Representation Learning, 2022b.
- Reliability of CKA as a similarity measure in deep learning. In The Eleventh International Conference on Learning Representations, 2023.
- ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE, 2009.
- Improved object recognition using neural networks trained to mimic the brain’s statistical properties. Neural Networks, 131:103–114, 2020. doi: 10.1016/j.neunet.2020.07.013.
- System identification of neural systems: If we got it right, would we know? arXiv:2302.06677 [cs, q-bio], February 2023.
- Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015.
- THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior. eLife, 12:e82580, February 2023. doi: 10.7554/eLife.82580.
- Similarity of neural network models: A survey of functional and representational measures, 2023.
- Atherosclerosis and liver inflammation induced by increased dietary cholesterol intake: a combined transcriptomics and metabolomics analysis. Genome Biology, 8(9):R200, 2007. ISSN 1465-6906. doi: 10.1186/gb-2007-8-9-r200.
- Similarity of neural network representations revisited. In Proceedings of the 36th International Conference on Machine Learning, volume 97, pp. 3519–3529. PMLR, 09–15 Jun 2019.
- Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 2008.
- CORnet: Modeling the neural mechanisms of core object recognition. bioRxiv, September 2018.
- A geometric understanding of deep learning. Engineering, 6(3):361–374, 2020. doi: 10.1016/j.eng.2019.09.010.
- Individual differences among deep neural network models. Nature Communications, 11(1):5725, November 2020. doi: 10.1038/s41467-020-19632-w.
- THINGSvision: A Python toolbox for streamlining the extraction of activations from deep neural networks. Frontiers in Neuroinformatics, 15:45, 2021. doi: 10.3389/fninf.2021.679838.
- Do wide and deep networks learn the same things? Uncovering how neural network representations vary with width and depth. In International Conference on Learning Representations, 2021.
- Improving the accuracy and robustness of CNNs using a deep CCA neural data regularizer, 2022.
- Improving the accuracy of single-trial fMRI response estimates using GLMsingle. eLife, 11:e77599, November 2022. doi: 10.7554/eLife.77599.
- P. Robert and Y. Escoufier. A unifying tool for linear multivariate statistical methods: The RV-coefficient. Journal of the Royal Statistical Society Series C: Applied Statistics, 25(3):257–265, 1976. doi: 10.2307/2347233.
- Brain-Score: Which artificial neural network for object recognition is most brain-like? bioRxiv preprint, 2018.
- Integrative benchmarking to advance neurally mechanistic models of human intelligence. Neuron, 2020.
- Matrix correlations for high-dimensional data: the modified RV-coefficient. Bioinformatics, 25(3):401–405, December 2008. doi: 10.1093/bioinformatics/btn634.
- Deep learning in neuroimaging: Overcoming challenges with emerging approaches. Frontiers in Psychiatry, 13, 2022. doi: 10.3389/fpsyt.2022.912600.
- Supervised feature selection via dependence estimation. In Proceedings of the 24th International Conference on Machine Learning, ICML ’07, pp. 823–830, New York, NY, USA, 2007. Association for Computing Machinery. ISBN 9781595937933. doi: 10.1145/1273496.1273600.
- Getting aligned on representational alignment, 2023.
- Partial distance correlation with methods for dissimilarities. The Annals of Statistics, 42(6):2382–2412, 2014. doi: 10.1214/14-AOS1255.
- The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis. In 2019 Conference on Cognitive Computational Neuroscience, 2019. doi: 10.32470/CCN.2019.1300-0. arXiv:1912.02260 [cs, stat].
- Where is human V4? Predicting the location of hV4 and VO1 from cortical folding. Cerebral Cortex, 24(9):2401–2408, April 2013.
- Daniel L. K. Yamins and James J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19(3):356–365, March 2016.
- Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, 111(23):8619–8624, 2014.
- A general index for linear and nonlinear correlations for high dimensional genomic data. BMC Genomics, 21(1):846, November 2020. doi: 10.1186/s12864-020-07246-x.