A Spectral Theory of Neural Prediction and Alignment (2309.12821v2)
Abstract: The representations of neural networks are often compared to those of biological systems by regressing the network responses onto responses measured from the biological system. Many state-of-the-art deep neural networks yield similar neural predictions, but it remains unclear how to differentiate among models that predict neural responses equally well. To gain insight into this, we use a recent theoretical framework that relates the generalization error of regression to the spectral properties of the model and the target. We apply this theory to regression between model activations and neural responses, decomposing the neural prediction error in terms of the model eigenspectra, the alignment between model eigenvectors and neural responses, and the training set size. Using this decomposition, we introduce geometrical measures to interpret the neural prediction error. We test a large number of deep neural networks that predict visual cortical activity and show that multiple types of representational geometry can yield low neural prediction error as measured via regression. This work demonstrates that carefully decomposing representational metrics clarifies how models capture neural activity and points the way towards improved models of neural activity.
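To make the decomposition concrete, below is a minimal numerical sketch of the kind of computation the abstract describes: eigendecompose the Gram matrix of model activations to obtain the model eigenspectrum, measure how the neural responses project onto the model eigenvectors (the alignment), and plug both into a spectral learning-curve theory for ridge regression (in the style of Bordelon et al., 2020 and Canatar et al., 2021) to predict the neural prediction error as a function of training set size. The toy data, variable names, normalization conventions, and ridge value are illustrative assumptions, not the paper's exact pipeline.

```python
# A minimal sketch of the spectral decomposition described above, assuming the
# Canatar/Bordelon-style learning-curve theory for ridge regression.
# Toy data, variable names, normalizations, and the ridge value are
# illustrative choices, not the paper's exact implementation.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)

# Toy stand-ins: Phi = model activations (n_stimuli x n_features),
# Y = neural responses (n_stimuli x n_neurons). In practice these would come
# from a deep-network layer and a recorded neural population.
n_stim, n_feat, n_neur = 500, 1000, 50
Phi = rng.standard_normal((n_stim, n_feat))
W = rng.standard_normal((n_feat, n_neur)) / np.sqrt(n_feat)
Y = Phi @ W + 0.1 * rng.standard_normal((n_stim, n_neur))

# Model eigenspectrum: eigendecompose the centered stimulus-by-stimulus Gram
# matrix of model activations (normalization conventions vary across papers).
Phi -= Phi.mean(axis=0)
Y -= Y.mean(axis=0)
lam, U = np.linalg.eigh(Phi @ Phi.T / n_feat)
lam, U = np.clip(lam[::-1], 0.0, None), U[:, ::-1]   # descending eigenvalues

# Alignment: squared projection of the neural responses onto each model
# eigenvector, summed over neurons (the "target power" carried by mode i).
a2 = ((U.T @ Y) ** 2).sum(axis=1)

def predicted_error(p, lam, a2, ridge=1e-3):
    """Theoretical generalization error of ridge regression with p training
    samples, given eigenvalues lam and per-mode target power a2."""
    # Effective regularization kappa solves
    #   kappa = ridge + sum_i lam_i * kappa / (p * lam_i + kappa).
    f = lambda k: k - ridge - np.sum(lam * k / (p * lam + k))
    kappa = brentq(f, 1e-12, ridge + lam.sum() + 1.0)
    gamma = np.sum(p * lam ** 2 / (p * lam + kappa) ** 2)
    # Modes with lam_i >> kappa / p are learned; response power sitting in
    # low-eigenvalue (misaligned) modes dominates the remaining error.
    mode_err = kappa ** 2 * a2 / (p * lam + kappa) ** 2
    return mode_err.sum() / (1.0 - gamma)

for p in (10, 50, 100, 250):
    print(f"p = {p:4d}   predicted error = {predicted_error(p, lam, a2):.3f}")
```

In this view, two models can reach similarly low prediction error through different geometries, for example a fast-decaying eigenspectrum whose leading eigenvectors are well aligned with the neural responses, or a higher-dimensional spectrum that spreads the response power across many learnable modes; the per-mode terms above make that distinction visible.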