Evaluating alignment between humans and neural network representations in image-based learning tasks (2306.09377v2)
Abstract: Humans represent scenes and objects in rich feature spaces, carrying information that allows us to generalise about category memberships and abstract functions from few examples. What determines whether a neural network model generalises like a human? We tested how well the representations of 86 pretrained neural network models mapped onto human learning trajectories across two tasks in which humans had to learn continuous relationships and categories of natural images. In these tasks, both human participants and neural networks successfully identified the relevant stimulus features within a few trials, demonstrating effective generalisation. We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of the currently available models that best predicted human generalisation. Intrinsic dimensionality of representations affected alignment differently for different model types. Lastly, we tested three sets of human-aligned representations and found no consistent improvement in predictive accuracy compared to the baselines. We conclude that pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that transfer across tasks. Both our paradigms and our modelling approach offer a novel way to quantify alignment between neural networks and humans and to extend cognitive science into more naturalistic domains.
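The core evaluation described in the abstract (asking how well a frozen network's representation predicts human choices) can be sketched as a cross-validated linear readout from embeddings to behaviour. The sketch below uses synthetic stand-in data and scikit-learn's ridge regression; the specific estimator, data shapes, and scoring choice are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-ins: in the actual study, each row would be a stimulus
# embedding extracted from a frozen pretrained network, and the targets
# would be human responses (continuous estimates or category choices).
n_stimuli, n_features = 200, 64
embeddings = rng.normal(size=(n_stimuli, n_features))

# Hypothetical ground truth: behaviour depends on a few latent features,
# mirroring the finding that participants identify the relevant
# stimulus features within a few trials.
weights = np.zeros(n_features)
weights[:5] = rng.normal(size=5)
human_responses = embeddings @ weights + rng.normal(scale=0.1, size=n_stimuli)

# Alignment proxy: cross-validated fit of a linear readout from the
# representation to behaviour (higher = better aligned under this proxy).
alignment = cross_val_score(
    Ridge(alpha=1.0), embeddings, human_responses, cv=5, scoring="r2"
).mean()
print(f"cross-validated R^2: {alignment:.3f}")
```

Comparing this score across many pretrained models is one way to operationalise "alignment with human choices" while holding the readout fixed.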
- Can Demircan
- Tankred Saanum
- Leonardo Pettini
- Marcel Binz
- Blazej M Baczkowski
- Christian F Doeller
- Mona M Garvert
- Eric Schulz