Brain-like Functional Organization within Large Language Models (2410.19542v2)

Published 25 Oct 2024 in q-bio.NC and cs.AI

Abstract: The human brain has long inspired the pursuit of AI. Recently, neuroimaging studies provide compelling evidence of alignment between the computational representation of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli, suggesting that ANNs may employ brain-like information processing strategies. While such alignment has been observed across sensory modalities--visual, auditory, and linguistic--much of the focus has been on the behaviors of artificial neurons (ANs) at the population level, leaving the functional organization of individual ANs that facilitates such brain-like processes largely unexplored. In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs), the foundational organizational structure of the human brain. Specifically, we extract representative patterns from temporal responses of ANs in LLMs, and use them as fixed regressors to construct voxel-wise encoding models to predict brain activity recorded by functional magnetic resonance imaging (fMRI). This framework links the AN sub-groups to FBNs, enabling the delineation of brain-like functional organization within LLMs. Our findings reveal that LLMs (BERT and Llama 1-3) exhibit brain-like functional architecture, with sub-groups of artificial neurons mirroring the organizational patterns of well-established FBNs. Notably, the brain-like functional organization of LLMs evolves with the increased sophistication and capability, achieving an improved balance between the diversity of computational behaviors and the consistency of functional specializations. This research represents the first exploration of brain-like functional organization within LLMs, offering novel insights to inform the development of artificial general intelligence (AGI) with human brain principles.

Authors (10)
  1. Haiyang Sun (45 papers)
  2. Lin Zhao (228 papers)
  3. Zihao Wu (100 papers)
  4. Xiaohui Gao (16 papers)
  5. Yutao Hu (19 papers)
  6. Mengfei Zuo (1 paper)
  7. Wei Zhang (1489 papers)
  8. Junwei Han (87 papers)
  9. Tianming Liu (161 papers)
  10. Xintao Hu (19 papers)

Summary

Brain-like Functional Organization within LLMs

In the paper "Brain-like Functional Organization within LLMs," the authors explore the alignment between artificial neural networks (ANNs) and functional brain networks (FBNs). The paper focuses on LLMs such as BERT and the Llama family (Llama 1-3), which have become central to AI research owing to their strong performance on natural language processing tasks. The work aims to close a gap in understanding the roles of individual artificial neurons (ANs) by linking sub-groups of ANs to FBNs through a novel encoding model.

The method extracts representative patterns from the temporal responses of ANs in LLMs and uses these patterns as fixed regressors in voxel-wise encoding models that predict brain activity recorded by functional magnetic resonance imaging (fMRI). This establishes a direct linkage between AN sub-groups and measured brain activity, allowing the authors to characterize a brain-like functional organization within LLMs and distinguishing their work from prior studies that concentrated on population-level AN behavior.
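To make the pipeline concrete, here is a minimal sketch of such a voxel-wise encoding model. Ridge regression stands in for the paper's estimator, the data are synthetic placeholders, and all array shapes and variable names are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
T, K, V = 300, 64, 1000          # time points, representative patterns, voxels
X = rng.standard_normal((T, K))  # fixed regressors: representative AN patterns (placeholder)
Y = rng.standard_normal((T, V))  # fMRI voxel responses (placeholder)

# Hold out the final 20% of time points for evaluation.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, shuffle=False)

enc = Ridge(alpha=1.0).fit(X_tr, Y_tr)  # fits one linear model per voxel
pred = enc.predict(X_te)

# Score each voxel by the correlation between predicted and measured responses.
scores = np.array([np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(V)])
print("median voxel correlation:", np.median(scores))
```

Voxels with high held-out correlation are those whose activity the AN patterns explain well; the map of these scores is what links AN sub-groups to brain regions.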

Methodological Overview

The paper employs a sparse representation framework to identify representative temporal response patterns of ANs across different models. This both makes the analysis of vast numbers of ANs tractable and retains only the most representative patterns, suppressing noise and redundancy. The voxel-wise encoding models then couple AN activities to specific brain regions identified through fMRI, enabling a detailed examination of AN functionality in relation to well-established FBNs.
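As a hedged illustration of the sparse representation step, the following sketch uses scikit-learn's MiniBatchDictionaryLearning to extract a small dictionary of representative temporal patterns from synthetic AN responses. The dictionary size, sparsity penalty, and sub-grouping rule are assumptions for illustration, not the paper's reported settings.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
N, T = 4096, 300                  # artificial neurons, time points
A = rng.standard_normal((N, T))   # each row: one AN's temporal response (placeholder)

dico = MiniBatchDictionaryLearning(
    n_components=64,   # dictionary size, i.e. number of representative patterns (assumed)
    alpha=1.0,         # sparsity penalty (assumed)
    batch_size=256,
    random_state=0,
)
codes = dico.fit_transform(A)     # (N, 64) sparse loadings of each AN on each pattern
patterns = dico.components_       # (64, T) representative temporal patterns

# Group ANs by the pattern they load on most strongly; each group is a
# candidate AN sub-group to couple with a functional brain network.
subgroups = np.abs(codes).argmax(axis=1)
```

The rows of `patterns` are the dictionary atoms that serve as the fixed regressors in the encoding models above.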

Key Findings and Numerical Results

A notable finding is the variability in how strongly different FBNs contribute to the brain maps derived from different LLMs, particularly for Llama 3. The examined LLMs consistently engaged a core set of FBNs, including the lateral visual cortex, the language network, and the default mode network. The brain maps also reveal cooperative interactions among these networks, a multi-network involvement that mirrors neural processing in the human brain. Notably, the paper concludes that more advanced LLMs such as Llama 3 achieve a better-balanced functional organization, managing a greater diversity of computational behaviors while maintaining consistent functional specialization.

The data also show consistent anatomical alignment of certain dictionary atoms (representative response patterns) across LLMs, suggesting shared latent functions that may underpin language processing and semantic understanding. This consistency supports the hypothesis that LLMs exhibit brain-like organizational principles that become more pronounced as the models grow more sophisticated.
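One hedged way to operationalize the correspondence described above is to correlate a voxel-wise encoding map with canonical FBN templates. The sketch below does this with placeholder arrays; the binary-mask representation and the correlation metric are assumptions rather than the paper's exact evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
V, n_networks = 1000, 7
brain_map = rng.standard_normal(V)             # encoding weights for one pattern (placeholder)
fbn_masks = rng.random((n_networks, V)) > 0.8  # binary FBN templates (placeholder)

# Spatial correlation between the encoding map and each network template;
# the best-matching template labels the AN sub-group with an FBN.
overlap = np.array([
    np.corrcoef(brain_map, mask.astype(float))[0, 1]
    for mask in fbn_masks
])
best = int(overlap.argmax())
print(f"best-matching network: {best}, r = {overlap[best]:.3f}")
```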

Theoretical and Practical Implications

This research implies that as LLMs grow more sophisticated, their functional organization more closely parallels the human brain in both functional specialization and the management of diverse computational behaviors. The findings could guide the development of AI systems inspired by neural organization, potentially contributing to artificial general intelligence (AGI) built on human brain principles.

From a practical perspective, such alignment might inform the design of neural architectures that emulate the efficiency of biological processing, yielding models that not only perform better but also explain their decisions in a manner closer to human reasoning.

Speculative Future Directions

Future work might explore dictionary sizes tailored to each model's neuronal complexity, potentially yielding finer-grained insight into the interplay between ANs and brain networks. Additional studies could validate these findings on broader datasets to establish robustness and reproducibility. Beyond the linguistic modality, applying a similar framework to models in other cognitive domains could provide further validation.

In summary, this paper provides a thorough analysis of brain-like functional organization within LLMs, proposing an alignment with principles of human neural architecture that may inform the development of AI systems with brain-like intelligence.