
fMRI predictors based on language models of increasing complexity recover brain left lateralization (2405.17992v2)

Published 28 May 2024 in cs.CL, cs.AI, and q-bio.NC
Abstract: Over the past decade, studies of naturalistic language processing where participants are scanned while listening to continuous text have flourished. Using word embeddings at first, then LLMs, researchers have created encoding models to analyze the brain signals. Presenting these models with the same text as the participants allows to identify brain areas where there is a significant correlation between the functional magnetic resonance imaging (fMRI) time series and the ones predicted by the models' artificial neurons. One intriguing finding from these studies is that they have revealed highly symmetric bilateral activation patterns, somewhat at odds with the well-known left lateralization of language processing. Here, we report analyses of an fMRI dataset where we manipulate the complexity of LLMs, testing 28 pretrained models from 8 different families, ranging from 124M to 14.2B parameters. First, we observe that the performance of models in predicting brain responses follows a scaling law, where the fit with brain activity increases linearly with the logarithm of the number of parameters of the model (and its performance on natural language processing tasks). Second, although this effect is present in both hemispheres, it is stronger in the left than in the right hemisphere. Specifically, the left-right difference in brain correlation follows a scaling law with the number of parameters. This finding reconciles computational analyses of brain activity using LLMs with the classic observation from aphasic patients showing left hemisphere dominance for language.

The paper investigates how the size of LLMs relates to brain activity patterns measured with functional magnetic resonance imaging (fMRI). It aims to reconcile a conflict between computational encoding studies, which report largely symmetric bilateral activation, and classical neuropsychological observations of left-hemisphere dominance for language.

Key Findings and Methodology

  1. Naturalistic Language Processing and fMRI:
    • The paper builds on the growing body of research where participants are scanned while listening to continuous text.
    • Researchers first used word embeddings and later LLMs to build encoding models, which are presented with the same text as the participants. This makes it possible to identify brain areas where the fMRI time series correlate significantly with the time series predicted from the models' artificial neurons.
  2. Bilateral Activation Patterns Versus Left Lateralization:
    • Previous studies with these models have shown symmetric bilateral activation patterns during language processing, which contrasts with the well-established left hemispheric dominance for language.
  3. Experiment with LLM Complexity:
    • The authors test 28 pretrained models from 8 different families, with sizes ranging from 124 million to 14.2 billion parameters.
    • They explore how the brain's response patterns change as a function of model complexity.
  4. Scaling Law and Model Performance:
    • The performance in predicting brain responses follows a scaling law: the fit between the models and brain activity increases linearly with the logarithm of the number of parameters in the model, which also correlates with the model's performance on natural language processing tasks.
    • This suggests that as LLMs grow larger and more capable, their internal representations become better predictors of the brain's responses to language.
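
The log-linear relationship described in item 4 can be sketched as an ordinary least-squares fit of brain score against the logarithm of parameter count. This is a minimal illustration, not the authors' analysis code: the function name `fit_log_linear`, the choice of base-10 logarithm, and all numeric values are assumptions, and the data are synthetic.

```python
import numpy as np


def fit_log_linear(n_params, brain_scores):
    """Least-squares fit of: score = intercept + slope * log10(n_params).

    Hypothetical helper for illustration; not taken from the paper.
    """
    log_n = np.log10(np.asarray(n_params, dtype=float))
    scores = np.asarray(brain_scores, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(log_n, scores, deg=1)
    return slope, intercept


# Synthetic, made-up values for illustration only (NOT results from the paper).
# Only the endpoints 124M and 14.2B come from the text; the rest are assumed.
n_params = [124e6, 1.5e9, 7e9, 14.2e9]
scores = [0.10 + 0.02 * np.log10(n / 124e6) for n in n_params]

slope, intercept = fit_log_linear(n_params, scores)
predicted_smallest = intercept + slope * np.log10(124e6)
```

With data generated this way, the fitted slope is the score gained per tenfold increase in parameter count. On real data one would also report a goodness-of-fit measure, since a positive slope alone does not establish a scaling law.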

Emergence of Left-Right Asymmetry

One of the most significant findings of the paper is that a left-right asymmetry in how well the models fit brain activity emerges as model complexity increases:

  • Small Models: The smallest models show no significant asymmetry; they do not fit one hemisphere's activity better than the other's.
  • Larger Models: As model size increases, models fit left hemispheric activity increasingly better than right hemispheric activity. This left-right difference in brain correlation itself follows a scaling law with the number of parameters.
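
The asymmetry trend above can be sketched by taking, for each model, the difference between its left and right hemisphere brain correlations and regressing that difference on the logarithm of parameter count. This is a hedged illustration under stated assumptions: `asymmetry_trend` is a hypothetical helper, the per-hemisphere scores are assumed to be already averaged over voxels, and all numbers are synthetic.

```python
import numpy as np


def asymmetry_trend(n_params, left_scores, right_scores):
    """Return the per-model left-minus-right difference, its slope against
    log10(n_params), and the Pearson correlation between the two.

    Hypothetical helper for illustration; not taken from the paper.
    """
    log_n = np.log10(np.asarray(n_params, dtype=float))
    diff = np.asarray(left_scores, dtype=float) - np.asarray(right_scores, dtype=float)
    slope, _intercept = np.polyfit(log_n, diff, deg=1)
    pearson_r = np.corrcoef(log_n, diff)[0, 1]
    return diff, slope, pearson_r


# Synthetic, made-up values for illustration only (NOT results from the paper).
n_params = np.array([124e6, 1.5e9, 7e9, 14.2e9])
x = np.log10(n_params / 124e6)   # zero for the smallest model
left = 0.10 + 0.03 * x           # assumed: left fit grows faster
right = 0.10 + 0.02 * x          # assumed: right fit grows more slowly

diff, slope, pearson_r = asymmetry_trend(n_params, left, right)
```

In this toy setup the smallest model has a difference of zero and the difference grows linearly with log model size, mirroring the qualitative pattern described in the two bullets.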

Implications

This research has notable implications:

  • Reconciliation with Neuropsychology: The findings provide a bridge between computational analyses using LLMs and classical neuropsychological observations (such as those derived from studies of aphasic patients) that demonstrate left hemisphere dominance for language.
  • Future Research Directions: The results suggest that LLMs of greater size and capacity may yield further insight into the neural basis of language processing, which could inform both artificial intelligence and cognitive neuroscience.

The paper shows how progress in LLMs can inform the study of human brain function, here by recovering the left lateralization of language processing from naturalistic fMRI data.
