Do Large Language Models Mirror Cognitive Language Processing? (2402.18023v2)
Abstract: Large language models (LLMs) have demonstrated remarkable abilities in text comprehension and logical reasoning, indicating that the text representations they learn can support their language processing capabilities. In cognitive science, brain cognitive processing signals are typically used to study human language processing. It is therefore natural to ask how well the text embeddings of LLMs align with brain cognitive processing signals, and how training strategies affect this LLM-brain alignment. In this paper, we employ Representational Similarity Analysis (RSA) to measure the alignment between 23 mainstream LLMs and fMRI signals of the brain, evaluating how effectively LLMs simulate cognitive language processing. We empirically investigate the impact of various factors (e.g., pre-training data size, model scaling, alignment training, and prompts) on LLM-brain alignment. Experimental results indicate that pre-training data size and model scaling are positively correlated with LLM-brain similarity, and that alignment training can significantly improve it. Explicit prompts strengthen the consistency of LLMs with brain cognitive language processing, whereas nonsensical noisy prompts may attenuate it. Moreover, performance on a wide range of LLM evaluations (e.g., MMLU, Chatbot Arena) is highly correlated with LLM-brain similarity.
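The RSA procedure the abstract describes can be sketched in a few lines: build a representational dissimilarity matrix (RDM) from each system's responses to the same stimuli, then correlate the two RDMs. This is a minimal illustration of the general technique, not the paper's exact pipeline; the distance and correlation choices (1 − Pearson for RDMs, Spearman between RDMs) are common defaults assumed here.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(activations: np.ndarray) -> np.ndarray:
    """Condensed RDM: pairwise (1 - Pearson r) between stimulus vectors.

    `activations` has shape (n_stimuli, n_features), e.g. LLM embeddings
    or fMRI voxel responses for the same n_stimuli items.
    """
    return pdist(activations, metric="correlation")

def rsa_similarity(llm_embeddings: np.ndarray, fmri_signals: np.ndarray) -> float:
    """Spearman correlation between the two systems' RDMs.

    Feature dimensions may differ; only the number of stimuli must match,
    since RSA compares stimulus-by-stimulus geometry, not raw features.
    """
    rho, _ = spearmanr(rdm(llm_embeddings), rdm(fmri_signals))
    return float(rho)
```

Because RSA operates on dissimilarity structure rather than raw features, it sidesteps the dimensionality mismatch between model embeddings and voxel responses, which is what makes it a natural bridge between LLMs and fMRI data.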
Authors: Yuqi Ren, Renren Jin, Tongxuan Zhang, Deyi Xiong