MulCogBench: A Multi-modal Cognitive Benchmark Dataset for Evaluating Chinese and English Computational Language Models (2403.01116v1)

Published 2 Mar 2024 in cs.CL

Abstract: Pre-trained computational language models have recently made remarkable progress in harnessing language abilities that were once considered unique to humans. Their success has raised interest in whether these models represent and process language like humans. To answer this question, this paper proposes MulCogBench, a multi-modal cognitive benchmark dataset collected from native Chinese and English participants. It encompasses a variety of cognitive data, including subjective semantic ratings, eye-tracking, functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG). To assess the relationship between language models and cognitive data, we conducted a similarity-encoding analysis that decodes cognitive data based on its pattern similarity with textual embeddings. Results show that language models share significant similarities with human cognitive data, and that the similarity patterns are modulated by data modality and stimulus complexity. Specifically, context-aware models outperform context-independent models as stimulus complexity increases. The shallow layers of context-aware models align better with the high-temporal-resolution MEG signals, whereas the deeper layers show more similarity with the high-spatial-resolution fMRI data. These results indicate that language models have a delicate relationship with brain language representations. Moreover, the results for Chinese and English are highly consistent, suggesting that these findings generalize across languages.
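The similarity-encoding analysis is only named in the abstract, so the sketch below shows one common way such a pattern-similarity decoder can be set up: build a stimulus-by-stimulus similarity matrix for each representation (model embeddings and brain responses), then score pairwise decoding over held-out rows. This is a minimal illustration under stated assumptions — the synthetic data, array shapes, and the pairwise-matching scheme are stand-ins, not the paper's exact procedure.

```python
# Minimal sketch of similarity-based decoding (illustrative, not the
# paper's exact method). Synthetic stand-ins replace real stimuli.
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, emb_dim, voxel_dim = 50, 768, 200

# Rows are stimuli (e.g., words or sentences).
embeddings = rng.standard_normal((n_stimuli, emb_dim))    # hypothetical model-layer activations
brain_data = rng.standard_normal((n_stimuli, voxel_dim))  # hypothetical fMRI/MEG patterns

def similarity_pattern(X):
    """Stimulus-by-stimulus correlation matrix (each row z-scored first)."""
    Xz = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    return (Xz @ Xz.T) / X.shape[1]

sim_model = similarity_pattern(embeddings)
sim_brain = similarity_pattern(brain_data)

# Pairwise decoding: for each stimulus pair (i, j), compare the brain
# similarity rows against the correctly matched vs. swapped model rows,
# using only the columns for the remaining stimuli. Chance level is 0.5.
idx = np.arange(n_stimuli)
correct, pairs = 0, 0
for i in range(n_stimuli):
    for j in range(i + 1, n_stimuli):
        mask = (idx != i) & (idx != j)
        match = (np.corrcoef(sim_brain[i, mask], sim_model[i, mask])[0, 1]
                 + np.corrcoef(sim_brain[j, mask], sim_model[j, mask])[0, 1])
        swap = (np.corrcoef(sim_brain[i, mask], sim_model[j, mask])[0, 1]
                + np.corrcoef(sim_brain[j, mask], sim_model[i, mask])[0, 1])
        correct += match > swap
        pairs += 1

print(f"Pairwise decoding accuracy: {correct / pairs:.3f}")  # ~0.5 on random data
```

On real data, `embeddings` would come from a chosen layer of a specific language model and `brain_data` from preprocessed fMRI or MEG responses to the same stimuli; accuracy reliably above 0.5 would indicate shared representational structure between the two.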

Authors (5)
  1. Yunhao Zhang (19 papers)
  2. Xiaohan Zhang (79 papers)
  3. Chong Li (112 papers)
  4. Shaonan Wang (19 papers)
  5. Chengqing Zong (65 papers)
Citations (5)
