
Instruction-tuning Aligns LLMs to the Human Brain (2312.00575v2)

Published 1 Dec 2023 in cs.CL

Abstract: Instruction-tuning is a widely adopted finetuning method that enables LLMs to generate output that more closely resembles human responses. However, no studies have shown that instruction-tuning actually teaches LLMs to process language in a similar manner as humans. We investigate the effect of instruction-tuning on aligning LLM and human language processing mechanisms in two ways: (1) brain alignment, the similarity of LLM internal representations to neural activity in the human language system, and (2) behavioral alignment, the similarity of LLM and human behavior on a reading task. We assess 25 vanilla and instruction-tuned LLMs on three datasets involving humans reading naturalistic stories and sentences, and find that instruction-tuning generally enhances brain alignment (~6%), but has no similar effect on behavioral alignment. To identify factors underlying this improvement in brain alignment, we compute correlations between brain alignment and various LLM properties, such as model size, problem-solving, and world knowledge understanding. Notably, we find a strong positive correlation between brain alignment and model size (r = 0.95), as well as performance on tasks requiring world knowledge (r = 0.81). Our results demonstrate that instruction-tuning LLMs improves both world knowledge representations and brain alignment, suggesting that the mechanisms that encode world knowledge in LLMs also improve representational alignment to the human brain.

The paper by Khai Loong Aw et al. investigates the effect of instruction-tuning on the representational similarity between LLMs and the human brain's language system. The primary question is whether instruction-tuning, a prevalent fine-tuning approach for LLMs, enhances model alignment with human neural activity and behavior.

The authors evaluate LLM-human similarity on two fronts: brain alignment and behavioral alignment. Brain alignment measures the correspondence between LLM internal representations and neural activity within the human language system, whereas behavioral alignment measures the similarity between LLM and human behavior on a reading task.

Several key findings are documented in the paper:

  1. Improvement in Brain Alignment: Instruction-tuning enhances brain alignment by an average of 6.2% across the evaluated datasets, as measured by the Brain-Score metric. Notably, brain alignment correlates strongly with both model size (r = 0.95) and performance on tasks requiring world knowledge (r = 0.81), suggesting that the mechanisms that encode world knowledge in LLMs also improve representational alignment to human brain activity.
  2. Model Properties and Alignment: The research highlights that world knowledge and model size are the properties most strongly correlated with brain alignment. The authors compute correlations using performance scores from two benchmarks: Massive Multitask Language Understanding (MMLU) for world knowledge and BIG-Bench Hard (BBH) for problem-solving ability. The results indicate that broad world knowledge is a key factor in aligning LLM representations with human brain activity.
  3. Contrasting Behavioral Alignment: Interestingly, instruction-tuning yields no comparable improvement in behavioral alignment, which is assessed by comparing LLM perplexity with human reading times (see the sketch after this list). While instruction-tuning brings internal model representations closer to brain activity, this does not translate into closer alignment on behavioral measures.

The implications of this research are twofold. From an NLP perspective, the improvement in brain alignment through instruction-tuning suggests that neural-alignment evaluations could serve as a complementary signal for developing models with stronger performance on complex, knowledge-dependent tasks. For neuroscience, the paper offers insight into how world knowledge may shape the neural activity patterns underlying language comprehension and representation.

Several limitations and avenues for future work are acknowledged. Evaluating 25 models across multiple datasets and benchmarks is computationally demanding. Additionally, LLM-human alignment was examined almost exclusively through language-input (reading) tasks; future work could extend to more diverse cognitive tasks to better characterize the correspondence between human neural processes and LLM representations.

In conclusion, the paper makes a significant contribution by demonstrating that instruction-tuning enhances the representational alignment of LLMs with human brain activity, primarily through improved encoding of world knowledge. This work paves a meaningful path for future research at the intersection of LLMs and human neuroscience.

Authors
  1. Khai Loong Aw
  2. Syrielle Montariol
  3. Badr AlKhamissi
  4. Martin Schrimpf
  5. Antoine Bosselut