Language Reconstruction with Brain Predictive Coding from fMRI Data (2405.11597v1)
Abstract: Many recent studies have shown that the perception of speech can be decoded from brain signals and subsequently reconstructed as continuous language. However, there is little neurological basis for how the semantic information embedded within brain signals can be used more effectively to guide language reconstruction. The theory of predictive coding suggests that the human brain naturally engages in continuously predicting future word representations that span multiple timescales. This implies that the decoding of brain signals could potentially be associated with a predictable future. To explore the predictive coding theory within the context of language reconstruction, this paper proposes a novel model, PredFT, for jointly modeling neural decoding and brain prediction. It consists of a main decoding network for language reconstruction and a side network for predictive coding. The side network obtains a brain predictive coding representation from related brain regions of interest with a multi-head self-attention module. This representation is fused into the main decoding network with cross-attention to facilitate the LLMs' generation process. Experiments are conducted on Narratives, the largest naturalistic language comprehension fMRI dataset. PredFT achieves current state-of-the-art decoding performance with a maximum BLEU-1 score of 27.8%.
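The fusion step described in the abstract, where the side network's predictive coding representation is injected into the main decoder through cross-attention, can be illustrated with a minimal sketch. This is not the PredFT implementation: it uses a single attention head, omits the learned query/key/value projections, and adds a residual connection as an assumption. The names `cross_attention`, `decoder_states`, and `predictive_states` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: main-decoder states, shape (T_dec, d)
    keys, values: side-network states, shape (T_side, d)
    Returns the fused decoder states (T_dec, d) and the attention weights.
    """
    d = len(queries[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this decoder state to every side-network state.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted average of the side-network values.
        ctx = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(d)]
        # Assumed residual connection: add the attended context to the decoder state.
        fused.append([qi + ci for qi, ci in zip(q, ctx)])
        weights.append(w)
    return fused, weights


random.seed(0)
d = 4  # toy hidden size
decoder_states = [[random.gauss(0, 1) for _ in range(d)] for _ in range(3)]
predictive_states = [[random.gauss(0, 1) for _ in range(d)] for _ in range(5)]

fused, weights = cross_attention(decoder_states, predictive_states, predictive_states)
print(len(fused), len(fused[0]))  # prints: 3 4
```

Each decoder position receives a convex combination of the side-network states, so every row of `weights` sums to 1. A full model would presumably use multiple heads and learned projections, but the data flow from side network to decoder is the same.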