Deep Representation Learning for Open Vocabulary Electroencephalography-to-Text Decoding (2312.09430v1)

Published 15 Dec 2023 in eess.SP, cs.CL, cs.HC, and cs.LG

Abstract: Previous research has demonstrated the potential of using pre-trained LLMs for decoding open vocabulary Electroencephalography (EEG) signals captured through a non-invasive Brain-Computer Interface (BCI). However, the impact of embedding EEG signals in the context of LLMs and the effect of subjectivity remain unexplored, leading to uncertainty about the best approach to enhance decoding performance. Additionally, the evaluation metrics currently used to assess decoding effectiveness are predominantly syntactic and provide little insight into how comprehensible the decoded output is to a human reader. We present an end-to-end deep learning framework for non-invasive brain recordings that brings modern representation learning approaches to neuroscience. Our proposal introduces the following innovations: 1) an end-to-end deep learning architecture for open vocabulary EEG decoding, incorporating a subject-dependent representation learning module for raw EEG encoding, a BART LLM, and a GPT-4 sentence refinement module; 2) a more comprehensive sentence-level evaluation metric based on BERTScore; 3) an ablation study that analyses the contribution of each module within our proposal, providing valuable insights for future research. We evaluate our approach on two publicly available datasets, ZuCo v1.0 and v2.0, comprising EEG recordings of 30 subjects engaged in natural reading tasks. Our model achieves a BLEU-1 score of 42.75%, a ROUGE-1-F of 33.28%, and a BERTScore-F of 53.86%, outperforming previous state-of-the-art methods by 3.38%, 8.43%, and 6.31%, respectively.
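
All three reported metrics are standard and reproducible with off-the-shelf tooling. The sketch below is a minimal illustration (not the authors' evaluation code) of scoring a single decoded sentence against its reference, assuming the nltk, rouge-score, and bert-score Python packages; the two example sentences are invented for this illustration.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer
from bert_score import score as bert_score

reference = "he is a prominent member of the senate"  # ground-truth sentence (hypothetical)
decoded = "he was a senior member of the senate"      # decoded model output (hypothetical)

# BLEU-1: smoothed unigram precision, the n=1 case of standard MT BLEU.
bleu1 = sentence_bleu(
    [reference.split()], decoded.split(),
    weights=(1.0, 0.0, 0.0, 0.0),
    smoothing_function=SmoothingFunction().method1,
)

# ROUGE-1 F-measure: unigram overlap between decoded text and reference.
rouge1_f = rouge_scorer.RougeScorer(["rouge1"]).score(reference, decoded)["rouge1"].fmeasure

# BERTScore F1: similarity of contextual token embeddings; the sentence-level
# semantic signal the paper argues should complement BLEU and ROUGE.
_, _, f1 = bert_score([decoded], [reference], lang="en")

print(f"BLEU-1={bleu1:.3f}  ROUGE-1-F={rouge1_f:.3f}  BERTScore-F={f1.item():.3f}")
```

Because BLEU and ROUGE count only surface n-gram overlap, a decoded sentence that paraphrases its reference can score poorly on both while remaining perfectly readable; BERTScore is far less sensitive to paraphrase, which is what motivates the paper's BERTScore-based sentence-level evaluation.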

References (40)
  1. Speech synthesis from neural decoding of spoken sentences. Nature, 568(7753): 493–498.
  2. Layer normalization. arXiv preprint arXiv:1607.06450.
  3. Imagined speech classification with EEG signals for silent communication: a preliminary investigation into synthetic telepathy. In 2010 4th International Conference on Bioinformatics and Biomedical Engineering, 1–4. IEEE.
  4. Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech. Current Biology, 28(5): 803–809.
  5. Brains and algorithms partially converge in natural language processing. Communications Biology, 5(1): 134.
  6. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
  7. Decoding imagined and spoken phrases from non-invasive neural (MEG) signals. Frontiers in Neuroscience, 14: 290.
  8. Decoding speech perception from non-invasive brain recordings. Nature Machine Intelligence, 1–11.
  9. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  10. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
  11. DeWave: Discrete EEG Waves Encoding for Brain Dynamics to Text Translation. arXiv preprint arXiv:2309.14030.
  12. Semantic-aware Contrastive Learning for Electroencephalography-to-Text Generation with Curriculum Learning. arXiv preprint arXiv:2301.09237.
  13. Does the brain represent words? An evaluation of brain decoding studies of language understanding. arXiv preprint arXiv:1806.00591.
  14. Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415.
  15. ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading. Scientific Data, 5(1): 1–13.
  16. ZuCo 2.0: A dataset of physiological recordings during natural reading and annotation. arXiv preprint arXiv:1912.00903.
  17. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature, 532(7600): 453–458.
  18. Virtual typing by people with tetraplegia using a self-calibrating intracortical brain-computer interface. Science Translational Medicine, 7(313): 313ra179–313ra179.
  19. Low-dimensional subject representation-based transfer learning in EEG decoding. IEEE Journal of Biomedical and Health Informatics, 25(6): 1915–1925.
  20. A high performance spelling system based on EEG-EOG signals with visual feedback. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(7): 1443–1459.
  21. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
  22. Lin, C.-Y. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, 74–81.
  23. Machine translation of cortical activity to text with an encoder–decoder framework. Nature Neuroscience, 23(4): 575–582.
  24. Neuroprosthesis for decoding speech in a paralyzed person with anarthria. New England Journal of Medicine, 385(3): 217–227.
  25. Thinking out loud, an open-access EEG-based BCI dataset for inner speech recognition. Scientific Data, 9(1): 52.
  26. OpenAI. 2023. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774.
  27. Decoding covert speech from EEG-a comprehensive review. Frontiers in Neuroscience, 15: 392.
  28. High performance communication by people with paralysis using an intracortical brain-computer interface. eLife, 6: e18554.
  29. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318.
  30. Toward a universal decoder of linguistic meaning from brain activation. Nature Communications, 9(1): 963.
  31. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research, 21(1): 5485–5551.
  32. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 1631–1642.
  33. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1): 1929–1958.
  34. Semantic reconstruction of continuous language from non-invasive brain recordings. Nature Neuroscience, 1–9.
  35. Attention is all you need. Advances in Neural Information Processing Systems, 30.
  36. BrainBERT: Self-supervised representation learning for intracranial recordings. arXiv preprint arXiv:2302.14367.
  37. Open vocabulary electroencephalography-to-text decoding and zero-shot sentiment classification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, 5350–5358.
  38. High-performance brain-to-text communication via handwriting. Nature, 593(7858): 249–254.
  39. A high-performance speech neuroprosthesis. bioRxiv, 2023–01.
  40. BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675.
Authors (3)
  1. Hamza Amrani (3 papers)
  2. Daniela Micucci (35 papers)
  3. Paolo Napoletano (30 papers)
Citations (1)
