- The paper introduces a novel EEG-to-text decoding model using a modified BART architecture with an efficient one-step training approach.
- It aligns word-level EEG features with eye-tracking fixations to form rich, multidimensional embeddings, which a multi-layer Transformer encoder maps into BART's input space.
- The study achieves effective zero-shot sentiment classification with adapted BERT/RoBERTa models, paving the way for advanced assistive BCIs.
Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification
The paper reports on advancements in decoding electroencephalography (EEG) data into natural language text and performing sentiment classification in a zero-shot setting. Rooted in the exploration of brain-computer interface (BCI) technology, the authors propose a sophisticated pipeline capable of interpreting EEG signals as text, aiming to enhance communication effectiveness for individuals with speech impairments.
Methodology
The authors introduce an EEG-to-text sequence-to-sequence model for open-vocabulary decoding, built on the pre-trained BART architecture with modifications tailored to EEG data. Working with the ZuCo dataset, they extract word-level EEG features in eight frequency bands and align them with the corresponding eye-tracking fixations; this alignment yields the rich, multidimensional embeddings on which the decoding task depends.
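To make the feature pipeline concrete, here is a minimal sketch of the word-level feature construction. The eight band names match ZuCo's frequency bands and the 105-dimensions-per-band figure (840-dim word vectors) follows the paper's reported setup, but the data layout and function names are illustrative assumptions, not the authors' code.

```python
import numpy as np

# The eight ZuCo frequency bands; each carries a per-word feature vector.
BANDS = ["theta1", "theta2", "alpha1", "alpha2",
         "beta1", "beta2", "gamma1", "gamma2"]

def word_embedding(band_features: dict) -> np.ndarray:
    """Concatenate the eight per-band vectors for one fixated word."""
    return np.concatenate([band_features[b] for b in BANDS])  # shape (840,)

def sentence_embeddings(words: list) -> np.ndarray:
    """Stack word vectors into a (num_words, 840) sequence.

    Only words the reader actually fixated have EEG features, so the
    eye-tracking record determines which words enter the sequence --
    this is the EEG/eye-tracking alignment step.
    """
    return np.stack([word_embedding(w) for w in words if w is not None])
```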
The model's architecture adds a multi-layer Transformer encoder that learns correlations within the EEG patterns before they reach BART, promoting coherent text generation. A notable departure from baseline methodologies is one-step training: the EEG encoder and BART are optimized jointly in a single stage, which proved as effective as the traditional two-step approach while significantly reducing computational cost.
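A minimal PyTorch sketch of this arrangement follows; the layer count, head count, and projection layer are illustrative assumptions rather than the paper's exact hyperparameters.

```python
import torch.nn as nn
from transformers import BartForConditionalGeneration

class EEGToText(nn.Module):
    """Extra Transformer encoder over EEG word embeddings, projected into
    BART's embedding space. Depth and widths here are assumptions."""

    def __init__(self, eeg_dim=840, layers=6, heads=8):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(
            d_model=eeg_dim, nhead=heads, batch_first=True)
        self.eeg_encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.bart = BartForConditionalGeneration.from_pretrained(
            "facebook/bart-large")
        self.proj = nn.Linear(eeg_dim, self.bart.config.d_model)

    def forward(self, eeg, attention_mask, labels=None):
        # Model correlations within the EEG sequence, then hand the result
        # to BART as if it were a sequence of token embeddings.
        x = self.proj(self.eeg_encoder(eeg))
        return self.bart(inputs_embeds=x,
                         attention_mask=attention_mask,
                         labels=labels)
```

One-step training falls out of this structure: the cross-entropy loss returned by `forward` when `labels` are supplied backpropagates through BART, the projection, and the EEG encoder together, so no separate pre-training stage for the EEG encoder is needed.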
In parallel, for zero-shot EEG-based sentiment classification, the authors pair the decoder with text-based classifiers, BERT and RoBERTa models fine-tuned on sentiment-labeled corpora such as the Stanford Sentiment Treebank. Fusing the two components allows sentiment to be classified without any labeled EEG sentiment data.
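A minimal sketch of how that fusion can work, assuming the `EEGToText` module above and one public SST-2 checkpoint (the authors' exact classifier and decoding settings may differ):

```python
import torch
from transformers import BartTokenizer, pipeline

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
# Any text classifier fine-tuned on SST works here; this checkpoint is
# one public example, not necessarily the authors' choice.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english")

@torch.no_grad()
def eeg_sentiment(model, eeg, attention_mask):
    # 1) Decode the EEG sequence into open-vocabulary text.
    x = model.proj(model.eeg_encoder(eeg))
    token_ids = model.bart.generate(inputs_embeds=x,
                                    attention_mask=attention_mask,
                                    max_length=64)
    texts = tokenizer.batch_decode(token_ids, skip_special_tokens=True)
    # 2) Classify the generated text; no labeled EEG sentiment data needed.
    return classifier(texts)
```

The zero-shot property comes from the second step: the classifier only ever sees text, so all sentiment supervision stays on the text side.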
Key Results
These models performed encouragingly: the EEG-to-text decoder handles sentences unseen during training, validating the open-vocabulary claim, and the zero-shot sentiment classification results suggest that EEG signals encapsulate sentiment information that can be decoded effectively when paired with robust language models.
Implications and Future Directions
This research carries significant implications for both theory and practice. Theoretically, it deepens our understanding of the complex interactions between neural activity and natural language processing, pushing the envelope in artificial intelligence-driven BCIs. Practically, it opens pathways toward more sophisticated assistive technologies, potentially improving quality of life for individuals with communication limitations.
Moving forward, the exploration of more advanced deep learning architectures, such as those involving multimodal neural networks, could further enhance model capabilities. Moreover, the approach can be expanded to encompass a broader array of cognitive tasks, enriching the interpretability and usability of BCIs beyond basic communication.
In conclusion, this paper contributes significantly to the niche of language-brain interface research through novel methodological applications and insightful findings, advocating for ongoing advancements and interdisciplinary applications in AI-driven BCIs. The framework and outcomes underscore the potential to decode complex brain signals and generate meaningful natural language outputs, heralding new frontiers in human-computer interaction.