A Temporal-Spectral Fusion Transformer with Subject-Specific Adapter for Enhancing RSVP-BCI Decoding (2401.06340v2)
Abstract: The Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interface (BCI) is an efficient technology for target retrieval using electroencephalography (EEG) signals. Improving the performance of traditional decoding methods requires a substantial amount of training data from each new test subject, which lengthens the preparation time of BCI systems. Several studies introduce data from existing subjects to reduce this dependence on new-subject data, but their adversarial-learning-based optimization over large amounts of data increases training time during the preparation procedure. Moreover, most previous methods focus only on single-view information of EEG signals and ignore complementary information from other views that may further improve performance. To enhance decoding performance while reducing preparation time, we propose a Temporal-Spectral fusion transformer with Subject-specific Adapter (TSformer-SA). Specifically, a cross-view interaction module is proposed to facilitate information transfer and extract common representations across two-view features extracted from EEG temporal signals and spectrogram images. An attention-based fusion module then fuses the features of the two views to obtain comprehensive discriminative features for classification. Furthermore, a multi-view consistency loss is proposed to maximize the feature similarity between the two views of the same EEG signal. Finally, we propose a subject-specific adapter to rapidly transfer the knowledge of a model trained on data from existing subjects to the decoding of data from new subjects. Experimental results show that TSformer-SA significantly outperforms comparison methods and achieves outstanding performance with limited training data from new subjects, facilitating efficient decoding and rapid deployment of BCI systems in practical use.
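The abstract describes three mechanisms: cross-view attention between temporal and spectrogram features, a consistency loss that pulls together the two views of the same trial, and a lightweight subject-specific adapter for fast transfer to new subjects. Below is a minimal PyTorch sketch of these ideas, assuming standard multi-head cross-attention, an InfoNCE-style consistency objective, and a bottleneck-style adapter; all class names, dimensions, and loss details are illustrative assumptions and do not reproduce the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossViewBlock(nn.Module):
    """Hypothetical cross-view interaction: each view queries the other view's tokens."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn_t2s = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_s2t = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_t = nn.LayerNorm(dim)
        self.norm_s = nn.LayerNorm(dim)

    def forward(self, temp_tokens, spec_tokens):
        # Temporal tokens attend to spectrogram tokens, and vice versa.
        t, _ = self.attn_t2s(temp_tokens, spec_tokens, spec_tokens)
        s, _ = self.attn_s2t(spec_tokens, temp_tokens, temp_tokens)
        return self.norm_t(temp_tokens + t), self.norm_s(spec_tokens + s)

def multi_view_consistency_loss(z_t, z_s, temperature=0.1):
    """InfoNCE-style objective (assumed form): the two views of the same trial are positives."""
    z_t = F.normalize(z_t, dim=-1)
    z_s = F.normalize(z_s, dim=-1)
    logits = z_t @ z_s.T / temperature                      # (B, B) cross-view similarities
    targets = torch.arange(z_t.size(0), device=z_t.device)  # diagonal entries are positives
    return F.cross_entropy(logits, targets)

class SubjectAdapter(nn.Module):
    """Residual bottleneck adapter; only these parameters would be tuned per new subject."""
    def __init__(self, dim=64, bottleneck=16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(F.relu(self.down(x)))

# Toy usage on random "EEG" tokens: 8 trials, 32 tokens per view, 64-dim features.
temp_tokens = torch.randn(8, 32, 64)   # temporal-view tokens
spec_tokens = torch.randn(8, 32, 64)   # spectrogram-view tokens
block = CrossViewBlock()
adapter = SubjectAdapter()
t, s = block(temp_tokens, spec_tokens)
t = adapter(t)                          # subject-specific adaptation of the fused path
loss = multi_view_consistency_loss(t.mean(dim=1), s.mean(dim=1))
print(loss.item())
```

In a sketch like this, only the adapter parameters would be updated when fine-tuning on a new subject while the backbone stays frozen, which keeps per-subject training cost low; the paper's actual adapter design, fusion module, and training schedule may differ.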
Authors: Xujin Li, Wei Wei, Shuang Qiu, Huiguang He