EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More (2407.01884v1)
Abstract: Recently, electroencephalography (EEG) signals have been actively used to decode brain activity in response to visual or textual stimuli and to achieve object recognition in multi-modal AI. Accordingly, efforts have focused on building EEG-based datasets from visual or textual single-modal stimuli. However, these datasets offer limited EEG epochs per category, and the complex semantics of the stimuli presented to participants compromise their quality and fidelity in capturing precise brain activity. Neuroscience research reveals that the relationship between visual and textual stimuli in EEG recordings provides valuable insight into the brain's ability to process and integrate multi-modal information simultaneously. Inspired by this, we propose a novel large-scale multi-modal dataset, named EIT-1M, with over 1 million EEG-image-text pairs. Our dataset is superior in its capacity to reflect brain activity during the simultaneous processing of multi-modal information. To achieve this, we collected data pairs while participants viewed alternating sequences of visual-textual stimuli drawn from 60K natural images and category-specific texts. Common semantic categories are also included to elicit stronger responses from participants' brains. Meanwhile, response-based stimulus timing and repetition across blocks and sessions ensure data diversity. To verify the effectiveness of EIT-1M, we provide an in-depth analysis of EEG data captured from multi-modal stimuli across different categories and participants, along with data quality scores for transparency. We demonstrate its validity on two tasks: 1) EEG recognition from visual or textual stimuli, or both, and 2) EEG-to-visual generation.
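To make the first benchmark task concrete, below is a minimal sketch of EEG-epoch classification in PyTorch. It assumes epochs are stored as (channels x time) float tensors paired with integer stimulus-category labels; the `EEGEpochDataset` wrapper, the tensor shapes (64 channels, 512 samples), and the compact convolutional encoder are illustrative assumptions, not the paper's actual pipeline or model.

```python
# Minimal sketch of EEG-epoch classification (task 1). Shapes, the dataset
# wrapper, and the small CNN are assumptions for illustration only.
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader


class EEGEpochDataset(Dataset):
    """Hypothetical wrapper: pairs a (C, T) EEG epoch with its stimulus category."""

    def __init__(self, epochs: torch.Tensor, labels: torch.Tensor):
        self.epochs, self.labels = epochs, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.epochs[idx], self.labels[idx]


class CompactEEGNet(nn.Module):
    """Small temporal + spatial CNN, loosely in the spirit of compact EEG encoders."""

    def __init__(self, n_channels=64, n_samples=512, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(1, 31), padding=(0, 15)),       # temporal filters
            nn.BatchNorm2d(16),
            nn.Conv2d(16, 32, kernel_size=(n_channels, 1), groups=16),    # spatial filters
            nn.BatchNorm2d(32),
            nn.ELU(),
            nn.AvgPool2d((1, 8)),
            nn.Dropout(0.25),
        )
        self.classifier = nn.Linear(32 * (n_samples // 8), n_classes)

    def forward(self, x):            # x: (B, C, T)
        x = x.unsqueeze(1)           # -> (B, 1, C, T)
        x = self.features(x)
        return self.classifier(x.flatten(1))


if __name__ == "__main__":
    # Synthetic stand-in data; replace with real EIT-1M epochs and labels.
    epochs = torch.randn(256, 64, 512)
    labels = torch.randint(0, 10, (256,))
    loader = DataLoader(EEGEpochDataset(epochs, labels), batch_size=32, shuffle=True)

    model = CompactEEGNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```

The same data interface extends naturally to the paper's multi-modal setting: the label can index the shared semantic category regardless of whether the epoch was recorded under a visual, textual, or combined stimulus.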
Authors: Xu Zheng, Ling Wang, Kanghao Chen, Yuanhuiyi Lyu, Jiazhou Zhou, Lin Wang