
Eye-BCI Multimodal Dataset

Updated 28 July 2025
  • The Eye-BCI multimodal dataset is a comprehensive collection combining high-density EEG/EOG/EMG, precise eye-tracking, and high-speed video to capture ocular movements and neural signals.
  • It employs tightly time-locked recordings and robust protocols across MI, ME, SSVEP, and P300 paradigms, enabling detailed analysis of ocular artifacts and BCI performance.
  • Methodological innovations include segmented, meta-annotated trials and adaptive blink-correction techniques that significantly improve signal quality and classification accuracy.

The Eye-BCI Multimodal Dataset refers to a new class of large-scale, simultaneously acquired, multimodal datasets focused on the integration of electroencephalography (EEG) and diverse eye-related signals (including eye-tracking, ocular EMG/EOG, high-speed eye-region video, and auxiliary behavioral or physiological measures), specifically curated for advancing brain–computer interface (BCI) research. Such datasets are structured to enable the multifactor analysis of eye-related movements, their artifacts in neural signals, and their value as intentional channels in BCI systems, offering foundational resources for hybrid BCIs, artifact suppression, robust classification, and detailed cognitive or behavioral analysis. The most recent and distinctive instantiation of this concept—explicitly titled the "Eye-BCI multimodal dataset"—provides simultaneous, highly synchronized recordings across 62–65 EEG/EOG/EMG channels, high-frame-rate eye-tracking, high-speed eyelid video, and rich psychophysiological subject metadata, across multiple canonical BCI paradigms including motor imagery, motor execution, steady-state visually evoked potentials (SSVEP), and P300-based spellers (Guttmann-Flury et al., 9 Jun 2025, Guttmann-Flury et al., 23 Jul 2025).

1. Dataset Structure and Modalities

The Eye-BCI multimodal dataset encompasses three principal signal domains (Guttmann-Flury et al., 9 Jun 2025):

  • EEG/EOG/EMG: High-density scalp EEG (62–65 electrodes, extended 10/20 distribution), augmented by dedicated EOG and EMG channels for ocular and facial muscle activity, sampled at up to 1000 Hz.
  • Eye-tracking: Precise, simultaneous acquisition of gaze coordinates, pupil diameter, and other oculometric features at 300 Hz using a Tobii TX300 system.
  • High-speed video: Synchronous monocular (left-eye) recordings at 150–300 fps with a Phantom Miro M310 camera to resolve eyelid motion and blink dynamics.

Additional non-signal metadata are systematically gathered: demographic and anthropometric variables (age, handedness, cranial and ocular dimensions), state assessments (alertness, sleep, caffeine/hunger status), and fine-grained facial landmarks (automated extraction of canthus distances, eye aperture metrics, and nose–chin lengths from subject photographs).

A total of over 46 hours of multimodal data from 31 healthy subjects (63 sessions) is included, yielding thousands of time-locked, artifact-annotated EEG–eye-tracking–video trial triplets, with aligned trial triggers, behavioral labels, and blink occurrence markers.
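
The official loaders expose these data in their own structures; purely as an illustration of the "trial triplet" concept, a container along the following lines could hold one aligned trial (all field names and shapes here are hypothetical, not the dataset's actual schema):

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class TrialTriplet:
    """Hypothetical container for one time-locked EEG–eye-tracking–video trial."""
    subject_id: str
    session: int
    paradigm: str                  # "MI", "ME", "SSVEP", or "P300"
    label: str                     # behavioral label, e.g. "left_grasp"
    eeg: np.ndarray                # (n_channels, n_samples), 62–65 channels at 1000 Hz
    gaze: np.ndarray               # (n_gaze_samples, n_features), Tobii TX300 at 300 Hz
    video: np.ndarray              # (n_frames, height, width), Phantom Miro M310 eye crops
    blink_onsets_s: list[float] = field(default_factory=list)  # cross-modal blink markers
```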

2. Experimental Paradigms and Protocol

The dataset comprises four canonical BCI paradigms, selected for their systematic modulation of eye-related activity and neural signals (Guttmann-Flury et al., 9 Jun 2025):

  • Motor Imagery (MI): Subjects imagine grasping with either hand (20 left, 20 right per session); intended for analyzing non-executed movement-related neural/ocular dynamics.
  • Motor Execution (ME): Subjects physically perform grasping actions, providing ground truth for sensorimotor engagement and ocular artifact manifestation.
  • SSVEP: Sustained visual fixation on flickering checkerboards (10–13 Hz, spatially distributed), critical for assessing the impact of fixational eye movement and blinks on evoked potentials.
  • P300 Speller: Visual oddball paradigm targeting the P3b event-related potential, with dynamic visualization and eye-gaze/fixation–event correspondence.

Each session includes 40 trials each for MI, ME, and SSVEP, and over 5600 trials for the P300 tasks (4-letter and 5-letter word paradigms). Synchronization markers embedded in each modality’s time series ensure cross-modal alignment at sub-millisecond precision.
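
The synchronization mechanics are hardware-specific, but the core idea of mapping every modality onto a common time base via shared triggers can be sketched in a few lines. The example below fits a linear clock model (offset plus drift) between eye-tracker and EEG trigger timestamps; all variable values are made up for illustration:

```python
import numpy as np

def align_to_eeg_clock(stream_triggers, eeg_triggers, stream_times):
    """Re-express a modality's timestamps on the EEG clock by fitting
    t_eeg ≈ a * t_stream + b to the shared trigger events."""
    a, b = np.polyfit(stream_triggers, eeg_triggers, deg=1)
    return a * np.asarray(stream_times) + b

# Illustrative values: three shared triggers seen by both clocks
eeg_trig = np.array([1.000, 5.002, 9.001])       # seconds, EEG clock
et_trig  = np.array([0.400, 4.401, 8.399])       # seconds, eye-tracker clock
et_times = np.arange(0.0, 10.0, 1.0 / 300.0)     # raw 300 Hz eye-tracker timestamps
et_on_eeg_clock = align_to_eeg_clock(et_trig, eeg_trig, et_times)
```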

3. Technical Innovations and Data Quality Control

Several features distinguish the Eye-BCI multimodal dataset’s technical rigor:

  • Modal Synchronization: Tight time-locking via hardware and software triggers between EEG systems, high-speed cameras, and eye-trackers ensures that physiological events (e.g., blinks, saccades) are mapped across all signals.
  • Segmented and Meta-annotated Data: Trials are pre-indexed for cue onset, behavioral target, and event markers (such as blinks, as computed from both eye-tracking and video), enabling trial-level or event-driven analyses.
  • SNR Characterization: The dataset includes precomputed SNR maps and annotated raw/measured segments for both "signal" (e.g., intended MI/ME/ERP events) and "noise" intervals, adopting SNR estimation formulas such as

$$
\mathrm{SNR}_T = \frac{\sum_{i=1}^{N} A_{ts} \cdot x^2(t)}{\sum_{i=1}^{N} A_{tn} \cdot x^2(t)}
$$

where $A_{ts}$ and $A_{tn}$ are the durations of the signal and noise segments, respectively (a minimal computational sketch of this estimator follows the list below).

  • Cross-modal blink annotation: Blinks are detected and time-stamped using frontopolar EEG/EOG, eye-tracking, and direct eyelid video, providing a robust basis for artifact modeling and algorithm benchmarking.
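
Read literally, the SNR formula above weights the power of each annotated segment by its duration. A minimal sketch of that reading, assuming interval annotations in seconds (the dataset's precomputed SNR maps may use a different normalization):

```python
import numpy as np

def snr_t(x, fs, signal_intervals, noise_intervals):
    """Duration-weighted SNR over annotated segments (one EEG channel).

    x                : 1-D signal array
    fs               : sampling rate in Hz
    signal_intervals : list of (start_s, end_s) "signal" segments, e.g. MI/ME/ERP events
    noise_intervals  : list of (start_s, end_s) "noise" segments
    """
    def weighted_power(intervals):
        total = 0.0
        for t0, t1 in intervals:
            seg = x[int(t0 * fs):int(t1 * fs)]
            total += (t1 - t0) * np.sum(seg ** 2)   # A_t · Σ x²(t)
        return total

    return weighted_power(signal_intervals) / weighted_power(noise_intervals)
```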

4. Analytical and Algorithmic Applications

The dataset’s breadth enables detailed analyses and algorithm development along several axes (Guttmann-Flury et al., 23 Jul 2025):

  • Ocular Artifact Suppression: Non-cerebral artifacts, most prominently eye blinks and eye movements, are major sources of BCI error. The Eye-BCI dataset allows artifact suppression algorithms (such as ICA, ASR, and novel blink-driven methods) to be benchmarked directly against ground-truth blink annotations. The ABCD (Adaptive Blink-Correction and De-Drifting) algorithm, developed and validated on this dataset, outperforms standard artifact rejection methods, achieving 93.81% MI classification accuracy (vs. ICA: 79.29%, ASR: 84.05%) by modeling blink propagation across the scalp, exploiting spatial, temporal, and amplitude-validation criteria, and adaptively removing noisy channels (Guttmann-Flury et al., 23 Jul 2025); a simplified sketch of the blink-propagation idea follows this list.
  • Channel Quality Assessment and Selection: By leveraging the spatial propagation of blinks as a natural test signal, the ABCD method systematically detects “bad” channels, characterized by low similarity to neighboring channels’ blink artifacts or by pathological propagation (e.g., bridging or wiring faults), and removes them prior to further analysis, demonstrably increasing SNR and improving classification robustness (Guttmann-Flury et al., 23 Jul 2025).
  • Cross-paradigm Classification: With identical participants performing varied paradigms, the dataset enables evaluation of algorithm and model robustness not only within-paradigm but across tasks; for example, analyzing whether blink-robust feature extraction for MI also enhances SSVEP or P300 decoding.
  • Behavioral and Cognitive Analytics: Rich subject metadata and high-resolution, time-aligned multimodal signals support explorations into the relationship between eye behavior (blink rate, saccadic dynamics, fixation patterns), alertness/fatigue state, and BCI performance.
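
The full ABCD algorithm is specified in the cited papers; the fragment below illustrates only the underlying intuition of treating blinks as a natural test signal for channel quality. It averages blink-locked epochs per channel and flags channels whose blink template correlates poorly with a frontopolar reference; the window and threshold are illustrative, not the authors' values:

```python
import numpy as np

def flag_channels_by_blink_propagation(eeg, fs, blink_onsets_s,
                                       ref_ch=0, win_s=(-0.1, 0.3), min_corr=0.4):
    """Simplified blink-propagation quality probe (illustrative, not ABCD itself).

    eeg            : (n_channels, n_samples) array
    blink_onsets_s : blink onset times in seconds (e.g., from video/eye-tracking)
    ref_ch         : index of a frontopolar channel used as the blink reference
    """
    lo, hi = int(win_s[0] * fs), int(win_s[1] * fs)
    starts = [int(t * fs) + lo for t in blink_onsets_s]
    epochs = np.stack([eeg[:, s:s + (hi - lo)] for s in starts
                       if s >= 0 and s + (hi - lo) <= eeg.shape[1]])
    template = epochs.mean(axis=0)                 # blink-locked average per channel
    ref = template[ref_ch]
    corr = np.array([np.corrcoef(ch, ref)[0, 1] for ch in template])
    return np.where(np.abs(corr) < min_corr)[0]   # candidate "bad" channels
```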

5. Impact, Accessibility, and Use Cases

The Eye-BCI dataset is positioned as a benchmark resource for the BCI and neurotechnology research communities (Guttmann-Flury et al., 9 Jun 2025):

  • Artifact Correction R&D: It supports the development and validation of artifact removal methods (including new blink-propagation-based approaches), as well as the study of how different BCI tasks modulate or are affected by ocular artifacts.
  • Hybrid BCI/Adaptive Interfaces: The dataset’s coverage of both intentional oculomotor phenomena and spontaneous/blink-induced artifacts enables research on fully hybrid EEG–eye-based BCIs, closed-loop gaze/neural adaptive systems, and artifact-aware, context-sensitive signal processing.
  • General Data Science and Behavioral Modeling: Auxiliary data (demographics, facial geometry, state metadata) and open-access availability facilitate machine learning and statistical studies on inter-individual and intra-individual variability, multimodal biometrics, and human–computer interaction analytics.

The dataset is released under CC0 licensing for unrestricted non-commercial research use and is accompanied by open data loading utilities in Python, Matlab, and R, with code repositories available via GitHub (Guttmann-Flury et al., 9 Jun 2025). All recordings are anonymized and subject to ethical guidelines ensuring subject confidentiality.
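
The repository's own loaders are the authoritative entry point, and their API is not reproduced here. As a generic fallback, EEG exported in a standard interchange format could be inspected with MNE-Python roughly as follows (the file name and EDF format are assumptions for illustration):

```python
import mne

# Hypothetical file name/format; prefer the official loaders from the GitHub repo.
raw = mne.io.read_raw_edf("sub-01_ses-01_task-MI_eeg.edf", preload=True)
raw.filter(l_freq=1.0, h_freq=40.0)     # simple band-pass before inspection
events = mne.find_events(raw)           # trial triggers, if stored on a stim channel
print(raw.info)
```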

6. Methodological Significance and Future Directions

By combining EEG, eye-tracking, and high-speed video at scale and with synchronized precision, the Eye-BCI multimodal dataset defines a reference standard for ocular artifact research and cross-modal BCI algorithm development. The demonstration that blink-propagation signatures can be harnessed not only for artifact removal but for automatic channel/dataset quality assessment, as validated by high-accuracy MI classification post–ABCD removal (Guttmann-Flury et al., 23 Jul 2025), suggests that physiological “noise” can be transformed into a source of signal quality monitoring.

Future directions highlighted by dataset authors include:

  • Broader integration with additional modalities (e.g., egocentric audio, EMG beyond facial channels)
  • Extension to clinical and patient populations
  • Methodology transfer to real-time, online BCI systems and adaptive artifact monitoring
  • Further linking of ocular metrics to cognitive and affective state estimation for holistic user modeling across diverse BCI paradigms

The Eye-BCI multimodal dataset thus serves not only as a benchmark for technical algorithm performance but also as a platform for advanced cognitive and computational neuroscience research.