
ZUCO 2.0: EEG & Eye-Tracking Dataset

Updated 27 December 2025
  • ZUCO 2.0 is a publicly available multimodal dataset that integrates high-density EEG and eye-tracking recordings during natural reading tasks.
  • The dataset features precise word- and sentence-level temporal alignment, detailed preprocessing pipelines, and comprehensive annotations for cognitive and affective studies.
  • It empowers advanced neuroscience and machine learning research with standardized data formats, rich participant metadata, and reproducible experimental design.

The ZUCO 2.0 Dataset is a publicly available multimodal corpus composed of high-density electroencephalography (EEG) and eye-tracking recordings captured from healthy adults during natural reading tasks. Designed for cognitive neuroscience and affective computing research, ZUCO 2.0 provides temporally aligned neural and ocular activity data at both the word and sentence levels, supporting integrative studies of language processing, sentiment analysis, and the cognitive underpinnings of annotation tasks. The dataset’s structure, annotation, and metadata organization conform to best practices for large-scale physiological corpora, facilitating reproducibility and multimodal machine learning research (Bhardwaj et al., 20 Dec 2025, Hollenstein et al., 2019, Hollenstein et al., 2021).

1. Participant Cohort and Experimental Design

  • Participants: ZUCO 2.0 contains data from 18 healthy adults (mixed gender, ages 18–35), all native English speakers with normal or corrected-to-normal vision and a mean LexTALE vocabulary score of ≈88.5% (Hollenstein et al., 2019, Hollenstein et al., 2021). No participant reported neurological or language disorders.
  • Task Paradigms: The experimental design features two main conditions in interleaved blocks:
    • Natural Reading (NR): Subjects read 349 English sentences for comprehension. 12% of trials are followed by content questions (3-way forced choice) to verify engagement.
    • Task-Specific Reading (TSR): Subjects read 390 sentences with the explicit goal of detecting one of seven predefined relation types (e.g., “employer,” “nationality”) within the text. 17% are control sentences with no annotatable relation present.
  • Stimulus Properties: Sentences (mean length ≈19.6–21.3 words) are drawn from Wikipedia with lexical and readability balancing (NR Flesch 55.38, TSR Flesch 50.76). 63 sentences are duplicated across both tasks for within-subject comparisons (Hollenstein et al., 2019, Hollenstein et al., 2021).
  • Timing Structure: Trials are self-paced; each word presentation and fixation is precisely time-stamped. Epoch segmentation is performed around each word onset, typically [–200 ms, +600 ms] relative to stimulus, yielding approximately 50,000 word-aligned epochs per subject (Bhardwaj et al., 20 Dec 2025).
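
A minimal sketch of this segmentation step, assuming a continuous recording array and word-onset sample indices; the function and variable names are illustrative, not part of the dataset's tooling:

```python
import numpy as np

def epoch_around_onsets(eeg, onsets, fs=500, t_min=-0.2, t_max=0.6):
    """Cut fixed-length epochs around word onsets.

    eeg    : (n_channels, n_samples) continuous recording
    onsets : word-onset sample indices
    Returns an array of shape (n_epochs, n_channels, n_times).
    """
    pre, post = int(t_min * fs), int(t_max * fs)  # -100 and +300 samples at 500 Hz
    epochs = [
        eeg[:, o + pre : o + post]
        for o in onsets
        if o + pre >= 0 and o + post <= eeg.shape[1]  # skip onsets too near the edges
    ]
    return np.stack(epochs)
```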

2. Data Acquisition: EEG and Eye-Tracking Protocols

EEG Acquisition

  • Recording System: Geodesic Hydrocel 128-channel net (Electrical Geodesics) with 500 Hz sampling, 24-bit DC-coupled amplifiers, input impedance >100 MΩ (Bhardwaj et al., 20 Dec 2025, Hollenstein et al., 2019, Hollenstein et al., 2021).
  • Montage: Extended 10–5 electrode layout provides dense coverage of frontal, central, parietal, occipital, and temporal regions.
  • Reference: Recordings are re-referenced to the common average during preprocessing. Electrode impedances are maintained below 40 kΩ.
  • Signal Model: The raw data tensor is formalized as $X_{\rm raw} \in \mathbb{R}^{n \times ch \times t}$, where $n$ is the number of epochs, $ch = 128$ is the number of channels, and $t$ is the number of samples per epoch. Bandpass filtering (0.5–30 Hz) and notch filtering (50 Hz, Q = 30) remove slow drifts and line noise (Bhardwaj et al., 20 Dec 2025); a filtering sketch follows this list.
  • Artifact Handling: Eye and muscle artifacts are addressed by independent component analysis (ICA) and component rejection.
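
A minimal sketch of the filtering stage referenced in the Signal Model bullet, using SciPy; the FIR order and API choices here are illustrative assumptions consistent with the parameters quoted above:

```python
import numpy as np
from scipy.signal import firwin, filtfilt, iirnotch

FS = 500  # Hz, sampling rate quoted above

def clean(eeg):
    """0.5-30 Hz zero-phase FIR band-pass plus a 50 Hz notch (Q = 30).

    eeg : (n_channels, n_samples); filtering runs along the last axis.
    """
    # Linear-phase FIR band-pass; filtfilt makes the net response zero-phase.
    taps = firwin(numtaps=4001, cutoff=[0.5, 30.0], pass_zero=False, fs=FS)
    eeg = filtfilt(taps, [1.0], eeg, axis=-1)
    # Narrow IIR notch at the 50 Hz line frequency.
    b, a = iirnotch(w0=50.0, Q=30.0, fs=FS)
    return filtfilt(b, a, eeg, axis=-1)
```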

Eye-Tracking Acquisition

  • Device: EyeLink 1000 Plus (SR Research), 500 Hz sampling, spatial accuracy <0.5°, calibrated with a 9-point grid before each block (Hollenstein et al., 2019, Hollenstein et al., 2021).
  • Recorded Variables: Gaze point coordinates (X, Y), pupil size, and fixation/saccade events.
  • Synchronization: Hardware-triggered TTL pulses ensure sub-millisecond alignment between the EEG and eye-tracking streams ($\Delta t$ jitter <2 ms). Synchronization and correction are formalized as $t_{\mathrm{aligned}} = t_{\mathrm{EEG}} + \Delta t$.
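
A minimal sketch of the trigger-based clock correction, assuming both streams record the same TTL pulses; names and the averaging strategy are illustrative:

```python
import numpy as np

def align_streams(eeg_times, eeg_trigger_times, et_trigger_times):
    """Map EEG timestamps onto the eye-tracker clock via shared TTL pulses.

    Delta t is the mean per-pulse clock offset, so t_aligned = t_EEG + Delta t,
    matching the formula above.
    """
    dt = np.mean(np.asarray(et_trigger_times) - np.asarray(eeg_trigger_times))
    return np.asarray(eeg_times) + dt
```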

3. Preprocessing and Feature Extraction

EEG Pipeline

  • Artifact Removal:
    • Notch filtering (50 Hz)
    • ICA for ocular/muscular components (e.g., MARA algorithm)
    • Rejection of epochs with amplitudes exceeding 90 µV (Hollenstein et al., 2019)
  • Filtering: Zero-phase FIR bandpass (0.5–30 Hz, order ~4000), implemented as

$$X_{\rm bp}(t) = \sum_{k=0}^{M} b_k\, X_{\rm raw}(t-k), \qquad M \approx 4000$$

  • Epoching and Baseline Correction: Windowed around word onsets ([–200 ms, +600 ms]), with pre-stimulus mean subtracted:

$$X_{\rm ep}(t) \leftarrow X_{\rm ep}(t) - \frac{1}{200\,\mathrm{ms}} \int_{-200\,\mathrm{ms}}^{0} X_{\rm ep}(\tau)\, d\tau$$

  • Normalization: Channel-wise z-scoring

$$X_{i,j}^{\rm norm} = \frac{X_{i,j} - \mu_i}{\sigma_i}$$

where $\mu_i$ and $\sigma_i$ are per-channel means and standard deviations computed across all epochs and time points (Bhardwaj et al., 20 Dec 2025).

  • Band Power and Envelope Extraction: Signals are filtered into the standard bands (θ, α, β, γ). Instantaneous band amplitudes $A_b(t)$ are obtained via the Hilbert transform, and mean band power is computed as

$$P_b = \frac{1}{T} \int_{0}^{T} A_b^2(t)\, dt$$

for band $b$. Additional features include frontal homologue differences ($P_b(\mathrm{F3}) - P_b(\mathrm{F4})$) and fixation-related potentials (FRPs) averaged over [–200, +800] ms around each fixation (Hollenstein et al., 2019, Hollenstein et al., 2021); a band-power sketch follows this pipeline.

  • Quality Control: Files exceeding strict noise or missing-data thresholds are excluded post hoc.
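
A minimal sketch tying together the remaining pipeline steps (baseline correction, channel-wise z-scoring, and Hilbert-envelope band power); the band edges and filter length are assumptions, not values specified by the dataset:

```python
import numpy as np
from scipy.signal import firwin, filtfilt, hilbert

FS = 500
BANDS = {'theta': (4.0, 8.0), 'alpha': (8.0, 13.0), 'beta': (13.0, 30.0)}  # assumed edges

def baseline_correct(epochs, fs=FS, t_min=-0.2):
    """Subtract the pre-stimulus ([-200 ms, 0]) mean per epoch and channel."""
    n_pre = int(-t_min * fs)
    return epochs - epochs[..., :n_pre].mean(axis=-1, keepdims=True)

def zscore_channels(epochs):
    """Channel-wise z-scoring across all epochs and time points."""
    mu = epochs.mean(axis=(0, 2), keepdims=True)
    sd = epochs.std(axis=(0, 2), keepdims=True)
    return (epochs - mu) / sd

def band_power(epochs, lo, hi, fs=FS):
    """Time-averaged squared Hilbert envelope, i.e. P_b above."""
    # Short FIR so filtfilt's default padding fits the 400-sample epochs.
    taps = firwin(101, [lo, hi], pass_zero=False, fs=fs)
    filtered = filtfilt(taps, [1.0], epochs, axis=-1)
    envelope = np.abs(hilbert(filtered, axis=-1))  # A_b(t)
    return (envelope ** 2).mean(axis=-1)           # (n_epochs, n_channels)
```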

Eye-Tracking Pipeline

  • Fixation Detection: Velocity-based thresholding, omitting fixations <100 ms or outside text regions (>50 px from boundaries) (Hollenstein et al., 2019).
  • Word-Level Features: Extraction per word includes the following (see the sketch after this list):
    • First Fixation Duration (FFD)
    • Gaze Duration (GD)
    • Total Reading Time (TRT)
    • Go-Past Time (GPT)
    • Number of fixations (nFix)
    • Saccade amplitude, duration, velocity
    • Mean and maximum pupil size per word
  • Sentence-Level Aggregates: Mean, SD, omission rate $O = \frac{\#\{w \in S : \mathrm{nFix}(w) = 0\}}{L}$ (for a sentence $S$ of $L$ words), fixation density, and reading speed (Hollenstein et al., 2021).
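
A minimal sketch of how the first-pass measures above can be derived from an ordered fixation sequence; the input record format is an assumption for illustration (go-past time and saccade metrics are omitted for brevity):

```python
def word_measures(fixations, word):
    """FFD, GD, TRT, and nFix for one target word.

    fixations : list of (word_index, duration_ms) tuples in chronological order
    word      : index of the target word within the sentence
    """
    on_word = [d for w, d in fixations if w == word]
    ffd = on_word[0] if on_word else 0  # first fixation duration
    # Gaze duration: first uninterrupted run of fixations on the word.
    gd, started = 0, False
    for w, d in fixations:
        if w == word:
            gd += d
            started = True
        elif started:
            break
    trt = sum(on_word)  # total reading time over all passes
    return {'FFD': ffd, 'GD': gd, 'TRT': trt, 'nFix': len(on_word)}

# Example: refixation on word 2, then a regression back to it later.
print(word_measures([(1, 180), (2, 210), (2, 90), (3, 200), (2, 150)], word=2))
# -> {'FFD': 210, 'GD': 300, 'TRT': 450, 'nFix': 3}
```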

4. Annotation Schema and Labeling Framework

  • Sentiment and Task Labels:
    • For sentiment analysis variants, binary polarity labels (positive/negative) are assigned to each sentence based on validated linguistic resources; all sub-word units inherit the parent sentence’s label (Bhardwaj et al., 20 Dec 2025). A label-inheritance sketch follows this list.
    • For relation annotation, each TSR sentence is labeled for one of seven specific relations or as a control (no relation), enabling multi-class and binary classification setups (Hollenstein et al., 2019).
  • Event Markers: Comprehensive tagging includes word onsets, sentence boundaries, and condition labels, supporting precise alignment of physiological signals with linguistic units.
  • Design Considerations: Naturalistic stimuli (coherent sentences) allow investigation of both local (ERP at word onsets, e.g., P300) and integrative processes (e.g., late positivity for sentiment or semantic relations) (Bhardwaj et al., 20 Dec 2025).
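
A minimal sketch of the sentence-to-word label inheritance described above, using a hypothetical event table; all column names and values are illustrative:

```python
import pandas as pd

# Hypothetical event table, one row per word onset (as in events.csv below).
events = pd.DataFrame({
    'sentence_id': [0, 0, 0, 1, 1],
    'word':        ['The', 'plot', 'shines', 'It', 'drags'],
    'onset_s':     [0.00, 0.35, 0.71, 3.10, 3.52],
})
# Sentence-level binary polarity labels (1 = positive, 0 = negative).
sentence_labels = {0: 1, 1: 0}

# Each word-level row inherits its parent sentence's label.
events['label'] = events['sentence_id'].map(sentence_labels)
print(events)
```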

5. Data Formats, Accessibility, and Metadata Structure

| Folder/File | Content Type | Format Examples |
|---|---|---|
| sub-XX/raw/ | Continuous EEG/eye-tracking | .bdf, .edf, .vhdr |
| sub-XX/epochs/ | Preprocessed EEG epochs | .mat (MATLAB), .npy |
| sub-XX/events.csv | Event/trigger codes, labels, timing | CSV |
| sub-XX/metadata.json | Participant/session metadata | JSON |
| sub-XX/eeg/, sub-XX/beh/ | Raw and preprocessed EEG/behavioral files | .set, .fdt, .tsv, .csv |
| Word/sentence feature tables | Full feature sets per unit | CSV |
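
A minimal sketch of loading one subject's files under the layout above; the exact file and key names inside the epochs folder are assumptions:

```python
import json
import numpy as np
import pandas as pd
from scipy.io import loadmat

sub = 'sub-01'
events = pd.read_csv(f'{sub}/events.csv')  # trigger codes, labels, timing
with open(f'{sub}/metadata.json') as f:
    meta = json.load(f)                    # participant/session metadata

# Preprocessed epochs may ship as .npy or .mat; the file and key names
# below are illustrative assumptions.
try:
    epochs = np.load(f'{sub}/epochs/epochs.npy')             # (n, ch, t)
except FileNotFoundError:
    epochs = loadmat(f'{sub}/epochs/epochs.mat')['epochs']
print(epochs.shape, len(events))
```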

6. Research Applications, Use Cases, and Caveats

  • Applications:
    • Task classification using EEG and/or eye-tracking features, supporting within- and cross-subject generalization studies (Hollenstein et al., 2021).
    • Sentiment analysis from neural signals, with recent work employing encoder-based architectures (e.g., Feature Pyramid Networks with GRU backends) on ZUCO 2.0, yielding ≈6.88% performance gain over prior methods (Bhardwaj et al., 20 Dec 2025).
    • Assessment of semantic relation annotation, linking physiological metrics to annotation accuracy and cost (Hollenstein et al., 2019).
    • Cognitive load measurement: readers exhibit faster reading and higher omission rates in TSR, with corresponding task-dependent modulation of EEG band powers (NR > TSR for θ, α; TSR > NR for γ at fronto-central sites) (Hollenstein et al., 2021).
    • Coupled machine learning and neuroscience analyses, such as aligning LLMs’ embeddings or surprisal to fixation-related potentials or band power (Hollenstein et al., 2019).
  • Limitations:
    • Participant pool limited to 18 adult native English speakers; findings may not extrapolate to other populations (Hollenstein et al., 2019).
    • EEG spatial resolution is high at the sensor level (128 channels), yet source localization is not provided.
    • Single-session acquisition per subject; possible fatigue effects in later blocks.
    • Annotation diversity in TSR restricted to seven relation types (for semantic relation tasks); alternative blocks are required for additional annotation paradigms (Hollenstein et al., 2019).
    • Eye-tracking accuracy assumes stable head position; non-lab environments may yield different ocular behavior.

7. Comparative and Methodological Insights

  • Distinction from ZuCo 1.0: ZUCO 2.0 employs within-session, interleaved blocks of NR and TSR for all participants to minimize session drift, unlike ZuCo 1.0 (separate sessions). All blocks are merged before preprocessing, reducing block-specific bias (Hollenstein et al., 2021).
  • Summary Statistics: Sentence-level reading speeds, omission rates, and fixation densities are provided for each condition (e.g., NR: 5.8±1.4 s/sentence, Omission 0.33±0.09; TSR: 4.8±2.0 s, Omission 0.45±0.14). EEG band-power means: θ ≈ 2.775 μV², α ≈ 2.475 μV², β ≈ 2.325 μV², γ ≈ 1.735 μV² (Hollenstein et al., 2021).
  • Classification Benchmarks: Within-subject task classification using sentence-level γ-band electrode features achieves median ≈92% accuracy, while cross-subject accuracy drops to ≈52%, illustrating the challenge of generalization due to inter-subject variance (Hollenstein et al., 2021); a protocol sketch follows this list.
  • Affective Computing Suitability: The dataset's high spatial and temporal resolution, word-aligned markers, and naturalistic yet controlled stimulus design make it a benchmark corpus for EEG-based sentiment analysis and broader cognitive-affective investigations (Bhardwaj et al., 20 Dec 2025).
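
A minimal sketch of the within- versus cross-subject evaluation contrast on synthetic placeholder features; this illustrates the protocol only, not the original study's features or classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Placeholder data: sentence-level gamma-band power per electrode.
rng = np.random.default_rng(0)
X = rng.normal(size=(360, 128))        # 18 subjects x 20 sentences, 128 channels
y = rng.integers(0, 2, size=360)       # task label: 0 = NR, 1 = TSR
groups = np.repeat(np.arange(18), 20)  # subject ID per sentence

clf = LogisticRegression(max_iter=1000)
# Within-subject: cross-validate inside a single subject's sentences.
within = cross_val_score(clf, X[groups == 0], y[groups == 0], cv=5).mean()
# Cross-subject: train on 17 subjects, test on the held-out one.
cross = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut()).mean()
print(f'within-subject {within:.2f}, cross-subject {cross:.2f}')
```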
