Data Normalization Strategies for EEG Deep Learning (2506.22455v1)
Abstract: Normalization is a critical yet often overlooked component in the preprocessing pipeline for EEG deep learning applications. The rise of large-scale pretraining paradigms such as self-supervised learning (SSL) introduces a new set of tasks whose nature is substantially different from supervised training common in EEG deep learning applications. This raises new questions about optimal normalization strategies for the applicable task. In this study, we systematically evaluate the impact of normalization granularity (recording vs. window level) and scope (cross-channel vs. within-channel) on both supervised (age and gender prediction) and self-supervised (Contrastive Predictive Coding) tasks. Using high-density resting-state EEG from 2,836 subjects in the Healthy Brain Network dataset, we show that optimal normalization strategies differ significantly between training paradigms. Window-level within-channel normalization yields the best performance in supervised tasks, while minimal or cross-channel normalization at the window level is more effective for SSL. These results underscore the necessity of task-specific normalization choices and challenge the assumption that a universal normalization strategy can generalize across learning settings. Our findings provide practical insights for developing robust EEG deep learning pipelines as the field shifts toward large-scale, foundation model training.