DeepSleepNet: A Model for Automatic Sleep Stage Scoring Based on Raw Single-Channel EEG
The paper introduces DeepSleepNet, a novel architecture for automatic sleep stage scoring from raw single-channel EEG. The model combines convolutional neural networks (CNNs) and bidirectional long short-term memory (bi-LSTM) networks to perform end-to-end feature learning and sequence modeling, without relying on hand-engineered features that are typically dataset-specific.
Model Architecture and Approach
Representation Learning
DeepSleepNet consists of two primary components: representation learning and sequence residual learning. The representation learning part employs dual CNNs that differ in the filter sizes of their initial layers: smaller filters are adept at identifying temporal features from raw EEG, while larger filters excel at capturing frequency components. The dual-CNN setup thus strikes a balance between the two kinds of information.
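The two-branch idea can be illustrated with a minimal numpy sketch. The filter lengths and strides below follow the paper's description of scaling with the sampling rate Fs (small filters around Fs/2 samples, large filters around Fs×4), but the exact values and the random weights here are purely illustrative; the real model stacks several convolution, pooling, and dropout layers per branch.

```python
import numpy as np

def conv1d(x, kernel, stride):
    """Strided, valid-mode 1-D convolution (cross-correlation form)."""
    n = (len(x) - len(kernel)) // stride + 1
    return np.array([np.dot(x[i * stride : i * stride + len(kernel)], kernel)
                     for i in range(n)])

fs = 100                            # assumed sampling rate (Hz)
epoch = np.random.randn(30 * fs)    # one 30-second raw EEG epoch

# Temporal branch: small filters (~Fs/2 samples), fine stride.
small = conv1d(epoch, np.random.randn(fs // 2), stride=fs // 16)
# Frequency branch: large filters (~Fs*4 samples), coarse stride.
large = conv1d(epoch, np.random.randn(fs * 4), stride=fs // 2)

# After further conv/pool layers (omitted), the branch outputs are
# concatenated into a single feature vector for the epoch.
features = np.concatenate([small, large])
```

The key design point is that both branches see the same raw signal but at different receptive-field scales, so neither temporal detail nor spectral content dominates the learned representation.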
Sequence Residual Learning
The sequence residual learning component integrates bi-LSTMs with a residual shortcut connection to encode temporal dependencies among sleep stages. The bi-LSTMs learn the transition rules that sleep experts apply, modeling long-term dependencies in the EEG sequences. As a result, classification draws on both per-epoch features and their sequential context.
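A shape-level sketch of the residual combination, in numpy. The bi-LSTM is stubbed out as a random linear map (a real implementation would use a recurrent layer), and the sequence length and dimensions are assumed for illustration; the point is how the CNN features bypass the recurrent path through a fully-connected shortcut and are added to its output.

```python
import numpy as np

def bilstm_stub(x_seq, hidden):
    """Stand-in for a bidirectional LSTM: a random per-step linear map
    returning (seq_len, 2*hidden) outputs. Illustrative only."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal((x_seq.shape[1], 2 * hidden))
    return x_seq @ w

# Assumed sizes: 25 epochs per sequence, 1024-d CNN features, 512 LSTM units.
seq_len, feat_dim, hidden = 25, 1024, 512
cnn_feats = np.random.randn(seq_len, feat_dim)   # per-epoch CNN features

lstm_out = bilstm_stub(cnn_feats, hidden)        # temporal context per epoch

# Residual shortcut: fully-connected projection of the CNN features to the
# bi-LSTM output size, passed through ReLU and added element-wise.
w_fc = np.random.randn(feat_dim, 2 * hidden)
shortcut = np.maximum(cnn_feats @ w_fc, 0)
combined = lstm_out + shortcut                   # sequence residual learning
```

The shortcut lets the per-epoch representation flow directly to the classifier even when the recurrent path is still poorly trained, which is the usual motivation for residual connections.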
Training Methodology
To address the class imbalance inherent in sleep datasets, a two-step training algorithm is employed:
- Pre-training: The representation learning component (dual-CNNs) is pre-trained on an oversampled class-balanced training set to mitigate the class imbalance issue.
- Fine-tuning: The entire model (representation learning followed by sequence residual learning) is fine-tuned using a sequential training set, enabling the network to encode temporal dependencies effectively.
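The oversampling used in the pre-training step can be sketched as follows. This is a generic duplicate-to-the-majority-count balancer, not the paper's exact implementation; the function name and toy data are hypothetical.

```python
import numpy as np

def balance_by_oversampling(X, y, seed=0):
    """Resample each class (with replacement) up to the majority-class
    count, so the pre-training set is class-balanced."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=target, replace=True)
        for c in classes
    ])
    rng.shuffle(idx)
    return X[idx], y[idx]

# Toy imbalanced set: class 1 (think of the rare N1 stage) is underrepresented.
X = np.arange(10).reshape(10, 1)
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 2])
Xb, yb = balance_by_oversampling(X, y)
# Each of the 3 classes now appears 7 times (21 samples total).
```

Crucially, balancing is applied only during CNN pre-training; fine-tuning uses the original sequential data so the bi-LSTMs see realistic stage-transition statistics.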
Evaluation
The model was evaluated on two public datasets, MASS and Sleep-EDF, using the F4-EOG (Left) channel from MASS and the Fpz-Cz and Pz-Oz channels from Sleep-EDF. Performance metrics reported include overall accuracy (ACC), macro F1-score (MF1), and Cohen's kappa (κ). The results were benchmarked against state-of-the-art approaches.
- MASS (F4-EOG Left): Achieved ACC of 86.2%, MF1 of 81.7, and κ of 0.80.
- Sleep-EDF (Fpz-Cz): Achieved ACC of 82.0%, MF1 of 76.9, and κ of 0.76.
These metrics indicate that DeepSleepNet’s performance is competitive with existing methods that rely on hand-engineered features.
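For reference, the three reported metrics can be computed from a confusion matrix as below; this is a standard-definition sketch (the `scoring_metrics` helper is ours, not from the paper). MF1 averages per-class F1 with equal weight, which is why it is the preferred summary under class imbalance, and κ corrects raw agreement for chance.

```python
import numpy as np

def scoring_metrics(y_true, y_pred, n_classes=5):
    """Overall accuracy, macro F1 (MF1), and Cohen's kappa (κ)."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                     # rows: true stage, cols: predicted
    n = cm.sum()
    acc = np.trace(cm) / n
    tp = np.diag(cm)
    # Per-class F1 = 2TP / (2TP + FP + FN); macro-average weights classes equally.
    f1 = 2 * tp / (cm.sum(axis=0) + cm.sum(axis=1))
    mf1 = np.nanmean(f1)
    # Chance agreement from the marginals, then kappa = (ACC - pe) / (1 - pe).
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    kappa = (acc - pe) / (1 - pe)
    return acc, mf1, kappa

acc, mf1, kappa = scoring_metrics([0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
                                  [0, 0, 1, 0, 2, 2, 3, 3, 4, 1])
```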
Key Findings
- Feature Learning: DeepSleepNet can automatically learn relevant features directly from raw EEG data without the need for hand-engineering. The dual-CNN architecture is effective in capturing diverse features across different sleep stages.
- Temporal Modeling: Incorporating bi-LSTMs allows the model to leverage temporal dependencies, which significantly improves classification performance. The learned stage-transition behavior mirrors how human scorers use neighboring epochs to resolve ambiguous stages.
- Class Imbalance: The two-step training approach effectively mitigates the class imbalance problem often encountered in sleep datasets, bolstering model robustness across varying distributions.
Implications and Future Work
The implications of DeepSleepNet are twofold:
- Practical: The automated feature learning from raw single-channel EEG data paves the way for more accessible sleep stage scoring systems, potentially facilitating remote health monitoring applications.
- Theoretical: The integration of dual-CNNs and bi-LSTMs within a single framework demonstrates the viability of deep learning approaches that combine temporal and spatial feature extraction for time-series classification tasks.
To further strengthen the model, future work could focus on enhancing the generalizability of DeepSleepNet by incorporating more diverse datasets and exploring transfer learning techniques. Additionally, real-time implementation on low-cost, portable EEG devices could transform sleep monitoring practices, making them more user-friendly and pervasive.