CA-EEGWaveNet: Adaptive iEEG Seizure Detection
- CA-EEGWaveNet is a channel-adaptive deep learning architecture that integrates single-channel WaveNet models with a holographic fusion module and extended temporal context for clinical-grade seizure detection.
- Its modular design adapts to heterogeneous, multi-channel iEEG data by employing subject-agnostic embeddings and channel-specific trainable scalars for efficient spatial encoding.
- Empirical validation shows improved median F1-scores and sensitivity over baseline models through effective pre-training and rapid subject-specific fine-tuning.
CA-EEGWaveNet is a channel-adaptive deep learning architecture designed for high-precision classification and detection tasks in intracranial EEG (iEEG) analysis, specifically targeting heterogeneous, multi-channel, and multi-subject environments. By composing state-of-the-art single-channel WaveNet backbones, a unique holographic vector-symbolic fusion module, and a long-context memory block, CA-EEGWaveNet achieves both robust channel adaptivity and clinical-grade seizure detection performance. Critical design elements include subject-agnostic embedding, plug-and-play fusion, and the extension of temporal context to match real-world clinical review practices. Empirical validation demonstrates superiority over baseline models in median F1-score and adaptability across datasets of varying channel configuration and quantity (Carzaniga et al., 22 Dec 2025, Laar et al., 10 Oct 2025).
1. Architectural Components and Data Flow
CA-EEGWaveNet is structured into three interchangeable modules:
- Encoder: Each individual EEG channel, denoted $x_c^{(w)}$ for channel $c$ and window $w$, is processed independently by a single-channel EEGWaveNet encoder. This encoder employs seven residual convolutional blocks with dilated causal Conv1D layers, Swish activations, and both residual and skip connections. The encoder generates per-channel feature vectors $\mathbf{z}_c^{(w)}$ (Laar et al., 10 Oct 2025, Carzaniga et al., 22 Dec 2025).
- Fusion Module: The set of per-channel vectors $\{\mathbf{z}_c^{(w)}\}_{c=1}^{C}$ for a time window $w$ is fused by a trainable vector-symbolic holographic reduced representation (HRR) mechanism. Fusion employs a single learnable scalar $\alpha_c$ per channel and a fixed basis vector $\mathbf{b}$, producing channel-wise angular rotations (binding keys)
$$\mathbf{k}_c = \mathrm{rot}(\mathbf{b};\,\alpha_c),$$
i.e., the basis vector rotated by the channel-specific angle $\alpha_c$. The fused vector for window $w$ is then
$$\mathbf{h}^{(w)} = \sum_{c=1}^{C} \mathbf{k}_c \circledast \mathbf{z}_c^{(w)},$$
where $\circledast$ denotes circular convolution, efficiently computed via FFTs (Carzaniga et al., 22 Dec 2025). A code sketch of this binding step appears after this list.
- Memory Block: The sequence of fused vectors $\mathbf{h}^{(w-W+1)}, \ldots, \mathbf{h}^{(w)}$ across $W$ consecutive windows (each $7.5$ s with a $1$ s stride, yielding $105$ s of effective context) is accumulated using a Temporal Convolutional Network (TCN):
$$\mathbf{m}^{(w)} = \mathrm{TCN}\!\big(\mathbf{h}^{(w-W+1)}, \ldots, \mathbf{h}^{(w)}\big).$$
This enables joint reasoning over up to two minutes of EEG, offering temporally extended analysis for event classification or detection (Carzaniga et al., 22 Dec 2025); a minimal TCN sketch is given below.
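A minimal sketch of the HRR binding-and-superposition step is given below, assuming PyTorch; the class name `HRRFusion`, the per-frequency phase vector used to realize the angular rotation, and the initialization choices are illustrative assumptions rather than the authors' reference implementation.

```python
import math

import torch
import torch.nn as nn

class HRRFusion(nn.Module):
    """Illustrative HRR binding of per-channel features with one scalar per channel."""

    def __init__(self, num_channels: int, dim: int):
        super().__init__()
        # One trainable scalar per channel (spatial binding coefficient alpha_c).
        self.alpha = nn.Parameter(torch.zeros(num_channels))
        # Fixed (non-trainable) basis vector b, shared by all channels.
        self.register_buffer("basis", torch.randn(dim) / dim ** 0.5)
        # Fixed per-frequency phases turning alpha_c into an angular rotation of the
        # basis spectrum (an assumption about how rot(b; alpha_c) is realized).
        self.register_buffer("phase", torch.linspace(0.0, math.pi, dim // 2 + 1))

    def keys(self) -> torch.Tensor:
        # k_c: the basis vector rotated in the Fourier domain by alpha_c * phase.
        B = torch.fft.rfft(self.basis)                            # (dim//2 + 1,)
        rot = torch.exp(1j * self.alpha[:, None] * self.phase)    # (C, dim//2 + 1)
        return torch.fft.irfft(rot * B, n=self.basis.numel())     # (C, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, C, dim) per-channel encoder outputs for one window.
        K = torch.fft.rfft(self.keys())                           # spectra of the channel keys
        Z = torch.fft.rfft(z, dim=-1)                             # spectra of the channel features
        bound = torch.fft.irfft(K * Z, n=z.shape[-1], dim=-1)     # circular convolution (binding)
        return bound.sum(dim=1)                                   # superposition over channels -> h^(w)
```

Under this construction, adapting to a new montage amounts to re-instantiating `alpha` for the new channel count while the encoder and memory blocks remain unchanged.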
This compositional design allows any state-of-the-art single-channel model to be adopted as the Encoder, with the channel-adaptive and memory modules generalizing across channel and subject heterogeneity.
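The memory block can likewise be sketched as a small causal, dilated TCN over the sequence of fused vectors; the layer count, dilation schedule, and single-logit head below are assumptions for illustration, not the published configuration.

```python
import torch
import torch.nn as nn

class MemoryTCN(nn.Module):
    """Illustrative causal dilated TCN over fused window vectors h^(w)."""

    def __init__(self, dim: int, num_layers: int = 4, kernel_size: int = 3):
        super().__init__()
        layers = []
        for i in range(num_layers):
            d = 2 ** i                                              # exponentially growing dilation
            layers += [
                nn.ConstantPad1d((d * (kernel_size - 1), 0), 0.0),  # left-only padding: causal
                nn.Conv1d(dim, dim, kernel_size, dilation=d),
                nn.SiLU(),                                          # Swish activation, as in the encoder
            ]
        self.net = nn.Sequential(*layers)
        self.head = nn.Linear(dim, 1)                               # seizure / non-seizure logit

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, W, dim) fused vectors for W consecutive windows.
        y = self.net(h.transpose(1, 2))                             # (batch, dim, W), causal conv over time
        return self.head(y[:, :, -1])                               # prediction for the latest window
```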
2. Channel Adaptivity, Spatial Encoding, and Fusion
A defining element of CA-EEGWaveNet is its architectural invariance with respect to channel count, ordering, and subject montage. The channel-specific trainable scalars $\alpha_c$ bind spatial (electrode) information within the fusion mechanism, enabling the model to reconstruct functional and spatial relationships among channels without imposing architectural constraints (Carzaniga et al., 22 Dec 2025). Empirical analysis of the cosine similarities between the angularly rotated basis vectors $\mathbf{k}_c$, post-training, reveals structured inter-channel relationships aligning with electrode proximity and connectivity. For arbitrary channel configurability, re-initialization and retraining of the scalars $\alpha_c$ suffice, obviating the need for encoder or memory redesign.
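A hypothetical post-hoc inspection of this structure, reusing the `HRRFusion` sketch above (the channel count and dimensionality are arbitrary here), might look as follows:

```python
# Cosine-similarity matrix between the rotated channel keys k_c of a trained fusion
# module; structure in this matrix can be compared against electrode proximity and
# connectivity. Assumes the illustrative HRRFusion class defined earlier.
import torch
import torch.nn.functional as F

fusion = HRRFusion(num_channels=32, dim=128)    # in practice: a trained instance
with torch.no_grad():
    keys = F.normalize(fusion.keys(), dim=-1)   # (C, dim), unit-norm channel keys
    cos_sim = keys @ keys.T                     # (C, C) inter-channel similarity matrix
```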
3. Training Protocols and Optimization
Training proceeds in two stages:
- Pre-training on Heterogeneous Data: CA-EEGWaveNet is pre-trained on large, cross-subject datasets (e.g., 16–18 subjects totaling up to 2321 hours and 244 seizures), using random subsampling of subjects and uniform batch draws. Model parameters are optimized with Adam to maximize the F1-score, with early stopping on plateaus. Pre-training concludes after a minimum of 25 and a maximum of 50 epochs (about 6 hours on an A100 GPU for 15 subjects) (Carzaniga et al., 22 Dec 2025).
- Subject-Specific Fine-Tuning: Two fine-tuning regimes are defined: Leave-One-Out over seizures (LOOC), which uses all but one seizure per subject for fine-tuning, and Leave-All-But-One-Out (LABOC), which uses only a single seizure for rapid personalization. Fine-tuning updates all channel scalars $\alpha_c$ and the downstream TCN weights, typically for up to 10 epochs, leveraging the model's efficient reuse of pre-trained backbone weights (see the sketch after this list).
For baseline EEGWaveNet models, per-subject LOOC training is required without the benefit of large-scale pre-training or channel adaptivity.
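An illustrative subject-specific fine-tuning loop is sketched below, assuming a composite model exposing `encoder`, `fusion`, and `tcn` submodules (hypothetical attribute names), a standard binary loss as a stand-in for the focal loss described in Section 5, and placeholder hyperparameters.

```python
import torch

def finetune(model, loader, epochs: int = 10, lr: float = 1e-4):
    """Subject-specific fine-tuning: freeze the encoder, adapt alpha_c and the TCN head."""
    # Freeze the pre-trained single-channel encoder backbone (attribute names assumed).
    for p in model.encoder.parameters():
        p.requires_grad = False
    # Train only the channel scalars (fusion) and the downstream memory/classifier block.
    trainable = list(model.fusion.parameters()) + list(model.tcn.parameters())
    opt = torch.optim.Adam(trainable, lr=lr)          # lr is a placeholder value
    loss_fn = torch.nn.BCEWithLogitsLoss()            # stand-in for the focal loss of Section 5
    model.train()
    for _ in range(epochs):
        for x, y in loader:                           # x: (batch, C, T) windows, y: float labels
            opt.zero_grad()
            loss = loss_fn(model(x).squeeze(-1), y)
            loss.backward()
            opt.step()
    return model
```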
4. Evaluation, Benchmarking, and Ablation
Quantitative evaluation compares CA-EEGWaveNet against baseline EEGWaveNet across short-term (20h) and long-term (2500h) iEEG datasets. Performance, as measured by median seizure-level F1-score and sensitivity, demonstrates:
| Model | Median F1 (Short-Term) | Sensitivity (Short-Term) | Median F1 (Long-Term) | Sensitivity (Long-Term) |
|---|---|---|---|---|
| EEGWaveNet (baseline) | 0.82 | 0.84 | 0.76 | 0.78 |
| CA-EEGWaveNet | 0.85 | 0.87 | 0.78 | 0.80 |
A similar pattern holds for CA-EEGNet backbones. Ablation studies confirm that removing pre-training, the fusion module, or the memory block severely degrades performance, with the median F1-score dropping sharply, demonstrating the necessity of all three components for robust detection (Carzaniga et al., 22 Dec 2025).
5. Dataset Handling, Preprocessing, and Regularization
CA-EEGWaveNet is agnostic to the number of EEG channels and to the input format, provided the data can be partitioned into overlapping temporal windows. Raw EEG signals are divided by channel and windowed, with no explicit requirement for channel alignment or prior spatial registration. Preprocessing includes conversion to efficient HDF5 storage, dynamic random key–based partitioning to avoid data leakage across train/val/test splits, and per-segment Z-score normalization of each channel and segment:
$$\tilde{x} = \frac{x - \mu}{\sigma + \epsilon},$$
where $\mu$ and $\sigma$ are the per-segment mean and standard deviation and $\epsilon$ is a small constant added for numerical stability. No further band-pass filtering is applied; models are trained on raw normalized input (Laar et al., 10 Oct 2025).
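A minimal sketch of the windowing and per-segment Z-score normalization follows, assuming NumPy; the window and stride match the values quoted above, while the epsilon value and function name are illustrative.

```python
import numpy as np

def window_and_normalize(x, fs, win_s=7.5, stride_s=1.0, eps=1e-8):
    """x: raw EEG array (channels, samples); fs: sampling rate in Hz."""
    win, stride = int(win_s * fs), int(stride_s * fs)
    segments = []
    for start in range(0, x.shape[1] - win + 1, stride):
        seg = x[:, start:start + win]
        mu = seg.mean(axis=1, keepdims=True)          # per-channel, per-segment mean
        sigma = seg.std(axis=1, keepdims=True)        # per-channel, per-segment std
        segments.append((seg - mu) / (sigma + eps))   # Z-score with eps for stability
    return np.stack(segments)                         # (num_windows, channels, win)
```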
Regularization mirrors that of the underlying backbone: adaptive dropout (annealed according to a composite validation score), L2 weight decay, and a focal loss (with focusing parameter $\gamma$ and per-class weights inversely proportional to class frequency) to handle class imbalance.
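For reference, a generic multi-class focal loss with per-class weights can be written as follows; the default focusing parameter and the weight handling here are standard choices, not necessarily the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, class_weights, gamma=2.0):
    """logits: (batch, num_classes); targets: (batch,) int64; class_weights: (num_classes,)."""
    log_p = F.log_softmax(logits, dim=-1)
    log_p_t = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)   # log-prob of the true class
    p_t = log_p_t.exp()
    w_t = class_weights[targets]                                  # inverse-frequency class weights
    return (-w_t * (1.0 - p_t) ** gamma * log_p_t).mean()         # focal modulation (1 - p_t)^gamma
```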
6. Comparative Analysis and Limitations
Relative to non-channel-adaptive models, CA-EEGWaveNet provides:
- Robust cross-subject transferability, supporting arbitrary and heterogeneous electrode montages.
- Rapid subject-specific fine-tuning using minimal data (roughly one-fifth of the time required by baseline training approaches).
- Clinically relevant temporal context, extending event reasoning to over 100 s.
- Model compactness (0.995 million parameters, substantially smaller than state-of-the-art Transformer approaches).
Limitations observed include saturation of performance when pre-training exceeds 15 subjects, indicating an upper bound on benefit under the present encoder capacity and regularization. The architecture awaits validation on non-invasive (scalp) EEG and related multivariate biosignal modalities. Adaptive strategies for dynamically updating the channel scalars $\alpha_c$ as contacts change (e.g., due to repositioning or artifact removal) remain an open research direction (Carzaniga et al., 22 Dec 2025).
7. Implications and Future Directions
CA-EEGWaveNet's composable, channel-adaptive design enables leveraging extensive, heterogeneous iEEG corpora for state-of-the-art event detection and classification. The modular structure admits substitution of advanced backbones or fusion mechanisms, with immediate inheritance of channel adaptivity and extended memory capabilities. This suggests substantial headroom for further improvements in model generalization and clinical deployment, contingent on advances in encoding architectures and the systematic integration of diverse biosignal datasets (Carzaniga et al., 22 Dec 2025).