GC-VASE: EEG Subject Identification Model

Updated 21 January 2026
  • The paper introduces GC-VASE, integrating GCNNs, split latent VAEs, and attention-based adapters to significantly improve EEG subject identification.
  • It employs contrastive learning and adapter-based fine-tuning, achieving up to 90.31% accuracy on ERP-Core across diverse EEG paradigms.
  • The framework fosters efficient personalization with minimal parameter updates, making it ideal for biometric and brain-computer interface applications.

GC-VASE (Graph Convolutional Variational Autoencoder with Split Latent Space and Attention-Based Adapters) is a deep learning framework for robust subject representation learning from electroencephalography (EEG) data. It integrates graph convolutional neural networks (GCNNs), variational autoencoders (VAEs) with split latent spaces, and contrastive learning, achieving state-of-the-art results in subject identification while enabling efficient subject-adaptive fine-tuning through attention-based adapter modules. The framework performs strongly on large-scale EEG benchmarks, notably ERP-Core and SleepEDFx-20, and demonstrates adaptability, efficiency, and interpretability in scenarios involving new, unseen subjects (Mishra et al., 13 Jan 2025).

1. Model Architecture

GC-VASE models the EEG sensor topology as a graph $G = (V, E)$ in which each of the $N$ EEG channels is a node, and the adjacency matrix $A \in \mathbb{R}^{N\times N}$ encodes physical or functional connectivity. The adjacency is symmetrically normalized via $\widetilde{A} = \widehat{D}^{-1/2} (A+I) \widehat{D}^{-1/2}$, where $\widehat{D} = \mathrm{diag}\bigl(\sum_j \widehat{A}_{ij}\bigr)$ and $\widehat{A} = A + I$. Four stacked GCNN layers (with ReLU activations) propagate node information, after which global average pooling yields a compact feature vector.
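The symmetric normalization can be sketched in a few lines (illustrative NumPy, not the authors' implementation):

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization with self-loops:
    A_tilde = D^{-1/2} (A + I) D^{-1/2},
    where D is the degree matrix of A_hat = A + I."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    # Scale rows by d_i^{-1/2} and columns by d_j^{-1/2}
    return (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
```

The result is symmetric with eigenvalues bounded in magnitude, which keeps repeated propagation through stacked layers numerically stable.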

The encoder $E_\theta$ subsequently reshapes this feature sequence and applies four Transformer encoder layers, producing two parameter sets: $(\mu^S, \sigma^S)$ for the subject-specific latent $z^S \in \mathbb{R}^{d_S}$ and $(\mu^T, \sigma^T)$ for the residual/task latent $z^T \in \mathbb{R}^{d_T}$, each sampled via the reparameterization trick. The total latent dimensionality is set to 64, split between $z^S$ and $z^T$. The decoder $D_\phi$ comprises mirrored Transformer and GCNN layers that reconstruct the input $X$ from $(z^S, z^T)$.
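A minimal sketch of the split-latent sampling head: the input feature width and the 32/32 split are assumptions, since the paper fixes only the total latent dimension of 64.

```python
import torch
import torch.nn as nn

class SplitLatentHead(nn.Module):
    """Maps pooled encoder features to two latent distributions:
    subject latent z_S and residual/task latent z_T (sizes assumed)."""
    def __init__(self, feat_dim=128, d_s=32, d_t=32):
        super().__init__()
        self.mu_s, self.logvar_s = nn.Linear(feat_dim, d_s), nn.Linear(feat_dim, d_s)
        self.mu_t, self.logvar_t = nn.Linear(feat_dim, d_t), nn.Linear(feat_dim, d_t)

    def forward(self, h):
        def sample(mu, logvar):
            # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
            return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        z_s = sample(self.mu_s(h), self.logvar_s(h))
        z_t = sample(self.mu_t(h), self.logvar_t(h))
        return z_s, z_t
```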

Adaptation to unseen subjects is achieved through attention-based adapter networks, inserted post-encoder. Each adapter comprises a multi-head self-attention (eight heads) layer followed by a feed-forward block, with only adapter weights updated during subject-adaptive fine-tuning, significantly reducing computational cost.
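A sketch of such an adapter follows; only the eight attention heads come from the text, while the latent width and feed-forward expansion factor are assumptions:

```python
import torch
import torch.nn as nn

class AttentionAdapter(nn.Module):
    """Attention-based adapter: multi-head self-attention (8 heads)
    followed by a feed-forward block, with residual connections and
    layer norm as in standard Transformer blocks."""
    def __init__(self, dim=64, heads=8, ff_mult=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, ff_mult * dim), nn.ReLU(),
            nn.Linear(ff_mult * dim, dim))
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):            # x: (batch, seq, dim)
        a, _ = self.attn(x, x, x)
        x = self.norm1(x + a)        # residual + layer norm
        return self.norm2(x + self.ff(x))
```

Because the adapter preserves its input shape, it can be dropped in after the encoder without altering any downstream interface.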

2. Mathematical Formulation

Graph convolution is formalized as $H^{(l+1)} = \sigma\bigl(\widetilde{A}\, H^{(l)} W^{(l)}\bigr)$, where $H^{(l)} \in \mathbb{R}^{N\times F_l}$, $W^{(l)}$ are layer weights, and $\sigma$ is ReLU. From the spectral perspective, $g_\theta \star x \approx U\, g_\theta(\Lambda)\, U^T x$, where $L = I - D^{-1/2} A D^{-1/2} = U \Lambda U^T$ is the normalized graph Laplacian.
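The stacked propagation rule, followed by the global average pooling described earlier, might look like this (illustrative NumPy):

```python
import numpy as np

def gcn_forward(X, A_tilde, weights):
    """Stacked graph convolutions H^{(l+1)} = ReLU(A_tilde H^{(l)} W^{(l)}),
    then global average pooling over the node (channel) axis."""
    H = X
    for W in weights:
        H = np.maximum(A_tilde @ H @ W, 0.0)  # propagate + ReLU
    return H.mean(axis=0)                     # global average pooling
```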

The training objective is the VAE evidence lower bound (ELBO):

$$L_{\mathrm{ELBO}} = \mathbb{E}_{q_\phi(z|X)}\bigl[\log p_\theta(X \mid z)\bigr] - D_{\mathrm{KL}}\bigl(q_\phi(z \mid X)\,\big\|\,p(z)\bigr)$$

where the Kullback–Leibler divergence is computed separately for the subject and residual latents ($KL^S$, $KL^T$). Mean squared error (MSE) serves as the reconstruction loss: $L_\mathrm{rec} = \|X - \hat{X}\|_2^2$.
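The reconstruction and per-split KL terms follow directly from these formulas; the choice of reduction over batch and dimensions here is an assumption:

```python
import torch

def mse_reconstruction(x, x_hat):
    """L_rec = ||X - X_hat||_2^2, averaged over the batch."""
    return ((x - x_hat) ** 2).sum(dim=-1).mean()

def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ), computed per latent
    split: -0.5 * sum(1 + log sigma^2 - mu^2 - sigma^2)."""
    return (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)).mean()
```

Calling `kl_to_standard_normal` once on $(\mu^S, \sigma^S)$ and once on $(\mu^T, \sigma^T)$ yields the $KL^S$ and $KL^T$ terms.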

Contrastive learning employs an NT-Xent/CLIP formulation over each latent space:

$$L_{\mathrm{NT\text{-}Xent}}(k) = -\log \frac{\exp\bigl(\mathrm{sim}(z'_k, z''_k)/\tau\bigr)}{\sum_{i\neq k}\exp\bigl(\mathrm{sim}(z'_k, z''_i)/\tau\bigr)}$$

with $L_{\mathrm{CLIP}}$ averaging the per-split terms. Separate contrastive losses operate on the subject and residual latents ($L_{\mathrm{CLIP}}^S$, $L_{\mathrm{CLIP}}^T$).
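A common implementation expresses this loss as cross-entropy over a cosine-similarity matrix; note that this standard form includes the positive pair in the denominator, a slight variant of the formula above:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent over K positive pairs (z1[k], z2[k]); every other sample
    in the second view serves as a negative. Cross-entropy form with
    the positive included in the denominator."""
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / tau          # (K, K) cosine similarities
    targets = torch.arange(z1.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)
```

Applied once to subject-matched views and once to task-matched views, this yields $L_{\mathrm{CLIP}}^S$ and $L_{\mathrm{CLIP}}^T$.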

Total loss:

$$L = L_\mathrm{rec} + \beta_S\, KL^S + \beta_T\, KL^T + \lambda_S\, L^S_{\mathrm{CLIP}} + \lambda_T\, L^T_{\mathrm{CLIP}}$$

where $\beta_S = \beta_T = 1$ and $\lambda_S, \lambda_T$ are tuned on a validation set.

3. Contrastive and Split-Latent Learning

Within each batch, $K$ subjects are selected and two non-overlapping EEG epochs are sampled per subject, yielding $2K$ samples. Positive pairs in $L^S_{\mathrm{CLIP}}$ share a subject (but not necessarily a task), while $L^T_{\mathrm{CLIP}}$ uses task-matched pairs; negative pairs involve different subjects or tasks. This split-latent contrastive strategy disentangles subject-specific identity from residual variation, directly enhancing identification performance.
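The pairing scheme can be sketched as follows (epochs represented as opaque objects; this is not the authors' data pipeline):

```python
import random

def sample_contrastive_batch(epochs_by_subject, k, rng=random):
    """Select K subjects and two non-overlapping epochs each,
    yielding 2K samples arranged as two aligned views."""
    subjects = rng.sample(sorted(epochs_by_subject), k)
    view1, view2 = [], []
    for s in subjects:
        e1, e2 = rng.sample(epochs_by_subject[s], 2)  # without replacement
        view1.append(e1)
        view2.append(e2)
    return subjects, view1, view2
```

Subject-positive pairs are `(view1[k], view2[k])` for the same index; all cross-index pairs act as negatives.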

Computing the contrastive losses and the VAE ELBO in parallel accelerates convergence and improves generalization, as shown by the ablation studies: removing the split-latent design, the GCNN layers, or contrastive learning reduces subject identification accuracy by 8–9% absolute on ERP-Core.

4. Adapter-Based Subject Adaptive Fine-Tuning

Subject-specific transfer is handled by adapter modules placed after the final Transformer encoder layer, each containing a multi-head self-attention block and a two-layer feed-forward subnetwork (with ReLU), wrapped in residual and layer-norm connections as in standard Transformers. During adaptation, the core encoder and decoder parameters are frozen; only the adapters are updated, over 20 epochs of fine-tuning (batch size 256, learning rate 1e−4). This procedure updates only ~1% of the model's parameters, enabling efficient, scalable personalization to unseen subjects with minimal computational overhead.

For new subject adaptation, 70% of the data are used for fine-tuning adapters, with performance evaluated on the remaining 30%. This approach permits rapid deployment to new individuals without full retraining.
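Freezing everything except the adapters reduces to toggling `requires_grad` per parameter; a sketch, assuming adapter modules carry "adapter" in their parameter names:

```python
import torch.nn as nn

def freeze_for_adapter_finetuning(model):
    """Freeze all parameters except those belonging to adapter modules;
    return the fraction of parameters left trainable."""
    total = trainable = 0
    for name, p in model.named_parameters():
        p.requires_grad = "adapter" in name
        total += p.numel()
        if p.requires_grad:
            trainable += p.numel()
    return trainable / total
```

The returned fraction makes the "~1% trainable" claim easy to verify for any concrete model instance.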

5. Empirical Results and Comparative Evaluation

GC-VASE achieves state-of-the-art performance for subject identification on two benchmarks:

  • ERP-Core (40 subjects, 6 ERP paradigms, 1s epochs, 30 channels): 89.81% subject balanced accuracy (zero-shot), exceeding CSLP-AE by 9.49%. After adapter-based fine-tuning, accuracy rises to 90.31%.
  • SleepEDFx-20 (20 subjects, 30s windows): 70.85% subject balanced accuracy, outperforming CSLP-AE (67.55%) and LaBraM (59.42%).

Paradigm-wise ERP-Core balanced accuracy reveals maximal performance on N400 (98.87%), and moderate to low accuracy on P3 (57.67%), ERN (59.89%), N2pc (50.49%), N170 (36.32%), and MMN (41.03%).

Ablation studies indicate strong dependence on the presence of GCNN layers, contrastive loss, and the split-latent VAE structure, with the most severe degradation (−8.97% absolute) upon omission of contrastive learning.

Table: Summary of Comparative Results

Dataset        GC-VASE (zero-shot)   After adapter fine-tuning   CSLP-AE   LaBraM
ERP-Core       89.81%                90.31%                      80.32%    n/a
SleepEDFx-20   70.85%                n/a                         67.55%    59.42%

6. Applications, Limitations, and Future Directions

GC-VASE is positioned for biometric identification, personalized brain-computer interface design, and precision diagnostics. Its modular fine-tuning mechanism and robust subject representations facilitate deployment in settings with evolving user populations.

A primary limitation is the reliance on an explicitly designed graph connectivity (adjacency matrix); performance can also be sensitive to hyperparameters such as the temperature $\tau$ and the allocation of dimensions between the two latent spaces. Subject adaptation requires a modest amount of per-user data, though its computational cost is minimal.

Proposed future directions include (i) integrating knowledge distillation from large-scale, self-supervised EEG foundation models, (ii) exploring dynamic, time-varying graphs for sensor relationships, and (iii) achieving zero-shot adaptation via meta-learning or prompt-style adapters (Mishra et al., 13 Jan 2025).

A plausible implication is that advances in these directions could further enhance adaptability and generalization in cross-population or real-time settings, subject to continuing research in graph-based representation learning.
