
LaBraM Encoder for EEG Representation

Updated 15 September 2025
  • The LaBraM-based encoder is an advanced neural architecture that learns universal EEG representations via unsupervised pre-training on diverse EEG data.
  • It segments raw EEG signals into patches and applies vector-quantized spectral prediction with a modified Transformer to capture temporal and spatial dependencies.
  • Empirical evaluations on tasks like abnormal detection and event classification highlight its robust performance and scalability across varied EEG configurations.

The LaBraM-based encoder represents an advanced neural architecture specifically designed to learn generic representations from heterogeneous EEG signals in brain-computer interface (BCI) applications. Addressing the variability and limited scale typically associated with EEG datasets, the Large Brain Model (LaBraM) achieves universal perceptual capabilities through unsupervised pre-training on a diverse, sizable corpus of EEG data, and demonstrates robust performance across multiple downstream tasks including abnormal detection, event classification, emotion recognition, and gait prediction.

1. Architectural Framework of the LaBraM Encoder

LaBraM is constructed upon a neural Transformer architecture engineered for flexibility in handling raw EEG data, irrespective of channel count or sample length. Input EEG signals, represented as $X \in \mathbb{R}^{C \times T}$ where $C$ denotes the number of channels and $T$ the number of timestamps, are first segmented into patches via channel-wise fixed-length windows. Each patch undergoes temporal encoding through stacked 1-D convolutional blocks utilizing group normalization and GELU activation functions to extract temporal features. Output patch embeddings $e_{(c,k)} \in \mathbb{R}^{d}$ are subsequently enriched with learnable temporal ($te_k$) and spatial ($se_c$) positional embeddings:

$e_{(c,k)} + te_k + se_c$

This composite embedding sequence is transferred to a Transformer encoder employing patch-wise self-attention, with notable modifications such as layer normalization on queries and keys prior to attention and omission of bias terms. These adjustments are introduced to enhance training stability and speed, permitting the modeling of temporal dynamics and spatial relationships across varied electrode arrangements.
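
A minimal PyTorch sketch may make this stage concrete. The kernel sizes, strides, block depth, and mean-pooling below are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class TemporalPatchEncoder(nn.Module):
    """Sketch of the temporal encoder: stacked 1-D conv blocks with
    GroupNorm and GELU map each length-w patch to a d-dim embedding."""
    def __init__(self, w: int = 200, d: int = 200, num_groups: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, d, kernel_size=15, stride=8, padding=7),
            nn.GroupNorm(num_groups, d),
            nn.GELU(),
            nn.Conv1d(d, d, kernel_size=3, padding=1),
            nn.GroupNorm(num_groups, d),
            nn.GELU(),
        )

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        B, N, w = patches.shape                        # (batch, patches, w)
        h = self.conv(patches.reshape(B * N, 1, w))    # (B*N, d, w')
        return h.mean(dim=-1).reshape(B, N, -1)        # pool time -> (B, N, d)

# Composite embedding e_{(c,k)} + te_k + se_c; shapes assumed for illustration
B, C, K, w, d = 2, 64, 10, 200, 200
enc = TemporalPatchEncoder(w=w, d=d)
patches = torch.randn(B, C * K, w)             # channel-major (c, k) patch grid
te = nn.Embedding(K, d)                        # learnable temporal embeddings te_k
se = nn.Embedding(C, d)                        # learnable spatial embeddings se_c
k_idx = torch.arange(K).repeat(C)              # window index k for each patch
c_idx = torch.arange(C).repeat_interleave(K)   # channel index c for each patch
tokens = enc(patches) + te(k_idx) + se(c_idx)  # sequence fed to the Transformer
```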

2. Channel Patch Segmentation and Standardization

Central to cross-dataset operation is the methodology for segmenting input EEG signals into channel-specific patches. Given the channel heterogeneity and variable sample durations that characterize EEG corpora, each channel signal is divided into non-overlapping windows of fixed length $w$:

$x = \{ x_{(c_i, k)} \in \mathbb{R}^w \mid i = 1,\ldots,C;\ k = 1,\ldots,\left\lfloor \frac{t}{w} \right\rfloor \}$

This paradigm is reminiscent of techniques in vision models, where patch embeddings facilitate the standardization of inputs. Consequently, regardless of the original electrode topology or temporal duration, each EEG record is represented as a structured sequence of patches, thereby supporting uniform downstream modeling.
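
A short NumPy sketch of this segmentation, under the assumption that trailing samples beyond the last full window are simply discarded:

```python
import numpy as np

def segment_into_patches(X: np.ndarray, w: int) -> np.ndarray:
    """Split an EEG record X of shape (C, T) into non-overlapping
    channel-wise windows of length w. Returns shape (C, floor(T/w), w)."""
    C, T = X.shape
    K = T // w
    return X[:, : K * w].reshape(C, K, w)   # drop trailing partial window

# e.g. a 64-channel, 10 s recording at 200 Hz with 1 s patches
X = np.random.randn(64, 2000)
patches = segment_into_patches(X, w=200)    # (64, 10, 200): 640 patches total
```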

3. Neural Tokenization via Vector-Quantized Neural Spectrum Prediction

Prior to masked modeling, each patch embedding derived from the temporal encoder is discretized via vector quantization. A codebook $\mathcal{V} = \{v_i \mid i = 1,\ldots,K\} \subset \mathbb{R}^{K \times D}$ is learned, and each patch representation $p_i$ is quantized to its nearest codebook entry:

$z_i = \arg\min_j \left\| \ell_2(p_i) - \ell_2(v_j) \right\|_2$

where $\ell_2(\cdot)$ denotes $\ell_2$-normalization, so the lookup is effectively a cosine-similarity match.
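
A minimal PyTorch sketch of this lookup; codebook learning details (straight-through gradients, codebook-update losses) are omitted here:

```python
import torch
import torch.nn.functional as F

def quantize(p: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Nearest-neighbour tokenization: p is (N, D) patch representations,
    codebook is the (K, D) learned codebook V. Returns indices z, shape (N,)."""
    p_n = F.normalize(p, dim=-1)           # l2(p_i)
    v_n = F.normalize(codebook, dim=-1)    # l2(v_j)
    dists = torch.cdist(p_n, v_n)          # pairwise ||l2(p_i) - l2(v_j)||_2
    return dists.argmin(dim=-1)            # z_i = argmin_j ...
```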

Distinctively, LaBraM eschews direct time-domain reconstruction in favor of neural spectrum reconstruction. The discrete Fourier transform (DFT) is applied to each patch $x$ of length $N = w$:

$\tilde{x}_{(c,k)}^m = \sum_{n=1}^{N} x[n] \exp\left(-\frac{2\pi j}{N}\, m n\right)$

Amplitude ($A^m$) and phase ($\phi^m$) are computed for the spectral components:

$A^m = \sqrt{[\mathrm{Re}(\tilde{x}_{(c,k)}^m)]^2 + [\mathrm{Im}(\tilde{x}_{(c,k)}^m)]^2}$

$\phi^m = \arctan\left(\frac{\mathrm{Im}(\tilde{x}_{(c,k)}^m)}{\mathrm{Re}(\tilde{x}_{(c,k)}^m)}\right)$
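
These targets can be computed with a standard FFT routine. The sketch below uses NumPy's one-sided real FFT; note that np.angle is the quadrant-aware form of the arctangent above, and the paper's exact spectral normalization is not reproduced:

```python
import numpy as np

def spectral_targets(patch: np.ndarray):
    """Amplitude and phase targets for one length-w time-domain patch."""
    spec = np.fft.rfft(patch)      # one-sided DFT of the real-valued signal
    amplitude = np.abs(spec)       # A^m = sqrt(Re^2 + Im^2)
    phase = np.angle(spec)         # phi^m, quadrant-aware arctan(Im/Re)
    return amplitude, phase

amp, phase = spectral_targets(np.random.randn(200))   # one w = 200 patch
```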

A neural decoder composed of Transformer blocks predicts these spectral quantities, minimizing the mean squared error between predictions and the true spectrum values; additional loss terms govern codebook updates. This process compresses highly variable, noise-prone raw EEG signals into semantically expressive discrete neural codes that capture salient spectral features.

4. Masked EEG Modeling and Transformer Pre-Training

LaBraM applies a pre-training objective analogous to masked language modeling. A subset of EEG channel patches is randomly masked by replacement with a learned mask token $e_M$. The masked sequence, with temporal and spatial positional information added, passes through the Transformer encoder, which is tasked with recovering the original neural tokens. Denoting the masking configuration as $\mathcal{M} = \{ m_i \in \{0, 1\} \}$ and the Transformer output as $h_i$, the neural code class probabilities are given by:

$p(v_i \mid e^{\mathcal{M}}) = \mathrm{softmax}(\mathrm{Linear}(h_i))$

The masked modeling loss function is:

$\mathcal{L}_{\mathcal{M}} = -\sum_{x \in \mathcal{D}} \sum_{m_i = 1} \log p(v_i \mid e^{\mathcal{M}})$

A symmetric masking regime is introduced, in which the primary mask and its complement are applied concurrently, enhancing data diversity and improving computational efficiency during training.
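
A condensed sketch of this objective with the symmetric regime folded in. The mask ratio, the zero-vector stand-in for the learned token $e_M$, and the generic encoder/head callables are simplifying assumptions:

```python
import torch
import torch.nn.functional as F

def masked_modeling_loss(embeddings, tokens, encoder, head, mask_ratio=0.5):
    """embeddings: (B, N, d) patch embeddings with positional info added;
    tokens: (B, N) discrete neural codes from the tokenizer;
    encoder/head: callables mapping embeddings to (B, N, K) code logits."""
    B, N, d = embeddings.shape
    mask = torch.rand(B, N) < mask_ratio   # primary mask M

    def loss_for(m):
        # replace masked positions with a stand-in for the learned token e_M
        x = torch.where(m.unsqueeze(-1), torch.zeros(d), embeddings)
        logits = head(encoder(x))          # p(v_i | e^M) before softmax
        # cross-entropy over masked positions = -sum log p(v_i | e^M)
        return F.cross_entropy(logits[m], tokens[m])

    # symmetric regime: train on the mask and its complement in one pass
    return loss_for(mask) + loss_for(~mask)
```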

5. Empirical Performance Across Downstream Tasks

LaBraM demonstrates strong empirical results over a suite of BCI-relevant tasks. On the TUAB dataset for abnormal EEG detection, the LaBraM-Base model ($5.8$M parameters) reports the following metrics:

Task/Dataset               Balanced Accuracy   AUC-PR    AUROC
TUAB Abnormal Detection    81.40%              0.8965    0.9022

These scores surpass specialized state-of-the-art comparators. Performance gains also manifest in TUEV event-type classification (balanced accuracy $64.09\%$), with improvements quantified by Cohen's Kappa and Weighted F1 metrics. LaBraM's larger variants (LaBraM-Large, LaBraM-Huge) yield reduced standard deviations and further enhanced results, signaling effective scaling. Notably, the approach generalizes well across disparate EEG configurations and tasks.

6. Addressing Heterogeneity and Low Signal-to-Noise Ratio

EEG analysis is traditionally challenged by the limited availability of labeled data, high inter-dataset heterogeneity, and pronounced noise. LaBraM leverages a large, diverse, unlabeled EEG corpus (over $2,500$ hours from $20$ datasets) during its unsupervised pre-training phase. Patch-wise segmentation, paired with adaptable spatial and temporal embeddings, allows the encoder to ingest any EEG montage and recording duration. The reliance on neural spectral tokenization rather than direct time-domain reconstruction enhances the encoder's capacity to distill physiologically meaningful features from noisy measurements.

7. Mathematical Formalization and Quantitative Descriptors

Key mathematical relations formalize the operations within the LaBraM-based encoder:

  • Number of patches: $|x| = C \left\lfloor \frac{t}{w} \right\rfloor$ (a worked example follows this list)
  • Vector quantization for neural tokenization: $z_i = \arg\min_j \left\| \ell_2(p_i) - \ell_2(v_j) \right\|_2$
  • DFT for spectral encoding: $\tilde{x}_{(c,k)}^m = \sum_{n=1}^{N} x[n] \exp\left(-\frac{2\pi j}{N}\, m n\right)$
  • Amplitude/phase: $A^m$, $\phi^m$
  • Training objective: $\mathcal{L}_{\mathcal{M}} = -\sum_{x \in \mathcal{D}} \sum_{m_i = 1} \log p(v_i \mid e^{\mathcal{M}})$
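
As an illustrative check with assumed values: a 64-channel, 10-second recording sampled at 200 Hz, segmented with $w = 200$, yields $|x| = 64 \cdot \lfloor 2000 / 200 \rfloor = 640$ patches.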

These quantified principles encode both the data transformation workflow and the learning objectives that underlie robust EEG representation discovery.


The LaBraM-based encoder synthesizes patch-wise segmentation, neural tokenization through spectral prediction, and masked Transformer modeling to achieve versatile, high-fidelity EEG representations. The architecture's capacity to generalize across varying data topologies and achieve state-of-the-art performance across diverse tasks demonstrates its utility and adaptability in EEG-based research and BCI applications.
