SS-MixNet: Spectral-Spatial Mixer Networks
- SS-MixNet is a spectral-spatial mixer network that fuses local feature extraction with global MLP mixers to enhance hyperspectral and MI-EEG classification.
- It integrates 3D convolutions and depthwise attention to achieve superior per-pixel accuracy and robust cross-subject performance in low-label settings.
- The architecture employs parallel spectral and spatial mixing strategies, yielding efficient computation and additive improvements over baseline models.
SS-MixNet (Spectral-Spatial Mixer Network) refers to a class of neural network architectures that exploit both spectral and spatial features through mixer-based designs. These models, prominent in hyperspectral image (HSI) analysis and motor imagery EEG (MI-EEG) classification, bridge local and long-range dependencies in multi-dimensional data and often employ mixers or multi-layer perceptrons (MLPs), convolutional layers, and lightweight attention mechanisms for efficient and effective representation learning. Notably, two representative but domain-distinct incarnations of SS-MixNet are detailed in the context of HSI classification (Alkhatib, 19 Nov 2025) and MI-EEG classification (Autthasan et al., 6 Sep 2024).
1. Architectural Principles
SS-MixNet architectures typically follow a three-stage pipeline, integrating local feature extraction, spectral-spatial mixing, and refined attention mechanisms.
- Local Spectral-Spatial Feature Extraction: In HSI, local 3D convolutions produce low-level embeddings that preserve both spectral and spatial locality, capturing instance-specific patterns in the PCA-reduced band-patch volume (Alkhatib, 19 Nov 2025).
- Spectral and Spatial Mixers: Two parallel MLP-style mixer stacks operate along orthogonal axes:
- Spectral Mixers process feature tensors reshaped such that, for each spatial location, spectral relationships are mixed across bands via MLPs, typically stacked in depth with residual connections and GELU nonlinearities.
- Spatial Mixers permute the tensor to treat each spectral-channel pair as a token over spatial positions and apply similar MLP mixing.
- Channel-wise Attention: Depthwise convolutional attention is applied to the concatenated outputs of the two mixers. Each channel is modulated by a channel-specific convolution followed by a sigmoid gate, emphasizing informative features with minimal added parameters and computational overhead.
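The mixing-and-gating pipeline above can be sketched in NumPy. Shapes, layer widths, and the 3×3 depthwise kernel below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_mix(x, axis, hidden):
    """Token-mixing MLP: mix values along `axis` with a 2-layer MLP + residual."""
    x = np.moveaxis(x, axis, -1)           # bring the mixing axis last
    d = x.shape[-1]
    w1 = rng.standard_normal((d, hidden)) * 0.02
    w2 = rng.standard_normal((hidden, d)) * 0.02
    h = x @ w1
    h = 0.5 * h * (1 + np.tanh(np.sqrt(2 / np.pi) * (h + 0.044715 * h**3)))  # GELU (tanh approx)
    y = x + h @ w2                         # residual connection
    return np.moveaxis(y, -1, axis)

def channel_gate(x):
    """Depthwise 3x3 conv per channel followed by a sigmoid gate."""
    C, H, W = x.shape
    k = rng.standard_normal((C, 3, 3)) * 0.1
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    conv = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            conv += k[:, i, j][:, None, None] * pad[:, i:i + H, j:j + W]
    return x * (1 / (1 + np.exp(-conv)))   # sigmoid-gated channel modulation

# Toy feature cube: 8 spectral bands over a 7x7 spatial patch.
feat = rng.standard_normal((8, 7, 7))
spec = mlp_mix(feat, axis=0, hidden=16)    # spectral mixer: mix across bands
spat = mlp_mix(feat.reshape(8, 49), axis=1, hidden=16).reshape(8, 7, 7)  # spatial mixer
out = channel_gate(np.concatenate([spec, spat], axis=0))
print(out.shape)                           # (16, 7, 7)
```

The two mixers run on the same input along orthogonal axes, and the gate operates on their concatenation, mirroring the parallel-branch design described above.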
In MI-EEG, SS-MixNet denotes a pipeline combining traditional filter-bank common spatial patterns (FBCSP) for spectral-spatial feature construction, with a modern multi-task learning backbone (MIN2Net), integrating autoencoding, deep metric learning, and supervised classification with an adaptive gradient blending mechanism (Autthasan et al., 6 Sep 2024).
2. Methodological Components
Hyperspectral Image Classification Pipeline
- Preprocessing: Input cubes are reduced in the spectral dimension to 15 components via PCA, then partitioned into overlapping cubic patches centered on each pixel.
- Network Details:
- Stacked 3D convolutions yield a joint low-level spectral-spatial feature tensor.
- Spectral mixing applies a two-layer MLP to each spatial location across bands, repeated over several stacked mixer layers.
- Spatial mixing permutes the feature tensor and applies analogous stacked MLPs across spatial locations.
- Concatenated feature tensors undergo depthwise channel-wise convolution, channel gating, and global average pooling before a final softmax classifier.
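The preprocessing stage (PCA band reduction followed by per-pixel patch extraction) can be sketched as below; the scene size, band count, and patch size are illustrative values, not the paper's settings:

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Reduce the spectral dimension of an (H, W, B) cube to n_components via PCA."""
    H, W, B = cube.shape
    flat = cube.reshape(-1, B).astype(np.float64)
    flat -= flat.mean(axis=0)
    # Eigendecomposition of the band covariance; keep the top components.
    cov = flat.T @ flat / (flat.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)
    top = vecs[:, np.argsort(vals)[::-1][:n_components]]
    return (flat @ top).reshape(H, W, n_components)

def extract_patches(cube, size):
    """Extract one (size, size, bands) patch per pixel, reflect-padding borders."""
    H, W, B = cube.shape
    r = size // 2
    padded = np.pad(cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    return np.stack([padded[i:i + size, j:j + size]
                     for i in range(H) for j in range(W)])

rng = np.random.default_rng(1)
hsi = rng.random((10, 12, 40))            # toy 10x12 scene with 40 bands
reduced = pca_reduce(hsi, n_components=15)
patches = extract_patches(reduced, size=5)
print(reduced.shape, patches.shape)       # (10, 12, 15) (120, 5, 5, 15)
```

Each patch then becomes one training sample for the per-pixel classifier, with the label taken from the patch's center pixel.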
MI-EEG Spectral–Spatial Preprocessing and MixNet
- FBCSP Feature Extraction: MI-EEG trials are band-pass filtered into multiple frequency bands. CSP is solved for each band to create class-conditional spatial filters; the spectral-spatial features are stacks of CSP-filtered time series across all bands, presented as a single feature tensor per trial.
- Multi-Task Learning with MIN2Net: The architecture includes a convolutional autoencoder for reconstruction, a deep metric learning head using semi-hard triplet loss, and a supervised classification head. Adaptive gradient blending regulates task-specific loss contributions by tracking generalization-overfitting curves per task.
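A minimal sketch of the CSP step inside FBCSP for a two-class band, using the standard generalized-eigenvalue formulation; the channel count, trial count, and number of filters here are illustrative, not the paper's configuration:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_filters=4):
    """Common Spatial Patterns for two classes.

    trials_*: arrays of shape (n_trials, n_channels, n_samples).
    Returns (n_filters, n_channels) spatial filters that maximize the
    variance ratio between the two classes.
    """
    def mean_cov(trials):
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]  # normalized covariances
        return np.mean(covs, axis=0)

    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # Generalized eigenproblem: Ca w = lambda (Ca + Cb) w.
    vals, vecs = eigh(Ca, Ca + Cb)
    order = np.argsort(vals)
    # Take filters from both ends of the spectrum (most discriminative).
    half = n_filters // 2
    idx = np.concatenate([order[:half], order[-half:]])
    return vecs[:, idx].T

rng = np.random.default_rng(2)
a = rng.standard_normal((20, 8, 250))      # 20 trials, 8 channels, 250 samples
b = rng.standard_normal((20, 8, 250))
W = csp_filters(a, b)
filtered = W @ a[0]                        # CSP-filtered time series for one trial
print(W.shape, filtered.shape)             # (4, 8) (4, 250)
```

In FBCSP this procedure is repeated per frequency band, and the filtered time series from all bands are stacked into the spectral-spatial input tensor for the downstream network.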
3. Training Protocols and Datasets
HSI SS-MixNet (Alkhatib, 19 Nov 2025)
- Datasets: Evaluated on QUH-Tangdaowan (18 land-cover classes, 200 bands, PCA-reduced to 15) and QUH-Qingyun (6 urban classes, PCA-reduced to 15).
- Patch Extraction: overlapping cubic patches are extracted around each labeled pixel.
- Splits: 1% training, 1% validation, 98% testing, stratified per class.
- Optimization: Cross-entropy loss, Adam optimizer, early stopping (patience 10). Batch size 64; convergence typically within 70–80 of the maximum 100 epochs.
MI-EEG SS-MixNet (Autthasan et al., 6 Sep 2024)
- Datasets: Six standard MI-EEG datasets (BCIC-2a, BNCI2015, SMR-BCI, High-Gamma, OpenBMI, BCIC-IV-2b), evaluated in subject-dependent (SD) and subject-independent (SI) settings, with both high-density and 3-channel low-density montages.
- Multi-Task Loss: The combined loss is a weighted sum of the per-task losses,

  $$\mathcal{L}_{\text{total}} = \sum_{k} w_k \,\mathcal{L}_k,$$

  where the $w_k$ are normalized weights updated per task and per epoch via the gradients of the validation and training loss trajectories.
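The adaptive weighting can be sketched as follows. This follows the generic gradient-blending idea of scoring each task by its generalization-to-overfitting ratio; the exact estimator used in the paper may differ, and the loss trajectories below are toy values:

```python
import numpy as np

def blend_weights(train_loss, val_loss, prev_train, prev_val, eps=1e-8):
    """Per-task weights from generalization vs. overfitting trends.

    Each argument holds one entry per task. Generalization is measured
    as the drop in validation loss; overfitting as the growth of the
    train-validation gap. Tasks that generalize well get larger weights.
    """
    gen = prev_val - val_loss                                 # validation improvement
    ovf = (val_loss - train_loss) - (prev_val - prev_train)   # gap growth
    raw = np.maximum(gen, 0) / (np.maximum(ovf, 0) ** 2 + eps)
    return raw / (raw.sum() + eps)                            # normalize weights

# Toy trajectories for 3 tasks: reconstruction, triplet, classification.
w = blend_weights(
    train_loss=np.array([0.40, 0.30, 0.50]),
    val_loss=np.array([0.55, 0.35, 0.52]),
    prev_train=np.array([0.50, 0.40, 0.60]),
    prev_val=np.array([0.60, 0.50, 0.70]),
)
print(w.round(3))  # nonnegative weights that sum to ~1
```

Re-estimating these weights each epoch lets the reconstruction, metric-learning, and classification heads trade off influence as their generalization behavior evolves during training.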
4. Quantitative Results
HSI Classification
| Dataset | SS-MixNet OA | Best Baseline OA | #Params | FLOPs |
|---|---|---|---|---|
| Tangdaowan | 95.68% | 94.50% (3D-CNN) | 140,914 | 1.93M |
| Qingyun | 93.86% | 93.14% (IP-SWIN) | — | — |
- SS-MixNet achieves the highest overall accuracy (OA), average accuracy (AA), and kappa statistics on both benchmarks.
- SS-MixNet attains the best per-class accuracy on a majority of classes (11 of 18 on Tangdaowan).
- The architecture exhibits sharper class boundaries and reduced salt-and-pepper noise in classification maps compared to all tested baselines (2D-CNN, 3D-CNN, IP-SWIN, SimPoolFormer, HybridKAN) (Alkhatib, 19 Nov 2025).
- Computational cost is the lowest among the compared models (0.77M MACs), while SS-MixNet still surpasses larger models such as SimPoolFormer (771K parameters, 57.5M FLOPs) in accuracy.
MI-EEG Classification
| Dataset | SD Acc. (MixNet) | SI Acc. (MixNet) |
|---|---|---|
| BCIC-2a | 77.6% ±15.1 | 69.4% ±11.8 |
| BNCI2015 | 80.0% ±13.2 | 66.2% ±11.7 |
| SMR-BCI | 76.3% ±16.5 | 66.1% ±14.0 |
| High-Gamma | 80.2% ±15.3 | 69.8% ±10.8 |
| OpenBMI | 68.9% ±16.8 | 72.0% ±14.2 |
For three-channel EEG (BCIC-IV-2b), MixNet achieves 77.1% (SD) and 75.7% (SI) accuracy, outperforming all reference models by 2–10% on average (Autthasan et al., 6 Sep 2024).
5. Component Contribution and Ablation Findings
- HSI Ablation:
- Adding the spectral mixer to the 3D-CNN baseline raises OA from 94.20% to 95.07%.
- Adding only the spatial mixer provides 94.89% OA.
- Combining both mixers achieves 95.38% OA.
- The full model, including the channel-wise depthwise attention, attains the final OA of 95.68% (Alkhatib, 19 Nov 2025).
- This indicates that each module contributes an additive gain over the baseline.
- MI-EEG Ablation:
- The FBCSP spectral-spatial preprocessing is essential for encoding discriminative patterns.
- The MIN2Net multi-task module, with adaptive weighting, provides robust generalization across dense and sparse EEG montages.
6. Implementation and Reproducibility
Reference code for HSI SS-MixNet is scheduled for public release at https://github.com/mqalkhatib/SS-MixNet and is structured for exact reproducibility (fixed seeds, comprehensive data and model scripts) (Alkhatib, 19 Nov 2025).
- Main modules:
  - `data_loader.py` (augmentation, PCA, patch extraction, splitting)
  - `models/ss_mixnet.py` (architecture)
  - `train.py` (training, scheduling)
  - `evaluate.py` (metrics and visualization)
- Dependencies include TensorFlow 2.10.0, NumPy, and scikit-learn.
- Reproduction command:
```shell
python train.py --dataset=Tangdaowan --epochs=100 --batch_size=64
```
For MI-EEG MixNet, the architecture, loss functions, and adaptive blending algorithm are fully specified, and evaluation benchmarks and ablation settings are comprehensively reported, enabling replication and further extension (Autthasan et al., 6 Sep 2024).
7. Context and Significance
SS-MixNet architectures advance the state of the art by integrating local, high-dimensional feature extraction with global mixer-based modeling, while maintaining computational tractability. In HSI, this yields robust per-pixel classification under extremely low supervision (1% labeled data). In MI-EEG, the combination of classical spectral–spatial methods with multi-task deep learning and adaptive loss blending enhances cross-subject generalization and supports low-density EEG settings. These methodologies demonstrate strict improvements over baselines using either classical, CNN, or transformer-based alternatives. A plausible implication is that spectral-spatial mixer paradigms, as exemplified by SS-MixNet, can serve as a blueprint for efficient, high-accuracy modeling in other domains characterized by multimodal or multi-axis structure.
References:
- (Alkhatib, 19 Nov 2025) Hyperspectral Image Classification using Spectral-Spatial Mixer Network
- (Autthasan et al., 6 Sep 2024) MixNet: Joining Force of Classical and Modern Approaches Toward the Comprehensive Pipeline in Motor Imagery EEG Classification