GateFuseNet: Adaptive Fusion in PD Diagnosis
- GateFuseNet is a deep learning framework for adaptive multimodal fusion of neuroimaging data, integrating T1-weighted MRI, QSM, and ROI guidance to enhance PD diagnosis.
- It employs a hierarchical gated fusion module with anatomy-aware ROI residuals to selectively emphasize clinically relevant features and suppress irrelevant signals.
- The framework achieves state-of-the-art results, with 85.00% accuracy and a 0.9206 AUC, outperforming existing neuroimaging fusion architectures.
GateFuseNet is a deep learning framework for adaptive multimodal fusion of neuroimaging data in the diagnosis of Parkinson's disease (PD). It directly addresses limitations of conventional magnitude-based magnetic resonance imaging (MRI) approaches by integrating both T1-weighted anatomical MRI and Quantitative Susceptibility Mapping (QSM), a phase-based modality sensitive to iron deposition in deep gray matter structures implicated in PD pathology. The innovation centers on a hierarchical gated fusion (GF) module that operates jointly with an anatomy-aware region-of-interest (ROI) guidance mechanism to selectively enhance clinically relevant features while suppressing irrelevant signals. GateFuseNet attains state-of-the-art diagnostic accuracy and interpretability, outperforming existing neuroimaging fusion architectures in direct comparison (Jin et al., 26 Oct 2025).
1. Multimodal and ROI-aware Input Backbone
GateFuseNet processes three volumetric inputs: (1) QSM, (2) T1-weighted MRI, and (3) a binary deep gray matter (DGM) ROI mask. The ROI mask targets nuclei with established clinical relevance in PD: substantia nigra (SN), putamen, caudate, globus pallidus (GP), and subthalamic nucleus (STN). Each input passes through an identical stem module, consisting of three stacked 3×3×3 convolutions (each with ELU activation and batch normalization) followed by 2×2×2 max pooling, which produces one feature representation per input branch (Jin et al., 26 Oct 2025).
The stem is followed by three successive fusion modules. Within each, the modality-specific features pass through parallel CBAM-augmented bottlenecks and are then merged by the gated fusion mechanism. The fused output is added as a residual to the ROI branch only, which makes that branch a hierarchical anatomical anchor for progressive feature refinement.
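As a rough illustration of how the stem changes spatial size, the sketch below tracks a volume's shape through three 3×3×3 convolutions and one 2×2×2 max-pooling step. The convolution padding of 1 and stride of 1, the pooling stride of 2, and the example input size are all assumptions made here for illustration; the source does not state them.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output length along one axis for a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output length along one axis for max pooling without padding."""
    return (size - kernel) // stride + 1


def stem_output_shape(shape, n_convs=3):
    """Spatial shape after n_convs 3x3x3 convolutions and one 2x2x2 max pool."""
    for _ in range(n_convs):
        shape = tuple(conv_out(s) for s in shape)
    return tuple(pool_out(s) for s in shape)


example_shape = (96, 112, 96)  # hypothetical input size, not taken from the source
stem_shape = stem_output_shape(example_shape)
print(stem_shape)  # (48, 56, 48)
```

Under these assumptions the convolutions preserve spatial size and only the pooling step halves each axis.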
2. Gated Fusion Mechanism
At each fusion stage $\ell$, GateFuseNet combines attention-based adaptive fusion with channel-wise gating across modalities; this mechanism is central to the framework's discriminative power (Jin et al., 26 Oct 2025).
- Attention-based Modality Fusion (AMF):
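1. The branch features are concatenated along the channel dimension.
2. Three parallel grouped 3×3×3 convolutions, each followed by batch normalization and a sigmoid, produce modality-specific attention weights.
3. The weights are normalized per voxel so that they sum to one across modalities.
4. Spatial fusion: the branch features are combined as a voxel-wise weighted sum using the normalized weights.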
- Channel-wise Gating (CWG):
A learnable gate vector $v^\ell$ is passed through a sigmoid, yielding gating values $g^\ell$:
$g^\ell = \sigma(v^\ell) \tag{4}$
Fused features are gated channel-wise: each channel of the fused feature map is multiplied by its gate value in $g^\ell$.
- Hierarchical Residual Injection:
Only the ROI branch receives the gated fusion output, which is added to it as a residual.
This hierarchical design progressively integrates fused information while preserving anatomical priors anchored by the ROI mask.
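The three parts of a fusion stage (attention-based fusion, channel-wise gating, and residual injection into the ROI branch) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the attention logits are random stand-ins for the outputs of the three grouped convolutions, the attention maps are assumed to have one channel per modality, and all function and variable names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion_stage(f_qsm, f_t1, f_roi, attn_logits, gate_vector):
    """One fusion stage on (C, D, H, W) feature maps.

    attn_logits: (3, 1, D, H, W) pre-sigmoid attention maps, one per modality
                 (assumed shape; stands in for the grouped-convolution outputs).
    gate_vector: (C,) learnable gate parameters v.
    """
    feats = np.stack([f_qsm, f_t1, f_roi])            # (3, C, D, H, W)
    attn = sigmoid(attn_logits)                       # modality attention weights
    attn = attn / attn.sum(axis=0, keepdims=True)     # sum to one at every voxel
    fused = (attn * feats).sum(axis=0)                # voxel-wise weighted sum
    gate = sigmoid(gate_vector).reshape(-1, 1, 1, 1)  # g = sigmoid(v), per channel
    gated = gate * fused                              # channel-wise gating
    roi_next = f_roi + gated                          # residual into the ROI branch only
    return attn, gated, roi_next


rng = np.random.default_rng(0)
C, D, H, W = 4, 3, 3, 3
f_qsm, f_t1, f_roi = (rng.normal(size=(C, D, H, W)) for _ in range(3))
attn_logits = rng.normal(size=(3, 1, D, H, W))
gate_vector = rng.normal(size=C)

attn, gated, roi_next = gated_fusion_stage(f_qsm, f_t1, f_roi, attn_logits, gate_vector)
print(roi_next.shape)
```

Because the gate is a sigmoid of a learnable vector, a strongly negative entry in `gate_vector` effectively switches off the fused contribution for that channel and leaves the ROI features unchanged.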
3. DGM ROI Masking and Anatomical Priors
ROI guidance is achieved via DGM masks, generated by atlas-based registration: QSM volumes are mapped to the MuSus-100 template, transferred to the AAL3 atlas in MNI space, and then inverse-warped to the native frame. The mask encodes the presence of the five targeted nuclei as a one-hot volume. It is directly input as a third modality—no explicit ROI-specific loss is used. Instead, the residual anchoring of the ROI branch at each GF stage operationalizes anatomical constraint throughout the network. This encourages prioritized fusion and feature enhancement within clinically meaningful regions (Jin et al., 26 Oct 2025).
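Once an atlas label volume has been warped into the subject's native space, merging the target nuclei into a single mask is a simple lookup. The sketch below assumes that warping has already been done and uses made-up label IDs; the real AAL3 indices differ and should be taken from the atlas documentation. It produces a binary mask; a multi-channel encoding would stack one such mask per nucleus.

```python
import numpy as np

# Hypothetical label IDs for illustration only; not the actual AAL3 indices.
DGM_LABELS = {"SN": 1, "putamen": 2, "caudate": 3, "GP": 4, "STN": 5}


def dgm_mask(label_volume, label_ids):
    """Binary mask that is 1 wherever the label volume holds a target nucleus."""
    return np.isin(label_volume, list(label_ids)).astype(np.uint8)


labels = np.zeros((4, 4, 4), dtype=np.int32)
labels[0, 0, 0] = 1   # toy SN voxel
labels[1, 1, 1] = 4   # toy GP voxel
labels[2, 2, 2] = 9   # a non-target structure

mask = dgm_mask(labels, DGM_LABELS.values())
print(int(mask.sum()))  # 2
```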
4. Training Pipeline and Data Management
All volumes are resampled to a common voxel spacing and cropped or padded to a fixed size. Online augmentations include random affine transforms (rotation ±5°, translation ±2 voxels, and random scaling, applied with probability 0.2), random bias-field corruption (coefficient 0.3, probability 0.1), and additive Gaussian noise (probability 0.1).
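The augmentation policy amounts to three independent coin flips per sample. The sketch below only samples the parameters of each transform and does not apply them to an image. The rotation, translation, bias-field coefficient, and probabilities follow the text; the scale range and noise standard deviation are placeholder values chosen here, and drawing each axis independently is also an assumption.

```python
import random


def sample_augmentations(rng, scale_range=(0.95, 1.05), noise_std=0.01):
    """Sample which augmentations to apply and with what parameters.

    scale_range and noise_std are hypothetical placeholders, not source values.
    """
    plan = {}
    if rng.random() < 0.2:  # random affine transform
        plan["affine"] = {
            "rotation_deg": [rng.uniform(-5.0, 5.0) for _ in range(3)],
            "translation_vox": [rng.uniform(-2.0, 2.0) for _ in range(3)],
            "scale": rng.uniform(*scale_range),
        }
    if rng.random() < 0.1:  # random bias-field corruption
        plan["bias_field"] = {"coefficient": 0.3}
    if rng.random() < 0.1:  # additive Gaussian noise
        plan["noise"] = {"std": noise_std}
    return plan


rng = random.Random(0)
plans = [sample_augmentations(rng) for _ in range(10000)]
affine_rate = sum("affine" in p for p in plans) / len(plans)
print(round(affine_rate, 2))
```

With these probabilities most samples pass through unchanged, so the network sees mostly original volumes with occasional perturbations.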
The dataset comprises 316 subjects (161 PD, 155 healthy controls). Of these, 64 are reserved for independent test evaluation, and five-fold cross-validation is run on the remaining 252 (80/20 split within each fold). Optimization uses a binary focal loss with AdamW and a cosine-annealed learning rate over 30 epochs, on 2×Tesla V100 GPUs (batch size 8). Model selection per fold is based on the sum of validation AUC and F1 score (Jin et al., 26 Oct 2025).
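For readers unfamiliar with the training components, the sketch below writes out a standard binary focal loss, a cosine-annealed learning-rate schedule, and the AUC-plus-F1 selection score in plain Python. The values of `alpha`, `gamma`, `lr_max`, and `lr_min` are common defaults chosen for illustration and should not be read as the hyperparameters used in the paper; the function names are hypothetical.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class, y is 0 or 1.
    alpha and gamma are placeholder values, not taken from the source.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cosine_lr(epoch, total_epochs=30, lr_max=1e-4, lr_min=1e-6):
    """Cosine annealing from lr_max at epoch 0 to lr_min at total_epochs."""
    cos_term = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr_max - lr_min) * cos_term


def selection_score(val_auc, val_f1):
    """Checkpoint selection criterion: sum of validation AUC and F1."""
    return val_auc + val_f1


easy = binary_focal_loss(0.95, 1)   # confident, correct prediction
hard = binary_focal_loss(0.30, 1)   # uncertain prediction on a positive
schedule = [cosine_lr(e) for e in range(31)]
print(easy < hard, schedule[0], schedule[-1])
```

The `(1 - p_t)**gamma` factor is what down-weights easy examples: with `gamma = 0` and `alpha = 0.5` the expression reduces to half the ordinary binary cross-entropy.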
5. Quantitative Results and Comparative Evaluation
GateFuseNet demonstrates superior performance compared to prominent multimodal 3D classifiers:
| Model | Accuracy (%) | AUC | Precision (%) | Recall (%) | F1 (%) | Specificity (%) | AUPR |
|---|---|---|---|---|---|---|---|
| GateFuseNet | 85.00 | 0.9206 | 84.98 | 86.06 | 85.48 | 83.87 | 0.9227 |
| ResNeXt | 76.56 | 0.8594 | — | — | — | — | — |
| AG_SE_ResNeXt | 76.88 | 0.8831 | — | — | — | — | — |
| DenseFormer-MoE | 81.25 | 0.9084 | — | — | — | — | — |
GateFuseNet improves on ResNeXt by 8.44 percentage points in accuracy and 0.0612 in AUC, and leads DenseFormer-MoE by 3.75 percentage points in accuracy and 0.0122 in AUC. In ablation studies, the proposed gated fusion mechanism outperforms both weighted-average fusion (76.68% accuracy, 0.8642 AUC) and simple concatenation (78.17% accuracy, 0.8823 AUC). ROI-anchored fusion also confers clear advantages: injecting the fused features into the ROI branch (the default) attains 85.00% accuracy (0.9206 AUC), compared with injection into the T1 branch (77.06%, 0.8761 AUC) or the QSM branch (78.43%, 0.8920 AUC) (Jin et al., 26 Oct 2025).
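The reported margins are absolute differences between table entries, which the following short check reproduces from the accuracy and AUC columns. The dictionary simply restates the table; the helper name is hypothetical.

```python
# (accuracy in %, AUC) as listed in the comparison table
results = {
    "GateFuseNet": (85.00, 0.9206),
    "ResNeXt": (76.56, 0.8594),
    "AG_SE_ResNeXt": (76.88, 0.8831),
    "DenseFormer-MoE": (81.25, 0.9084),
}


def gains(reference, baseline):
    """Absolute gain of reference over baseline: (accuracy points, AUC)."""
    acc_ref, auc_ref = results[reference]
    acc_base, auc_base = results[baseline]
    return round(acc_ref - acc_base, 2), round(auc_ref - auc_base, 4)


for name in results:
    if name != "GateFuseNet":
        print(name, gains("GateFuseNet", name))
```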
6. Model Interpretability and Clinical Plausibility
Qualitative evaluation via Grad-CAM reveals that GateFuseNet’s activations during inference are concentrated in the globus pallidus and substantia nigra, hallmark loci of iron accumulation and dopaminergic neuron loss in PD. The Grad-CAM-derived heatmaps align with the input ROI masks, substantiating that the network’s decision-making leverages anatomically valid, disease-relevant regions as opposed to trivial or spurious features. This interpretability is an explicit consequence of the ROI-guided residual gating pathway and the architectural coupling between attention-driven fusion and anatomical priors (Jin et al., 26 Oct 2025).
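Grad-CAM itself is a short computation once the activations of a chosen layer and the gradients of the class score with respect to them are available. The NumPy sketch below assumes both arrays have already been extracted from the network and uses toy data. The `roi_fraction` helper is an illustrative way to quantify heatmap and mask overlap; it is introduced here and is not a metric reported in the source.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Grad-CAM heatmap from (C, D, H, W) activations and matching gradients."""
    weights = gradients.mean(axis=(1, 2, 3))           # per-channel importance
    cam = np.tensordot(weights, activations, axes=1)   # weighted sum over channels
    cam = np.maximum(cam, 0.0)                         # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def roi_fraction(cam, mask):
    """Share of total heatmap mass that falls inside the ROI mask (hypothetical metric)."""
    total = cam.sum()
    return float((cam * mask).sum() / total) if total > 0 else 0.0


# Toy example: channel 0 fires only inside a small cube and has positive gradient.
activations = np.zeros((2, 4, 4, 4))
activations[0, 1:3, 1:3, 1:3] = 1.0
gradients = np.stack([np.ones((4, 4, 4)), -np.ones((4, 4, 4))])
mask = np.zeros((4, 4, 4))
mask[1:3, 1:3, 1:3] = 1.0

cam = grad_cam(activations, gradients)
print(cam.shape, roi_fraction(cam, mask))  # (4, 4, 4) 1.0
```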
7. Conclusion and Data Accessibility
GateFuseNet establishes a robust adaptive pipeline for fusing QSM and T1-weighted MRI with explicit DGM anatomical priors through a hierarchical gated fusion paradigm. Its combination of a 3D backbone, GF blocks, ROI anchoring, and focal loss optimization yields state-of-the-art performance for PD discrimination (85.00% accuracy, 0.9206 AUC), while clinically consistent attention maps keep its diagnostic logic transparent. Source code and pretrained models are available at https://github.com/YangGaoUQ/GateFuseNet (Jin et al., 26 Oct 2025).