GateFuseNet: Adaptive Fusion in PD Diagnosis

Updated 19 March 2026
  • GateFuseNet is a deep learning framework for adaptive multimodal fusion of neuroimaging data, integrating T1-weighted MRI, QSM, and ROI guidance to enhance PD diagnosis.
  • It employs a hierarchical gated fusion module with anatomy-aware ROI residuals to selectively emphasize clinically relevant features and suppress irrelevant signals.
  • The framework achieves state-of-the-art results, with 85.00% accuracy and a 0.9206 AUC, outperforming existing neuroimaging fusion architectures.

GateFuseNet is a deep learning framework for adaptive multimodal fusion of neuroimaging data in the diagnosis of Parkinson's disease (PD). It directly addresses limitations of conventional magnitude-based magnetic resonance imaging (MRI) approaches by integrating both T1-weighted anatomical MRI and Quantitative Susceptibility Mapping (QSM), a phase-based modality sensitive to iron deposition in deep gray matter structures implicated in PD pathology. The innovation centers on a hierarchical gated fusion (GF) module that operates jointly with an anatomy-aware region-of-interest (ROI) guidance mechanism to selectively enhance clinically relevant features while suppressing irrelevant signals. GateFuseNet attains state-of-the-art diagnostic accuracy and interpretability, outperforming existing neuroimaging fusion architectures in direct comparison (Jin et al., 26 Oct 2025).

1. Multimodal and ROI-aware Input Backbone

GateFuseNet processes three volumetric inputs: (1) QSM ($X_{\mathrm{QSM}}\in\mathbb{R}^{D\times H\times W}$), (2) T1-weighted MRI ($X_{\mathrm{T1}}\in\mathbb{R}^{D\times H\times W}$), and (3) a binary deep gray matter (DGM) ROI mask ($X_{\mathrm{ROI}}\in\{0,1\}^{D\times H\times W}$). The ROI mask targets nuclei with established clinical relevance in PD: substantia nigra (SN), putamen, caudate, globus pallidus (GP), and subthalamic nucleus (STN). Each modality undergoes an identical stem module: three stacked 3×3×3 convolutions (ELU activation, batch normalization) followed by 2×2×2 max pooling, producing feature representations $x_m^0$ for each input branch, with $m \in \{\mathrm{ROI},\mathrm{QSM},\mathrm{T1}\}$ (Jin et al., 26 Oct 2025).

Subsequent layers include three successive fusion modules. Within each, the modality-specific features are processed through parallel CBAM-augmented bottlenecks, before being merged using the gated fusion mechanism. The fusion output links residually to the ROI branch specifically, forming a hierarchical anatomical anchor for progressive feature refinement.

2. Gated Fusion Mechanism

At each fusion stage $\ell$, GateFuseNet employs an explicit attention-based adaptive fusion and channel-wise gating across modalities; this mechanism is central to the framework's discriminative power (Jin et al., 26 Oct 2025).

  • Attention-based Modality Fusion (AMF):

1. Concatenation of branch features along channels:

$$X^\ell = [x_{\mathrm{ROI}}^\ell \,\|\, x_{\mathrm{QSM}}^\ell \,\|\, x_{\mathrm{T1}}^\ell]$$

2. Three parallel grouped 3×3×3 convolutions with batch normalization and sigmoid produce modality-specific attention weights:

$$a_m^\ell = \sigma\bigl(\mathrm{BN}(\mathrm{Conv}_{3^3}(X^\ell))\bigr), \quad m\in\{\mathrm{ROI},\mathrm{QSM},\mathrm{T1}\} \tag{1}$$

3. Weights are normalized per voxel to sum to one:

$$\sum_{m} a_m^\ell(i,j,k) = 1, \quad \forall\,i,j,k \tag{2}$$

4. Spatial fusion:

$$f^\ell = \sum_{m} a_m^\ell \odot x_m^\ell \tag{3}$$

  • Channel-wise Gating (CWG):
    • A learnable gate vector $v^\ell\in\mathbb{R}^{C_\ell}$ is passed through a sigmoid, yielding gating values $g^\ell\in(0,1)^{C_\ell}$:

      $$g^\ell = \sigma(v^\ell) \tag{4}$$

    • Fused features are gated channelwise:

      $$\tilde f^\ell = g^\ell \odot f^\ell$$

  • Hierarchical Residual Injection:
    • Only the ROI branch receives the gated fusion residually:

      $$x_{\mathrm{ROI}}^{\ell+1} = x_{\mathrm{ROI}}^\ell + \tilde f^\ell, \quad x_{\mathrm{QSM}}^{\ell+1} = x_{\mathrm{QSM}}^\ell, \quad x_{\mathrm{T1}}^{\ell+1} = x_{\mathrm{T1}}^\ell \tag{5}$$

    • This hierarchical design progressively integrates fused information while preserving anatomical priors anchored by the ROI mask.
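The data flow of one fusion stage, Eqs. (1) to (5), can be traced with a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the grouped convolutions and batch normalization are replaced by precomputed attention logits, the attention maps are assumed to have a single channel that broadcasts over the feature channels, and the per-voxel normalization of Eq. (2) is assumed to be a division by the sum over modalities. The function name `gated_fusion_stage` and all shapes are hypothetical.

```python
import numpy as np

MODALITIES = ("ROI", "QSM", "T1")

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion_stage(x, attn_logits, v):
    """One fusion stage on plain arrays.

    x:           dict modality -> features of shape (C, D, H, W)
    attn_logits: dict modality -> pre-sigmoid attention map of shape (1, D, H, W);
                 a stand-in for BN(Conv(X)) in Eq. (1)
    v:           gate vector of shape (C,)
    """
    # Eq. (1): sigmoid attention per modality.
    a = {m: sigmoid(attn_logits[m]) for m in MODALITIES}
    # Eq. (2): make the weights sum to one at every voxel
    # (assumed here to be a plain division by the sum).
    total = sum(a[m] for m in MODALITIES)
    a = {m: a[m] / total for m in MODALITIES}
    # Eq. (3): attention-weighted sum of the branch features.
    f = sum(a[m] * x[m] for m in MODALITIES)
    # Eq. (4) and the gating step: one gate value per channel.
    g = sigmoid(v).reshape(-1, 1, 1, 1)
    f_tilde = g * f
    # Eq. (5): only the ROI branch receives the gated residual.
    x_next = dict(x)
    x_next["ROI"] = x["ROI"] + f_tilde
    return x_next, a, f_tilde

rng = np.random.default_rng(0)
C, D, H, W = 4, 2, 3, 3
x = {m: rng.normal(size=(C, D, H, W)) for m in MODALITIES}
logits = {m: rng.normal(size=(1, D, H, W)) for m in MODALITIES}
v = rng.normal(size=C)

x_next, a, f_tilde = gated_fusion_stage(x, logits, v)
```

After the call, the QSM and T1 entries of `x_next` are the unchanged input arrays, while the ROI entry differs from its input by exactly `f_tilde`.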

3. DGM ROI Masking and Anatomical Priors

ROI guidance is achieved via DGM masks, generated by atlas-based registration: QSM volumes are mapped to the MuSus-100 template, transferred to the AAL3 atlas in MNI space, and then inverse-warped to the native frame. The mask encodes the presence of the five targeted nuclei as a one-hot volume. It is directly input as a third modality—no explicit ROI-specific loss is used. Instead, the residual anchoring of the ROI branch at each GF stage operationalizes anatomical constraint throughout the network. This encourages prioritized fusion and feature enhancement within clinically meaningful regions (Jin et al., 26 Oct 2025).
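Once the atlas labels have been warped back to the native frame, the last step reduces to selecting the voxels that carry one of the target labels. The sketch below shows only that selection step and collapses the five nuclei into a single binary volume, matching the $\{0,1\}$ definition of the ROI input; registration itself is out of scope. The label IDs are hypothetical placeholders, not the actual AAL3 label values, and `dgm_mask` is an illustrative name.

```python
import numpy as np

# Hypothetical label IDs for the five target nuclei; the real values
# depend on the atlas label table and are not given in the text.
DGM_LABELS = {"SN": 1, "putamen": 2, "caudate": 3, "GP": 4, "STN": 5}

def dgm_mask(label_volume, label_ids=tuple(DGM_LABELS.values())):
    """Return a binary mask that is 1 wherever the warped atlas
    labels a voxel as one of the target nuclei."""
    return np.isin(label_volume, label_ids).astype(np.uint8)

# Toy native-space label volume: two target voxels, one non-target voxel.
labels = np.zeros((4, 4, 4), dtype=np.int32)
labels[1, 1, 1] = DGM_LABELS["SN"]
labels[2, 2, 2] = DGM_LABELS["putamen"]
labels[3, 3, 3] = 9  # some structure outside the target set

mask = dgm_mask(labels)
```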

4. Training Pipeline and Data Management

All volumes are resampled to $1\times1\times1$ mm$^3$ and cropped or padded to $128^3$ voxels. Aggressive online augmentations include random affine transforms (rotation ±5°, translation ±2 voxels, scale in $[0.9,1.1]$, probability 0.2), random bias-field corruption (coefficient 0.3, probability 0.1), and additive Gaussian noise ($\sigma=0.02$, probability 0.1).
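The crop-or-pad step can be sketched as follows. The text does not say whether cropping is centered or what value is used for padding, so the sketch assumes a centered crop and zero padding; `crop_or_pad` is a hypothetical helper name.

```python
import numpy as np

def crop_or_pad(volume, target=(128, 128, 128)):
    """Center-crop or zero-pad each axis of a 3D volume to `target`.
    Centering and zero padding are assumptions of this sketch."""
    out = volume
    for axis, size in enumerate(target):
        n = out.shape[axis]
        if n > size:
            # Too large along this axis: keep the central `size` slices.
            start = (n - size) // 2
            out = np.take(out, np.arange(start, start + size), axis=axis)
        elif n < size:
            # Too small along this axis: pad both sides with zeros.
            before = (size - n) // 2
            pad = [(0, 0)] * out.ndim
            pad[axis] = (before, size - n - before)
            out = np.pad(out, pad, mode="constant")
    return out

# One axis is cropped (140), one is padded (100), one is left alone (128).
vol = np.ones((140, 100, 128), dtype=np.float32)
fixed = crop_or_pad(vol)
```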

The dataset comprises 316 subjects (161 PD, 155 healthy controls), with 64 samples reserved for independent test evaluation and five-fold cross-validation on the remaining 252 (80/20 split within each fold). Optimization employs binary focal loss:

$$L_\text{focal} = -\alpha (1-p)^\gamma y \log p - (1-\alpha)p^\gamma (1-y)\log(1-p) \tag{6}$$

with $\alpha=0.5$, $\gamma=2$, $p=\sigma(z)$. AdamW is used, with an initial learning rate of $2\times10^{-4}$, cosine-annealed to $1\times10^{-7}$ over 30 epochs, on 2×Tesla V100 GPUs (batch size 8). Model selection per fold is based on the sum of validation AUC and F1 score (Jin et al., 26 Oct 2025).
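As a worked illustration, the focal loss of Eq. (6) and a cosine learning-rate schedule can be written in a few lines of plain Python. The loss follows Eq. (6) term by term for a single example. The schedule uses the standard half-cosine form; the text only states the two endpoints and the 30-epoch horizon, so the exact per-epoch or per-step behavior is an assumption. The function names are illustrative.

```python
import math

def focal_loss(z, y, alpha=0.5, gamma=2.0, eps=1e-12):
    """Binary focal loss of Eq. (6) for one logit z and label y in {0, 1}."""
    # Numerically stable sigmoid, p = sigma(z).
    if z >= 0:
        p = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        p = e / (1.0 + e)
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
    pos = -alpha * (1.0 - p) ** gamma * y * math.log(p)
    neg = -(1.0 - alpha) * p ** gamma * (1 - y) * math.log(1.0 - p)
    return pos + neg

def cosine_lr(epoch, total_epochs=30, lr_max=2e-4, lr_min=1e-7):
    """Half-cosine decay from lr_max at epoch 0 to lr_min at total_epochs
    (assumed form of the schedule)."""
    cos = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr_max - lr_min) * cos

# A confidently correct positive contributes far less than a confidently wrong one.
easy = focal_loss(3.0, 1)
hard = focal_loss(-3.0, 1)
schedule = [cosine_lr(e) for e in range(31)]
```

With $\gamma=0$ the modulating factors vanish and the expression reduces to $\alpha$-weighted binary cross-entropy, which gives a quick sanity check on any implementation.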

5. Quantitative Results and Comparative Evaluation

GateFuseNet demonstrates superior performance compared to prominent multimodal 3D classifiers:

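| Model | Accuracy (%) | AUC | Precision (%) | Recall (%) | F1 (%) | Specificity (%) | AUPR |
|---|---|---|---|---|---|---|---|
| GateFuseNet | 85.00 | 0.9206 | 84.98 | 86.06 | 85.48 | 83.87 | 0.9227 |
| ResNeXt | 76.56 | 0.8594 | | | | | |
| AG_SE_ResNeXt | 76.88 | 0.8831 | | | | | |
| DenseFormer-MoE | 81.25 | 0.9084 | | | | | |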

GateFuseNet improves on ResNeXt by 8.44 percentage points in accuracy and 0.0612 in AUC, and leads DenseFormer-MoE by 3.75 percentage points in accuracy and 0.0122 in AUC. In ablation studies, the proposed gated fusion mechanism outperforms both weighted-average fusion (76.68% accuracy, 0.8642 AUC) and simple concatenation (78.17% accuracy, 0.8823 AUC). The choice of anchor branch also matters: injecting the fused residual into the ROI branch (the default) attains 85.00% accuracy (0.9206 AUC), compared with 77.06% (0.8761 AUC) for the T1 branch and 78.43% (0.8920 AUC) for the QSM branch (Jin et al., 26 Oct 2025).
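The reported gains are absolute differences between the accuracy and AUC figures listed above, which a short script can reproduce. The dictionary simply restates those figures; `gain_over` is an illustrative helper name.

```python
# Accuracy (%) and AUC as reported in the comparison above.
results = {
    "GateFuseNet": (85.00, 0.9206),
    "ResNeXt": (76.56, 0.8594),
    "AG_SE_ResNeXt": (76.88, 0.8831),
    "DenseFormer-MoE": (81.25, 0.9084),
}

def gain_over(baseline, model="GateFuseNet"):
    """Absolute gain of `model` over `baseline`:
    (accuracy in percentage points, AUC difference)."""
    acc, auc = results[model]
    base_acc, base_auc = results[baseline]
    return round(acc - base_acc, 2), round(auc - base_auc, 4)

for name in results:
    if name != "GateFuseNet":
        print(name, gain_over(name))
```

For ResNeXt this gives (8.44, 0.0612) and for DenseFormer-MoE (3.75, 0.0122), so the gains are differences in percentage points and in raw AUC, not relative improvements.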

6. Model Interpretability and Clinical Plausibility

Qualitative evaluation via Grad-CAM reveals that GateFuseNet’s activations during inference are concentrated in the globus pallidus and substantia nigra, hallmark loci of iron accumulation and dopaminergic neuron loss in PD. The Grad-CAM-derived heatmaps align with the input ROI masks, substantiating that the network’s decision-making leverages anatomically valid, disease-relevant regions as opposed to trivial or spurious features. This interpretability is an explicit consequence of the ROI-guided residual gating pathway and the architectural coupling between attention-driven fusion and anatomical priors (Jin et al., 26 Oct 2025).

7. Conclusion and Data Accessibility

GateFuseNet establishes a robust adaptive pipeline for fusing QSM and T1-weighted MRI with explicit DGM anatomical priors through a hierarchical gated fusion paradigm. Its combination of a 3D backbone, GF blocks, ROI anchoring, and focal loss optimization yields state-of-the-art performance for PD discrimination (85.00% accuracy, 0.9206 AUC), while Grad-CAM attention maps that align with the DGM nuclei keep its diagnostic logic transparent. Source code and pretrained models are available at https://github.com/YangGaoUQ/GateFuseNet (Jin et al., 26 Oct 2025).

References (1)