Multi-modal multi-class Parkinson disease classification using CNN and decision level fusion (2307.02978v1)

Published 6 Jul 2023 in cs.CV

Abstract: Parkinson disease is the second most common neurodegenerative disorder, as reported by the World Health Organization. In this paper, we propose a direct three-Class PD classification using two different modalities, namely, MRI and DTI. The three classes used for classification are PD, Scans Without Evidence of Dopamine Deficit and Healthy Control. We use white matter and gray matter from the MRI and fractional anisotropy and mean diffusivity from the DTI to achieve our goal. We train four separate CNNs on the above four types of data. At the decision level, the outputs of the four CNN models are fused with an optimal weighted average fusion technique. We achieve an accuracy of 95.53 percentage for the direct three class classification of PD, HC and SWEDD on the publicly available PPMI database. Extensive comparisons including a series of ablation studies clearly demonstrate the effectiveness of our proposed solution.

Summary

The paper introduces a novel CNN-based framework that employs decision-level optimal weighted average fusion for direct three-class Parkinson’s disease classification.
It harnesses multi-modal neuroimaging data, combining MRI white/gray matter analysis with DTI indices such as FA and MD to boost diagnostic accuracy.
The approach, validated on the PPMI dataset, outperforms several state-of-the-art methods, as shown in comparisons and ablation studies.

The paper addresses the problem of direct three-class Parkinson's Disease (PD) classification using Convolutional Neural Networks (CNNs) and decision-level fusion. The classification distinguishes between PD, Scans Without Evidence of Dopamine Deficit (SWEDD), and Healthy Control (HC) groups, utilizing multi-modal data from Magnetic Resonance Imaging (MRI) and Diffusion Tensor Imaging (DTI). The paper employs white matter (WM) and gray matter (GM) data from MRI, along with fractional anisotropy (FA) and mean diffusivity (MD) from DTI.

The methodology involves training four separate CNNs on the aforementioned data types. The outputs of these models are then fused at the decision level using an Optimal Weighted Average Fusion (OWAF) technique. The approach was validated using the publicly available Parkinson's Progression Markers Initiative (PPMI) database.

The paper highlights the following contributions:

Addresses a direct three-class classification task (PD, HC, and SWEDD).
Analyzes multi-modal neuroimaging data, specifically T1-weighted MRI and DTI.
Fuses the outputs of each CNN model at the decision level using an OWAF strategy.

The solution pipeline consists of four CNN networks, each producing a $3 \times 1$ probability vector indicating the likelihood of the data belonging to one of the three classes. These probability vectors are then combined using the OWAF technique.

MRI data are prepared with voxel-based morphometry (VBM) using SPM-12 tools, with normalization performed by diffeomorphic anatomical registration through exponentiated Lie algebra (DARTEL). This process segments the MRI data into GM, WM, and cerebrospinal fluid, and normalizes them anatomically to a stereotactic space. The PPMI database provides DTI indices, including FA and MD. MD and FA are defined as:

$MD=\frac{\lambda_1+\lambda_2 +\lambda_3}{3}=\frac{D_{xx}+D_{yy}+D_{zz}}{3}$

$MD$ : Mean Diffusivity
$\lambda_1, \lambda_2, \lambda_3$ : Eigenvalues of the diffusion tensor
$D_{xx}, D_{yy}, D_{zz}$ : Diagonal terms of the diffusion tensor

$FA=\sqrt{\frac{1}{2}}\sqrt{\frac{(\lambda_1-\lambda_2)^2+(\lambda_2-\lambda_3)^2+(\lambda_3-\lambda_1)^2}{\lambda_1^2+\lambda_2^2+\lambda_3^2}}$

$FA$ : Fractional Anisotropy
$\lambda_1, \lambda_2, \lambda_3$ : Eigenvalues of the diffusion tensor
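
As a rough illustration of the two definitions, the following NumPy sketch computes MD and FA from a set of eigenvalues. The function names, the example eigenvalues, and the zero-denominator guard are assumptions made for this example. PPMI supplies the FA and MD maps directly, so this is not part of the paper's pipeline.

```python
import numpy as np

def mean_diffusivity(eigvals):
    """MD: the average of the three diffusion-tensor eigenvalues."""
    lam = np.asarray(eigvals, dtype=float)
    return lam.mean(axis=-1)

def fractional_anisotropy(eigvals):
    """FA: normalized spread of the eigenvalues, between 0 and 1."""
    lam = np.asarray(eigvals, dtype=float)
    l1, l2, l3 = lam[..., 0], lam[..., 1], lam[..., 2]
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Assumption: voxels with an all-zero tensor (background) get FA = 0.
    safe_den = np.where(den > 0, den, 1.0)
    return np.where(den > 0, np.sqrt(0.5 * num / safe_den), 0.0)

# Hypothetical eigenvalues, in mm^2/s.
isotropic = [1.0e-3, 1.0e-3, 1.0e-3]    # equal diffusion in every direction
fibre_like = [1.7e-3, 0.3e-3, 0.3e-3]   # diffusion mostly along one axis

print("MD:", mean_diffusivity(fibre_like))
print("FA (isotropic):", fractional_anisotropy(isotropic))
print("FA (fibre-like):", fractional_anisotropy(fibre_like))

# MD can equally be read off the tensor diagonal, since the trace
# of a symmetric tensor equals the sum of its eigenvalues.
D = np.array([[1.5e-3, 0.2e-3, 0.0],
              [0.2e-3, 0.5e-3, 0.0],
              [0.0,    0.0,    0.3e-3]])
assert np.isclose(mean_diffusivity(np.linalg.eigvalsh(D)), np.trace(D) / 3)
```

FA is 0 when all three eigenvalues are equal and approaches 1 when diffusion is confined to a single axis.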

The dataset is imbalanced across the three classes, and the number of training samples is small. The Adaptive Synthetic (ADASYN) oversampling method is therefore used to augment the minority classes. ADASYN weights each minority sample by how many of its nearest neighbors belong to other classes, so that more synthetic samples are generated around the minority samples that are harder to learn. The number of synthetic minority samples to create is given by:

$G=(m_{maj} - m_{min}) \times \beta$

$G$ : Number of synthetic minority data samples
$m_{maj}$ : Number of majority class samples
$m_{min}$ : Number of minority class samples
$\beta$ : Balance level of the synthetic samples
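
A minimal sketch of this count is given below, together with the neighbor-based weighting that ADASYN uses to spread the synthetic samples across minority samples. The class sizes, neighbor counts, and function names are hypothetical and chosen only for illustration. The allocation step follows the standard ADASYN formulation, not any detail stated in this summary.

```python
import numpy as np

def adasyn_num_synthetic(m_maj, m_min, beta=1.0):
    """G = (m_maj - m_min) * beta, rounded to a whole number of samples."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is expected to lie in [0, 1]")
    return int(round((m_maj - m_min) * beta))

def adasyn_allocation(other_class_neighbors, k, total):
    """Split `total` synthetic samples across the minority samples.

    other_class_neighbors[i] is how many of the k nearest neighbours of
    minority sample i belong to a different class. Samples surrounded by
    more other-class neighbours receive a larger share.
    """
    r = np.asarray(other_class_neighbors, dtype=float) / k
    if r.sum() == 0:
        return np.zeros(len(r), dtype=int)
    # Rounding means the shares may not add up to `total` exactly.
    return np.rint(r / r.sum() * total).astype(int)

# Hypothetical class sizes: 150 majority and 30 minority samples.
G = adasyn_num_synthetic(150, 30, beta=1.0)
print(G)                                      # 120
print(adasyn_num_synthetic(150, 30, beta=0.5))  # 60

# Hypothetical neighbour counts for three minority samples, with k = 10.
print(adasyn_allocation([0, 3, 6], k=10, total=G))
```

With $\beta = 1$ the minority class is brought up to the size of the majority class. Smaller values of $\beta$ leave part of the imbalance in place.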

The proposed CNN architecture comprises ten convolutional layers and four fully connected (FC) layers. The network processes volumetric input data slice-wise, with each slice sized at $176 \times 176$ pixels. Max-pooling is used in the pooling layers to reduce the image size. The final FC layer, followed by a softmax operation, performs the classification. The network is trained with the cross-entropy loss function (CELF), which is mathematically expressed as:

$L_{CELF} = - \sum\limits_{i = 1}^{N}\sum\limits_{j = 1}^{K}p(y_{j}|\mathbf{x})\log\widehat{p}(y_{j}|\mathbf{x})$

$L_{CELF}$ : Cross-Entropy Loss Function
$p(y_{j}|\mathbf{x})$ : Original class label distribution
$\widehat{p}(y_{j}|\mathbf{x})$ : Predicted label distribution from the CNN network.
$N$ : Number of models
$K$ : Number of classes
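
The double sum can be sketched in a few lines of NumPy. In this example each row of the input arrays is one term of the outer sum and each column is one of the $K = 3$ classes. The function name, the clipping constant `eps`, and the probability values are assumptions for illustration only.

```python
import numpy as np

def cross_entropy(p_true, p_pred, eps=1e-12):
    """Sum of -p * log(p_hat) over all rows (outer sum) and classes (inner sum)."""
    p_true = np.asarray(p_true, dtype=float)
    # Clip predictions so that log(0) cannot occur (an added safeguard).
    p_pred = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0)
    return float(-np.sum(p_true * np.log(p_pred)))

# Hypothetical one-hot labels over three classes and softmax outputs.
labels = [[1, 0, 0],
          [0, 0, 1]]
predictions = [[0.7, 0.2, 0.1],
               [0.1, 0.3, 0.6]]

print(cross_entropy(labels, predictions))
```

With one-hot labels only the predicted probability of the true class contributes, so the loss here reduces to $-\log 0.7 - \log 0.6$.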

The fusion weights for OWAF are obtained in two stages. In the first stage, initial weights are generated using the modulated rank averaging (MRA) method, in which the weights are given by:

$w_i=\frac{f_i}{\sum_{i=1}^{N-1}f_i + R_{max}}$

$w_i$ : Weight of the $i^{th}$ model
$f_i$ : Normalizing factor for the $i^{th}$ model
$R_{max}$ : Rank of the model with the highest accuracy

The normalizing factor is calculated based on the rank of the current model and the difference between the accuracy of the current and next model. In the second stage, these weights are optimised using the grid search method. The overall probability of occurrence of the $j^{th}$ class ( $PF_j$ ) as a result of fusion is calculated as:

$PF_j = \sum_{i=1}^{4} w_{i}^{'}\times p_{ij}$

$PF_j$ : Overall probability of occurrence of the $j^{th}$ class
$w_{i}^{'}$ : Optimized weight for the $i^{th}$ CNN model
$p_{ij}$ : Probability output by the $i^{th}$ CNN model for the $j^{th}$ class

The final class is determined as the one with the maximum $PF_j$ value.
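
The fusion rule and the weight search can be sketched as follows. This summary does not describe the grid itself, so the sketch assumes a uniform grid over non-negative weights that sum to 1 and uses validation accuracy as the selection criterion. The function names and all probability values are hypothetical.

```python
import itertools
import numpy as np

def fuse(probs, weights):
    """PF_j = sum_i w_i * p_ij for probs of shape (n_models, n_classes)."""
    return np.asarray(weights, dtype=float) @ np.asarray(probs, dtype=float)

def predict(probs, weights):
    """Index of the class with the largest fused probability."""
    return int(np.argmax(fuse(probs, weights)))

def grid_search_weights(val_probs, val_labels, step=0.1):
    """Pick the weight vector with the best accuracy on validation data.

    val_probs has shape (n_models, n_samples, n_classes).
    Assumption: weights lie on a uniform grid and sum to 1.
    """
    val_probs = np.asarray(val_probs, dtype=float)
    val_labels = np.asarray(val_labels)
    n_models = val_probs.shape[0]
    ticks = round(1 / step)
    best_w, best_acc = None, -1.0
    for combo in itertools.product(range(ticks + 1), repeat=n_models):
        if sum(combo) != ticks:
            continue  # keep only weight vectors that sum to 1
        w = np.array(combo) / ticks
        fused = np.tensordot(w, val_probs, axes=1)  # (n_samples, n_classes)
        acc = float(np.mean(fused.argmax(axis=1) == val_labels))
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc

# Hypothetical outputs of the four CNNs (WM, GM, FA, MD) for one subject.
probs = [[0.6, 0.3, 0.1],
         [0.2, 0.5, 0.3],
         [0.5, 0.4, 0.1],
         [0.3, 0.4, 0.3]]
weights = [0.4, 0.2, 0.3, 0.1]
print(fuse(probs, weights), predict(probs, weights))
```

An exhaustive grid with step 0.1 over four models is small enough to enumerate directly. A finer step grows the search quickly, which is one reason to start from rank-based weights such as those from MRA.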

The study included 281 PPMI subjects whose baseline visits had both DTI and MRI data: 67 HC, 177 PD, and 37 SWEDD subjects. For ADASYN, different neighbor counts ( $k$ ) were tested on the training set, and $k = 30$ produced the best results. The network was trained for 100 epochs using the Adam optimizer and ReLU activation functions, with a learning rate initialized at $1\times10^{-4}$ and a batch size of 32. Accuracy, precision, recall, and F1 score were used to evaluate the classification performance.
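
For reference, the four evaluation metrics can be computed from a confusion matrix as sketched below. This summary does not say how the per-class values are averaged in the three-class setting, so the sketch simply returns them per class. The function name and the labels are hypothetical.

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes=3):
    """Accuracy plus per-class precision, recall, and F1 score."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1  # rows: true class, columns: predicted class
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # samples predicted as each class
    actual = cm.sum(axis=1)     # samples truly in each class
    # Classes with an empty denominator are reported as 0 (an assumption).
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return {"accuracy": tp.sum() / cm.sum(),
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 0, 1, and 2 stand for the three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
for name, value in classification_metrics(y_true, y_pred).items():
    print(name, value)
```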

The paper includes two ablation studies. The first study demonstrates the utility of using both MRI and DTI data. The second study shows the benefit of OWAF, the proposed fusion strategy. The four CNNs are trained and evaluated on both single-modal and multi-modal data from MRI and DTI. Their outputs are then combined at the decision level using four different fusion strategies: majority voting, model average fusion, modulated rank averaging, and the proposed optimal weighted average fusion (OWAF) based on the grid search approach.
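
To show how the two simplest baselines can disagree, the sketch below applies majority voting and model average fusion to the same set of model outputs. The probability values and function names are invented for this example, and ties in the vote are broken in favor of the lowest class index, which is an assumption.

```python
import numpy as np

def majority_vote(probs):
    """Each model votes for its most likely class; the most common vote wins."""
    probs = np.asarray(probs, dtype=float)
    votes = probs.argmax(axis=1)
    counts = np.bincount(votes, minlength=probs.shape[1])
    return int(counts.argmax())  # ties go to the lowest class index

def model_average(probs):
    """Average the probability vectors with equal weights, then take the argmax."""
    probs = np.asarray(probs, dtype=float)
    return int(probs.mean(axis=0).argmax())

# Hypothetical outputs of four models over three classes: three models
# lean weakly towards class 0, one is highly confident in class 1.
probs = [[0.40, 0.35, 0.25],
         [0.45, 0.40, 0.15],
         [0.05, 0.90, 0.05],
         [0.40, 0.35, 0.25]]

print("majority vote:", majority_vote(probs))
print("model average:", model_average(probs))
```

Majority voting discards each model's confidence, while averaging keeps it but treats all models as equally reliable. Weighted fusion schemes such as MRA and OWAF address the second limitation.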

The method was compared with ten state-of-the-art approaches. The results of the comparisons are shown in Table 5. Of the ten methods considered, five are based on ML and the remaining five on deep learning (DL). Further, four of the five DL-based approaches use only a single modality, namely MRI, for classification. Note also that eight of these ten techniques address only a single two-class classification problem between PD and HC and do not consider the challenging SWEDD class at all. The remaining two approaches do consider SWEDD as a third class but divide the three-class problem into multiple binary classification problems.
