Residual and Plain Convolutional Neural Networks for 3D Brain MRI Classification (1701.06643v1)

Published 23 Jan 2017 in cs.CV

Abstract: In the recent years there have been a number of studies that applied deep learning algorithms to neuroimaging data. Pipelines used in those studies mostly require multiple processing steps for feature extraction, although modern advancements in deep learning for image classification can provide a powerful framework for automatic feature generation and more straightforward analysis. In this paper, we show how similar performance can be achieved skipping these feature extraction steps with the residual and plain 3D convolutional neural network architectures. We demonstrate the performance of the proposed approach for classification of Alzheimer's disease versus mild cognitive impairment and normal controls on the Alzheimer's Disease National Initiative (ADNI) dataset of 3D structural MRI brain scans.

Authors (4)
  1. Sergey Korolev (3 papers)
  2. Amir Safiullin (1 paper)
  3. Mikhail Belyaev (44 papers)
  4. Yulia Dodonova (3 papers)
Citations (349)

Summary

Residual and Plain Convolutional Neural Networks for 3D Brain MRI Classification

The paper "Residual and Plain Convolutional Neural Networks for 3D Brain MRI Classification" demonstrates the potential of neural network architectures, specifically residual and plain convolutional neural networks (CNNs), for biomedical image classification. The authors address the intricate nature of feature extraction from MRI data, using deep learning to skip the hand-crafted feature extraction steps that typically precede classification. The paper focuses on the challenging problem of diagnosing Alzheimer's Disease (AD) from MRI scans, comparing AD against mild cognitive impairment (MCI) and normal controls on 3D data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).

Proposed Architectures

The authors propose two CNN architectures: VoxCNN, a variation of the VGG architecture adapted to volumetric (3D) data, and a residual network inspired by VoxResNet. Both models are designed to handle the spatial complexity of 3D MRI volumes, combining convolutional, pooling, and batch normalization layers so that they can learn from the small dataset sizes typical of neuroimaging.

VoxCNN Architecture: Four convolutional blocks with increasing filter counts are followed by fully connected layers regularized with batch normalization and dropout.
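
A minimal PyTorch sketch of a VoxCNN-style network is shown below; the filter counts, dropout rate, and input size are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class VoxCNNLike(nn.Module):
    """Illustrative VoxCNN-style 3D CNN: stacked convolutional blocks followed by
    fully connected layers with batch normalization and dropout.
    Filter counts and layer sizes are assumptions, not the paper's exact values."""
    def __init__(self, n_classes: int = 2):
        super().__init__()

        def block(c_in, c_out):
            # One convolutional block: 3D convolution, nonlinearity, downsampling.
            return nn.Sequential(
                nn.Conv3d(c_in, c_out, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool3d(2),
            )

        self.features = nn.Sequential(
            block(1, 8), block(8, 16), block(16, 32), block(32, 64),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(128),        # infers the flattened size at the first forward pass
            nn.BatchNorm1d(128),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        # x has shape (batch, 1, D, H, W), e.g. a single-channel structural MRI volume.
        return self.classifier(self.features(x))
```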

ResNet Architecture: Building on the success of residual networks in 2D image recognition, this model stacks identity-mapping residual blocks over volumetric convolutions. The shortcut connections ease optimization and speed up convergence, allowing deeper networks to learn more discriminative features through residual learning.
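
A residual block for volumetric data can be sketched as follows; the pre-activation ordering (batch norm, ReLU, convolution) and the fixed channel count are assumptions for illustration, not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class ResidualBlock3D(nn.Module):
    """Identity-mapping residual block for 3D volumes (illustrative sketch)."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        # The identity shortcut lets gradients bypass the convolutional path,
        # which is what eases optimization and speeds up convergence.
        return x + self.body(x)
```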

Methodology

The authors conducted experiments on a subset of ADNI data, specifically the "Spatially Normalized, Masked and N3 corrected T1 images", guarding against information leakage by keeping only the first image per subject. The resulting dataset comprises 231 images across four classes: AD, late MCI (LMCI), early MCI (EMCI), and normal controls (NC), and all six pairwise binary classification tasks are analyzed.
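
A hedged sketch of the leakage-avoiding selection step: assuming a metadata table with hypothetical subject_id and acquisition_date columns, keeping only each subject's earliest scan ensures no subject contributes more than one image, so near-duplicate scans cannot end up on both sides of a train/test split.

```python
import pandas as pd

# Hypothetical metadata table; file name and column names are assumptions for illustration.
meta = pd.read_csv("adni_metadata.csv")  # columns: subject_id, acquisition_date, image_path, label

# Keep only the earliest scan per subject.
first_scans = (
    meta.sort_values("acquisition_date")
        .groupby("subject_id", as_index=False)
        .first()
)
```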

Both architectures, VoxCNN and ResNet, are evaluated with cross-validation to measure predictive stability across folds. Batch construction is modified so that each batch contains representative samples from all classes, which keeps learning stable; a possible implementation is sketched below.
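
One common way to keep batches class-balanced is to weight the sampler by inverse class frequency. The sketch below uses PyTorch's WeightedRandomSampler with toy stand-in data; this is an assumption about how such balancing could be done, not necessarily the authors' exact procedure.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy stand-in data: 20 small volumes with imbalanced binary labels.
volumes = torch.randn(20, 1, 32, 32, 32)
labels = torch.tensor([0] * 14 + [1] * 6)
dataset = TensorDataset(volumes, labels)

# Weight each sample by the inverse frequency of its class so batches
# are balanced between the two classes in expectation.
class_counts = np.bincount(labels.numpy())
weights = 1.0 / class_counts[labels.numpy()]
sampler = WeightedRandomSampler(weights, num_samples=len(dataset), replacement=True)

loader = DataLoader(dataset, batch_size=4, sampler=sampler)
```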

Results

The results show that VoxCNN and ResNet deliver competitive classification performance for AD versus NC, achieving ROC AUC values near 0.88 and 0.87 respectively, with classification accuracies around 79% and 80%. Results for separating AD from LMCI and EMCI were less satisfactory, with both architectures struggling on the finer-grained diagnostic distinctions.

An additional analysis of which brain regions drive the network's decisions highlights the areas most important for classification, indirectly corroborating medical understanding of the regions affected by Alzheimer's Disease.
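
Region-importance maps of this kind can be produced in several ways; the occlusion-sensitivity sketch below is one generic option, offered as an illustration rather than the paper's actual analysis. It masks out one cubic patch at a time and records how much the target-class probability drops.

```python
import torch

def occlusion_map(model, volume, target_class, patch=16, stride=16):
    """Slide a zeroed cube over a (1, D, H, W) volume and record the drop in the
    target-class probability; larger drops mark regions the model relies on.
    Illustrative only; not necessarily the analysis performed in the paper."""
    model.eval()
    _, D, H, W = volume.shape
    zs = range(0, D - patch + 1, stride)
    ys = range(0, H - patch + 1, stride)
    xs = range(0, W - patch + 1, stride)
    heat = torch.zeros(len(zs), len(ys), len(xs))
    with torch.no_grad():
        base = torch.softmax(model(volume.unsqueeze(0)), dim=1)[0, target_class]
        for i, z in enumerate(zs):
            for j, y in enumerate(ys):
                for k, x in enumerate(xs):
                    occluded = volume.clone()
                    occluded[:, z:z+patch, y:y+patch, x:x+patch] = 0.0
                    prob = torch.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target_class]
                    heat[i, j, k] = base - prob
    return heat
```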

Conclusions and Future Directions

This paper advances neuroimaging classification by proposing network architectures that forgo labor-intensive feature extraction steps in favor of a single learning algorithm that interprets 3D MRI data directly. The value of this approach lies in simplifying the classification pipeline, which could in turn expedite clinical diagnostic workflows.

The authors suggest that future work focus on generalizing these methods to raw MRI imagery without extensive preprocessing such as spatial alignment or skull stripping. This would enhance the practical applicability of the models, making them robust to the varying data processing requirements of clinical settings. Exploring alternative data augmentation strategies and improved network designs will also be important for raising performance on more complex or less distinct neurodegenerative spectra.