
Comparing Different Deep Learning Architectures for Classification of Chest Radiographs

Published 20 Feb 2020 in cs.LG, cs.CV, and eess.IV | (2002.08991v1)

Abstract: Chest radiographs are among the most frequently acquired images in radiology and are often the subject of computer vision research. However, most of the models used to classify chest radiographs are derived from openly available deep neural networks trained on large image datasets. These datasets routinely differ from chest radiographs in that they mostly contain color images spanning many possible classes, while radiographs are grayscale images with fewer possible classes. Therefore, very deep neural networks, which can represent more complex relationships in image features, might not be required for the comparatively simpler task of classifying grayscale chest radiographs. We compared fifteen different artificial neural network architectures with respect to training time and performance on the openly available CheXpert dataset to identify the most suitable models for deep learning tasks on chest radiographs. We show that smaller networks such as ResNet-34, AlexNet, or VGG-16 have the potential to classify chest radiographs as precisely as deeper neural networks such as DenseNet-201 or ResNet-152, while being less computationally demanding.

Citations (161)

Summary

  • The paper demonstrates that shallower networks, such as VGG-16, can achieve competitive performance in chest radiograph classification while reducing computational demands.
  • The study used the CheXpert dataset and evaluated fifteen deep learning models with AUROC and AUPRC metrics to robustly compare performance.
  • The findings suggest that incorporating a 'human in the loop' for label refinement can enhance diagnostic accuracy in medical imaging.

Comparative Analysis of Deep Learning Architectures for Chest Radiograph Classification

Introduction

The study titled "Comparing Different Deep Learning Architectures for Classification of Chest Radiographs" (2002.08991) focuses on evaluating various deep learning models—specifically artificial neural networks (ANNs)—to determine their efficacy in processing and categorizing grayscale chest radiographs. Unlike datasets such as ImageNet, which include color images and multiple image classifications, chest radiographs are grayscale and consist of fewer classes. This research posits that shallower models may effectively manage the classification of these images with reduced computational demand and comparable accuracy to deeper networks.
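Because ImageNet-pretrained networks expect three-channel input while radiographs have a single channel, a common adaptation is to replicate the grayscale channel; a minimal NumPy sketch of that convention (the exact preprocessing used in the paper is not detailed in this summary):

```python
import numpy as np

def grayscale_to_rgb(img: np.ndarray) -> np.ndarray:
    """Replicate a single-channel radiograph (H, W) into a
    three-channel array (H, W, 3) matching the input shape
    expected by ImageNet-pretrained networks."""
    if img.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    return np.stack([img, img, img], axis=-1)

radiograph = np.random.rand(320, 320)  # toy grayscale image
rgb = grayscale_to_rgb(radiograph)
```

Each output channel is an identical copy, so the pretrained first-layer convolutions see the same statistics in every channel.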

Methods

Data Preparation

The research employed the CheXpert dataset, which comprises 224,316 chest radiographs annotated for 14 findings. Training focused on five key labels: cardiomegaly, edema, consolidation, atelectasis, and pleural effusion, using a subset of images. Images marked with uncertainty labels were excluded to streamline the training process.
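CheXpert's label CSV encodes uncertain findings as -1 (1 = positive, 0 = negative, blank = unmentioned); excluding studies with uncertainty labels can be sketched with pandas on a toy stand-in for the real file (column names follow the public CSV; the paper's exact exclusion policy is an assumption):

```python
import pandas as pd

# Toy stand-in for CheXpert's train.csv:
# 1 = positive, 0 = negative, -1 = uncertain.
labels = pd.DataFrame({
    "Path": ["p1.jpg", "p2.jpg", "p3.jpg", "p4.jpg"],
    "Cardiomegaly": [1, -1, 0, 1],
    "Edema": [0, 1, -1, 0],
})

targets = ["Cardiomegaly", "Edema"]
# Keep only studies with no uncertainty label among the target findings.
certain = labels[(labels[targets] != -1).all(axis=1)].reset_index(drop=True)
```

Here `p2.jpg` and `p3.jpg` are dropped because each carries an uncertain (-1) label for one of the target findings.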

Model Training

Fifteen distinct neural network architectures, spanning the ResNet, DenseNet, VGG, SqueezeNet, and AlexNet families, were trained using the PyTorch and FastAI libraries on a workstation equipped with dual Nvidia GeForce RTX 2080 Ti GPUs. Models were trained with batch sizes of 16 and 32, with learning rates progressively adjusted over multiple epochs to optimize each model.
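The progressive learning-rate adjustment is consistent with FastAI's one-cycle training policy; a simplified pure-Python sketch of such a schedule, linear warm-up followed by linear annealing (the exact schedule and hyperparameters used in the paper are assumptions):

```python
def one_cycle_lr(step, total_steps, lr_max=1e-3, warmup_frac=0.3):
    """Linear warm-up to lr_max, then linear decay toward zero.
    A simplified stand-in for FastAI's one-cycle schedule."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return lr_max * step / warmup_steps
    # Linearly anneal over the remaining steps.
    remaining = total_steps - warmup_steps
    return lr_max * (1 - (step - warmup_steps) / remaining)

schedule = [one_cycle_lr(s, 100) for s in range(100)]
```

The schedule starts near zero, peaks at `lr_max` once warm-up ends, and decays toward zero, which lets training take large steps mid-run while converging gently at the end.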

Evaluation Metrics

Performance evaluation involved calculating AUROC and AUPRC values, providing insights independent of classification thresholds. This facilitated a robust comparison among different model architectures.
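Both metrics can be computed per finding from predicted probabilities; a minimal scikit-learn sketch on toy scores (illustrative values only, not the paper's results):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

y_true = np.array([0, 0, 1, 1, 0, 1])                 # toy ground truth for one finding
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])   # toy predicted probabilities

auroc = roc_auc_score(y_true, y_score)            # threshold-free ranking quality
auprc = average_precision_score(y_true, y_score)  # step-wise AUPRC estimate
```

Neither metric requires choosing a decision threshold, which is what makes them suitable for comparing architectures head to head.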

Results

The investigation showed that deeper neural networks generally achieved superior AUROC values, with ResNet-152 and DenseNet-161 performing best. However, shallower networks such as VGG-16 and AlexNet achieved competitive AUPRC values, countering the assumption that deeper models universally outperform their shallower counterparts.

Discussion

This analysis underscores the capability of shallower networks, such as VGG-16 and ResNet-34, to classify chest radiographs effectively, challenging the need for computationally intensive deep models. Smaller networks offer concrete advantages: shorter training times and lower hardware requirements, which in turn make the higher image resolutions needed to detect subtle radiographic features more practical. Implementing a 'human in the loop' approach could refine label precision, improving training data quality and model performance, especially when labels are generated through NLP tools with inherent inaccuracies.

These findings align with Raghu et al., who also reported that smaller networks achieve performance on the CheXpert dataset comparable to that of deep networks such as the DenseNet-121 typically favored in prior studies.

Conclusion

This research demonstrates that smaller ANNs can rival or surpass deeper architectures in chest radiograph classification, providing a pathway for efficient model deployment that demands fewer resources without sacrificing accuracy. This shift could facilitate broader applications in medical imaging, enhancing diagnostic capabilities through more accessible deep learning tools.
