- The paper reviews 102 studies (2012-2020) applying ImageNet-pretrained CNNs for medical image analysis, noting common sites (eye, breast, brain) and modalities (X-Ray, MRI).
- Fine-tuning with data augmentation is favored for larger datasets, while feature extraction suits smaller ones; visualization methods help interpret CNN decisions.
- Findings guide future research by highlighting suitable CNN architectures for specific image types, suggesting gaps in benchmarking and data augmentation, and emphasizing model interpretability.
A Review of Transfer Learning Applications in Medical Image Analysis Using ImageNet
The paper "A Scoping Review of Transfer Learning Research on Medical Image Analysis Using ImageNet" by Mohammad Amin Morid et al. presents a comprehensive overview of transfer learning (TL) with convolutional neural networks (CNNs) in medical image analysis. The scoping review examines study characteristics such as input data, the CNN models employed, which parameters are transferred, and the performance measures reported. By focusing on CNNs pre-trained on the non-medical ImageNet dataset, the paper highlights current trends and prevalent methodologies in medical image classification tasks.
Key Findings
The authors reviewed 102 studies published between 2012 and 2020, spanning diverse anatomical sites, imaging modalities, and CNN architectures. The reviewed studies underscore the dominance of certain CNN models and imaging modalities in the landscape of medical image analysis. The most frequently studied anatomical sites are the eye, breast, and brain, which account for 18%, 14%, and 12% of studies, respectively. X-Ray and MRI were the most widely used imaging modalities. Among CNN architectures, Inception-V3, VGG-16, AlexNet, and ResNet-50 were utilized most frequently.
Data augmentation emerged as a prevalent strategy in fine-tuning TL studies, applied in 72% of cases, compared to only 15% in feature-extraction approaches. The paper also identifies a shift towards binary classification in medical image analysis, accounting for 71% of the reviewed studies.
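To make the augmentation strategy concrete, the sketch below shows the kind of simple, label-preserving transformations (flips, rotations, small noise) commonly applied in such studies. The function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator):
    """Yield simple label-preserving variants of a 2-D grayscale image
    (values assumed to be in [0, 1])."""
    yield np.fliplr(image)                          # horizontal flip
    yield np.rot90(image)                           # 90-degree rotation
    noisy = image + rng.normal(0.0, 0.01, image.shape)
    yield np.clip(noisy, 0.0, 1.0)                  # small Gaussian noise
```

In practice each variant keeps the original study label, multiplying the effective training-set size, which is why augmentation pairs naturally with the data-hungry fine-tuning approach.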
Methodological Insights
Transfer learning approaches are categorized into feature-extraction and fine-tuning models. In feature extraction, the pre-trained CNN serves as a fixed encoder whose outputs feed a separate classifier; in fine-tuning, some or all of the pre-trained weights are further updated on the medical dataset. Fine-tuning was favored in studies with larger datasets, underscoring the importance of sufficient data volume for effective model training, while feature extraction was preferred with smaller datasets, where it leverages the pre-trained model's knowledge with less risk of overfitting.
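The distinction between the two strategies can be sketched in PyTorch. The tiny backbone here is a hypothetical stand-in for an ImageNet-pretrained network such as ResNet-50 (which in real work would be loaded with its pre-trained weights).

```python
import torch.nn as nn

# Stand-in for an ImageNet-pretrained backbone (hypothetical; real studies
# would load e.g. a pre-trained ResNet-50 from torchvision).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(8, 2)  # new task-specific head (binary, per most studies)

# Feature extraction: freeze the backbone so only the new head trains.
for p in backbone.parameters():
    p.requires_grad = False

# Fine-tuning would instead leave some or all backbone layers trainable:
# for p in backbone.parameters():
#     p.requires_grad = True

model = nn.Sequential(backbone, head)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
```

With the backbone frozen, only the head's parameters receive gradient updates, which is why this variant works with far smaller datasets.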
The review reveals that visualization methods like heatmaps, deconvolution, and activation maximization were employed in 33% of studies to interpret CNN models. Such visualization strategies are crucial for understanding and validating the critical features used in diagnostic prediction by CNNs.
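One simple, model-agnostic way to produce such a heatmap is occlusion sensitivity: mask each region of the input in turn and record how much the model's score drops. The sketch below is a minimal illustration; `score_fn` is a hypothetical stand-in for a CNN's class-probability output.

```python
import numpy as np

def occlusion_heatmap(image: np.ndarray, score_fn, patch: int = 2) -> np.ndarray:
    """Score drop when each patch of a 2-D image is zeroed out.
    Large values mark regions the model relies on."""
    base = score_fn(image)
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - score_fn(occluded)
    return heat
```

Overlaying the resulting grid on the original image highlights the regions driving the prediction, which is the kind of evidence clinicians can check against their own reading of the scan.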
Implications and Future Directions
From a practical perspective, the findings could guide future research in selecting the most suitable CNN architectures and transfer learning techniques for specific medical image analysis tasks. The prominence of wide networks for ultrasound, endoscopic, and skeletal system X-ray images suggests their suitability for the features of those modalities. Meanwhile, shallow networks with small kernels appeared optimal for eye, skin, and dental images, where they capture fine textural changes.
The theoretical implications of this research emphasize the potential for improved model interpretability through advanced visualization methods. Visualization offers opportunities for embedding domain knowledge into AI models, thus increasing the trust of medical professionals in algorithmic predictions.
The authors identify significant gaps in the existing literature, such as insufficient benchmarking across models, limited exploration of deep networks in less frequent imaging modalities, and inadequate assessment of dataset size thresholds. Future research should address these gaps by engaging in rigorous benchmarking studies, investigating generative methods for data augmentation, and conducting detailed visualization analyses to foster model interpretability.
Overall, the paper provides a substantive foundation for understanding the current landscape of transfer learning applications in medical image diagnostics and charts prospective paths for refining and augmenting the utility of CNNs in this critical field.