- The paper reviews 102 studies (2012-2020) applying ImageNet-pretrained CNNs for medical image analysis, noting common sites (eye, breast, brain) and modalities (X-Ray, MRI).
- Fine-tuning with data augmentation is favored for larger datasets, while feature extraction suits smaller ones; visualization methods help interpret CNN decisions.
- Findings guide future research by highlighting suitable CNN architectures for specific image types, suggesting gaps in benchmarking and data augmentation, and emphasizing model interpretability.
A Review of Transfer Learning Applications in Medical Image Analysis Using ImageNet
The paper "A Scoping Review of Transfer Learning Research on Medical Image Analysis Using ImageNet" by Mohammad Amin Morid et al. presents a comprehensive overview of transfer learning (TL) with convolutional neural networks (CNNs) in medical image analysis. The scoping review examines study characteristics such as input data, the CNN models employed, which parameters are transferred, and the performance measures reported. By focusing on CNNs pre-trained on the non-medical ImageNet dataset, the paper highlights current trends and prevalent methodologies in medical image classification tasks.
Key Findings
The authors reviewed 102 studies published between 2012 and 2020, spanning diverse anatomical sites, imaging modalities, and CNN architectures. The reviewed studies underscore the dominance of certain CNN models and imaging modalities in the landscape of medical image analysis. The most frequently studied anatomical sites are the eye, breast, and brain, which account for 18%, 14%, and 12% of studies, respectively. X-Ray and MRI were the most widely used imaging modalities. Among CNN architectures, Inception-V3, VGG-16, AlexNet, and ResNet-50 were utilized most frequently.
Data augmentation emerged as a prevalent strategy in fine-tuning TL studies, applied in 72% of cases, compared to only 15% in feature-extraction approaches. The paper also identifies a shift towards binary classification in medical image analysis, accounting for 71% of the reviewed studies.
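To make the augmentation strategy concrete, the sketch below shows the kind of simple, label-preserving transformations (flips, rotations, small noise) commonly applied in such studies. The function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator):
    """Yield simple label-preserving variants of a 2-D grayscale image
    (values assumed to be in [0, 1])."""
    yield np.fliplr(image)                          # horizontal flip
    yield np.rot90(image)                           # 90-degree rotation
    noisy = image + rng.normal(0.0, 0.01, image.shape)
    yield np.clip(noisy, 0.0, 1.0)                  # small Gaussian noise
```

In practice each variant keeps the original study label, multiplying the effective training-set size, which is why augmentation pairs naturally with the data-hungry fine-tuning approach.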
Methodological Insights
Transfer learning approaches are categorized into feature-extraction and fine-tuning models. In feature extraction, the pre-trained CNN serves as a fixed encoder whose outputs feed a separate classifier; in fine-tuning, some or all of the pre-trained weights are further updated on the medical dataset. Fine-tuning was favored in studies with larger datasets, underscoring the importance of sufficient data volume for effective model training, while feature extraction was preferred with smaller datasets, where it leverages the pre-trained model's knowledge with less risk of overfitting.
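The distinction between the two strategies can be sketched in PyTorch. The tiny backbone here is a hypothetical stand-in for an ImageNet-pretrained network such as ResNet-50 (which in real work would be loaded with its pre-trained weights).

```python
import torch.nn as nn

# Stand-in for an ImageNet-pretrained backbone (hypothetical; real studies
# would load e.g. a pre-trained ResNet-50 from torchvision).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(8, 2)  # new task-specific head (binary, per most studies)

# Feature extraction: freeze the backbone so only the new head trains.
for p in backbone.parameters():
    p.requires_grad = False

# Fine-tuning would instead leave some or all backbone layers trainable:
# for p in backbone.parameters():
#     p.requires_grad = True

model = nn.Sequential(backbone, head)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
```

With the backbone frozen, only the head's parameters receive gradient updates, which is why this variant works with far smaller datasets.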
The review reveals that visualization methods like heatmaps, deconvolution, and activation maximization were employed in 33% of studies to interpret CNN models. Such visualization strategies are crucial for understanding and validating the critical features used in diagnostic prediction by CNNs.
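One simple, model-agnostic way to produce such a heatmap is occlusion sensitivity: mask each region of the input in turn and record how much the model's score drops. The sketch below is a minimal illustration; `score_fn` is a hypothetical stand-in for a CNN's class-probability output.

```python
import numpy as np

def occlusion_heatmap(image: np.ndarray, score_fn, patch: int = 2) -> np.ndarray:
    """Score drop when each patch of a 2-D image is zeroed out.
    Large values mark regions the model relies on."""
    base = score_fn(image)
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - score_fn(occluded)
    return heat
```

Overlaying the resulting grid on the original image highlights the regions driving the prediction, which is the kind of evidence clinicians can check against their own reading of the scan.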
Implications and Future Directions
From a practical perspective, the findings could guide future research in selecting the most suitable CNN architectures and transfer learning techniques for specific medical image analysis tasks. The prominence of wide networks for ultrasound, endoscopic, and skeletal system X-ray images suggests their suitability for the features of those modalities. Meanwhile, shallow networks with small kernels appeared optimal for eye, skin, and dental images, where they capture fine textural changes.
The theoretical implications of this research emphasize the potential for improved model interpretability through advanced visualization methods. Visualization offers opportunities for embedding domain knowledge into AI models, thus increasing the trust of medical professionals in algorithmic predictions.
The authors identify significant gaps in the existing literature, such as insufficient benchmarking across models, limited exploration of deep networks in less frequent imaging modalities, and inadequate assessment of dataset size thresholds. Future research should address these gaps by engaging in rigorous benchmarking studies, investigating generative methods for data augmentation, and conducting detailed visualization analyses to foster model interpretability.
Overall, the paper provides a substantive foundation for understanding the current landscape of transfer learning applications in medical image diagnostics and charts prospective paths for refining and augmenting the utility of CNNs in this critical field.