Data Augmentation for Skin Lesion Analysis (1809.01442v1)

Published 5 Sep 2018 in cs.CV

Abstract: Deep learning models show remarkable results in automated skin lesion analysis. However, these models demand considerable amounts of data, while the availability of annotated skin lesion images is often limited. Data augmentation can expand the training dataset by transforming input images. In this work, we investigate the impact of 13 data augmentation scenarios for melanoma classification trained on three CNNs (Inception-v4, ResNet, and DenseNet). Scenarios include traditional color and geometric transforms, and more unusual augmentations such as elastic transforms, random erasing and a novel augmentation that mixes different lesions. We also explore the use of data augmentation at test-time and the impact of data augmentation on various dataset sizes. Our results confirm the importance of data augmentation in both training and testing and show that it can lead to more performance gains than obtaining new images. The best scenario results in an AUC of 0.882 for melanoma classification without using external data, outperforming the top-ranked submission (0.874) for the ISIC Challenge 2017, which was trained with additional data.

Authors

Fábio Perez
Cristina Vasconcelos
Sandra Avila
Eduardo Valle

Summary

The paper "Data Augmentation for Skin Lesion Analysis" investigates how various data augmentation strategies affect convolutional neural networks (CNNs) trained for melanoma classification. Because annotated skin lesion datasets are limited in size, the authors apply augmentation in both the training and testing phases to get more out of the available images.

Research Objectives and Methodology

The paper focuses on three prominent CNN architectures: Inception-v4, ResNet-152, and DenseNet-161. These architectures have shown strong results in image classification, and this work explores their potential for automated skin lesion analysis. The authors empirically evaluate thirteen augmentation scenarios, covering standard color and geometric transformations as well as less common approaches such as elastic transforms, random erasing, and a newly proposed augmentation that mixes different skin lesions.
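As a rough illustration of what such a scenario can look like, the sketch below chains a geometric transform (horizontal flip), a color transform (brightness scaling), and random erasing on a tiny grayscale image stored as nested lists. It is not the authors' implementation: the function names, the 0.5 flip probability, the 0.8 to 1.2 brightness range, and the erase-size limit are all assumptions chosen for the example, and the lesion-mixing augmentation is not shown.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image: rows of intensities in [0, 1]

def horizontal_flip(img: Image) -> Image:
    """Geometric transform: mirror every row."""
    return [row[::-1] for row in img]

def adjust_brightness(img: Image, factor: float) -> Image:
    """Color transform: scale intensities and clamp them to [0, 1]."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in img]

def random_erase(img: Image, rng: random.Random,
                 max_frac: float = 0.5, fill: float = 0.0) -> Image:
    """Random erasing: overwrite one random rectangle with a constant value."""
    h, w = len(img), len(img[0])
    eh = rng.randint(1, max(1, int(h * max_frac)))  # rectangle height
    ew = rng.randint(1, max(1, int(w * max_frac)))  # rectangle width
    top = rng.randint(0, h - eh)
    left = rng.randint(0, w - ew)
    out = [row[:] for row in img]  # copy so the input stays untouched
    for y in range(top, top + eh):
        for x in range(left, left + ew):
            out[y][x] = fill
    return out

def augment(img: Image, rng: random.Random) -> Image:
    """Apply the three transforms in sequence (assumed order and parameters)."""
    if rng.random() < 0.5:
        img = horizontal_flip(img)
    img = adjust_brightness(img, rng.uniform(0.8, 1.2))
    return random_erase(img, rng)
```

Calling `augment(image, random.Random(0))` once per training step would present the network with a different variant of each image every epoch, which is the general idea behind the training-time scenarios.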

The research also examines how data augmentation behaves across different training set sizes, with the aim of identifying the most effective strategies when data is scarce. Test-time augmentation is another focus: the authors analyze how augmenting images at prediction time affects generalization error.
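One way to set up a dataset-size experiment is to draw training subsets of several sizes and train the same model on each. The sketch below shows a class-stratified subsample so that the melanoma ratio stays roughly constant as the subset shrinks. The stratification, the function name, and the sizes in the usage note are assumptions for illustration; the summary does not say how the paper's subsets were drawn.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence

def stratified_subsample(labels: Sequence[Hashable], size: int,
                         seed: int = 0) -> List[int]:
    """Return sorted indices of about `size` items, preserving class ratios.

    Assumes 0 < size <= len(labels).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fraction = size / len(labels)
    chosen: List[int] = []
    for indices in by_class.values():
        # Keep at least one example of every class.
        k = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)
```

For example, with hypothetical labels `[1] * 20 + [0] * 80`, calling `stratified_subsample(labels, 50)` returns 50 indices, 10 of which point at the positive class. Repeating the call for a few sizes gives the subsets on which augmentation scenarios can be compared.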

Key Findings

The analysis demonstrates that the strategic application of data augmentation during both training and testing phases significantly enhances model performance:

The best-performing scenario combined geometric and color transformations, reaching an area under the curve (AUC) of 0.882 with ResNet on the ISIC Challenge 2017 dataset. This surpasses the top-ranked challenge submission (0.874), which was trained with additional data.
Test-time data augmentation, particularly with 144 image crops, improved performance more consistently than predicting from a single image.
Augmentation can yield larger performance gains than adding new images to the dataset, which makes it a practical way to get the most out of limited data. That said, augmentation does not reduce the value of acquiring additional labeled images.
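The two quantities behind these findings, a test-time-augmented prediction and the AUC, can be sketched in a few lines. The example assumes that predictions over augmented copies are combined with a simple mean and computes AUC through its pairwise-ranking definition. `toy_model` and `jitter` are hypothetical stand-ins for a trained CNN and a real augmentation pipeline.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence

def tta_predict(model: Callable, image, augment: Callable,
                n_copies: int, rng: random.Random) -> float:
    """Average the model score over n_copies augmented versions of one image."""
    return mean(model(augment(image, rng)) for _ in range(n_copies))

def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def toy_model(image: List[float]) -> float:
    """Hypothetical classifier: the score is the mean pixel intensity."""
    return sum(image) / len(image)

def jitter(image: List[float], rng: random.Random) -> List[float]:
    """Hypothetical augmentation: small random intensity noise per pixel."""
    return [p + rng.uniform(-0.05, 0.05) for p in image]

rng = random.Random(0)
images = [[0.9, 0.8], [0.4, 0.3], [0.5, 0.6], [0.1, 0.2]]
labels = [1, 1, 0, 0]
scores = [tta_predict(toy_model, img, jitter, 16, rng) for img in images]
print(auc(labels, scores))
```

The pairwise definition is quadratic in the number of samples, which is fine for a sketch; a library routine would be the usual choice for a full test set.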

Implications for Future Research

This paper provides substantial evidence that data augmentation is an important component for improving CNN performance in medical image analysis, particularly when annotated datasets are limited, as they are for skin lesions. The results suggest that a thoughtful combination of augmentation techniques can yield larger improvements than simply collecting more data.

For future work, the results suggest that more sophisticated generative methods, such as GAN-based data augmentation, might improve performance further. Augmentation strategies tailored to specific regions of interest (ROI) and to the inherent characteristics of dermatological images could also lead to additional gains.

This research contributes valuable insights that could pave the way for enhanced diagnostic tools in dermatology, leveraging AI's potential to assist in early and accurate melanoma detection. Continued exploration into augmentation's effects on model generalization and interpretability will further define its utility in clinical applications.

GitHub

GitHub - fabioperez/skin-data-augmentation: Source code for the paper 'Data Augmentation for Skin Lesion Analysis' — 🏆 Best Paper Award at the ISIC Skin Image Analysis Workshop @ MICCAI 2018 (85 stars)