Deep Learning Ensembles for Melanoma Recognition in Dermoscopy Images (1610.04662v2)

Published 14 Oct 2016 in cs.CV

Abstract: Melanoma is the deadliest form of skin cancer. While curable with early detection, only highly trained specialists are capable of accurately recognizing the disease. As expertise is in limited supply, automated systems capable of identifying disease could save lives, reduce unnecessary biopsies, and reduce costs. Toward this goal, we propose a system that combines recent developments in deep learning with established machine learning approaches, creating ensembles of methods that are capable of segmenting skin lesions, as well as analyzing the detected area and surrounding tissue for melanoma detection. The system is evaluated using the largest publicly available benchmark dataset of dermoscopic images, containing 900 training and 379 testing images. New state-of-the-art performance levels are demonstrated, leading to an improvement in the area under receiver operating characteristic curve of 7.5% (0.843 vs. 0.783), in average precision of 4% (0.649 vs. 0.624), and in specificity measured at the clinically relevant 95% sensitivity operating point 2.9 times higher than the previous state-of-the-art (36.8% specificity compared to 12.5%). Compared to the average of 8 expert dermatologists on a subset of 100 test images, the proposed system produces a higher accuracy (76% vs. 70.5%), and specificity (62% vs. 59%) evaluated at an equivalent sensitivity (82%).

Authors (7)

Noel Codella
Quoc-Bao Nguyen
Sharath Pankanti
David Gutman
Brian Helba
Allan Halpern
John R. Smith

Citations (505)

Summary

The paper "Deep Learning Ensembles for Melanoma Recognition in Dermoscopy Images" by Codella et al. presents a system that combines deep learning with established machine learning methods in ensembles that segment skin lesions and classify them as melanoma in dermoscopic images. Melanoma is the deadliest form of skin cancer but is curable when detected early. Because accurate recognition requires highly trained specialists, who are in limited supply, the authors argue that automated recognition could save lives, reduce unnecessary biopsies, and lower costs.

Methodology

The proposed system integrates traditional hand-coded feature extractors, sparse-coding methods, and SVMs with modern deep learning techniques such as Deep Residual Networks and fully convolutional networks. The ensemble addresses two tasks: lesion segmentation and melanoma classification, with the classifier analyzing both the detected lesion area and the surrounding tissue. The system is evaluated on the ISIC benchmark, which the abstract describes as the largest publicly available dataset of dermoscopic images, containing 900 training and 379 testing images.
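
One simple way to combine the outputs of several classifiers is late fusion, where each model scores every image and the scores are averaged. The summary does not state which fusion rule the authors used, so the unweighted average below is an assumption, and the model names and scores are hypothetical values chosen only for illustration.

```python
from statistics import fmean


def fuse_scores(per_model_scores):
    """Average melanoma probabilities across models, image by image.

    per_model_scores: one inner list per model, each holding one probability
    in [0, 1] per image, with the same image order for every model.
    Unweighted averaging is an assumption, not the paper's confirmed rule.
    """
    if not per_model_scores:
        raise ValueError("need at least one model")
    n_images = len(per_model_scores[0])
    if any(len(scores) != n_images for scores in per_model_scores):
        raise ValueError("every model must score the same images")
    # zip(*...) regroups the scores so each tuple holds one image's scores.
    return [fmean(image_scores) for image_scores in zip(*per_model_scores)]


# Hypothetical scores for three images from three made-up ensemble members.
hand_coded = [0.20, 0.70, 0.90]
sparse_coding = [0.40, 0.50, 0.80]
deep_features = [0.30, 0.90, 1.00]

fused = fuse_scores([hand_coded, sparse_coding, deep_features])
print(fused)
```

Averaging lets members that make different kinds of errors partially cancel each other out, which is the usual motivation for mixing hand-coded, sparse-coding, and deep features.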

Results

The system achieves new state-of-the-art performance on multiple metrics:

AUROC (area under the receiver operating characteristic curve): A 7.5% increase (0.843 vs. 0.783) compared to prior work.
Average Precision: A 4% increase (0.649 vs. 0.624).
Specificity at 95% Sensitivity: 2.9 times higher than the previous state of the art (36.8% vs. 12.5%).

These gains suggest that fusing diverse machine learning approaches outperforms any single approach. The segmentation component, which localizes the lesion so that the classifier can analyze the detected area and the surrounding tissue, is reported to reach human-level performance, making it a reliable basis for the subsequent classification stage.
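
To make two of the reported metrics concrete, the sketch below computes AUROC as the fraction of (melanoma, benign) pairs ranked correctly, and specificity at a fixed sensitivity by scanning candidate thresholds. The function names and the toy labels and scores are assumptions made for illustration and are not taken from the paper.

```python
def auroc(labels, scores):
    """Probability that a random melanoma outscores a random benign lesion."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need both melanoma and benign examples")
    # Each correctly ordered pair counts 1, each tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(labels, scores, target_sensitivity=0.95):
    """Best specificity among thresholds whose sensitivity meets the target.

    labels: 1 for melanoma, 0 for benign. Higher scores are more suspicious.
    An image is called positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both melanoma and benign examples")
    best = 0.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        if tp / positives >= target_sensitivity:
            best = max(best, tn / negatives)
    return best


# Toy data: four melanomas followed by four benign lesions.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.1]

print(auroc(labels, scores))
print(specificity_at_sensitivity(labels, scores, 0.95))
```

Fixing sensitivity first reflects the clinical constraint that few melanomas may be missed. Specificity then measures how many benign lesions are still correctly cleared under that constraint.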

Comparison with Dermatologists

The paper also compares the system against human experts. On a subset of 100 test images, the automated system exceeded the average of eight expert dermatologists in accuracy (76% vs. 70.5%) and in specificity (62% vs. 59%) when evaluated at an equivalent sensitivity (82%).
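
A comparison against an averaged panel of readers can be sketched as follows: compute sensitivity, specificity, and accuracy for each reader from their binary calls, then average each metric across readers. Whether the paper averaged per-reader metrics in exactly this way is an assumption, and the two readers and four cases below are invented for illustration.

```python
from statistics import fmean


def reader_metrics(labels, calls):
    """Sensitivity, specificity, and accuracy for one reader's binary calls.

    Assumes labels contain at least one melanoma (1) and one benign case (0).
    """
    tp = sum(1 for y, c in zip(labels, calls) if y == 1 and c == 1)
    fn = sum(1 for y, c in zip(labels, calls) if y == 1 and c == 0)
    tn = sum(1 for y, c in zip(labels, calls) if y == 0 and c == 0)
    fp = sum(1 for y, c in zip(labels, calls) if y == 0 and c == 1)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }


def average_readers(labels, calls_per_reader):
    """Average each metric over readers (assumed aggregation scheme)."""
    per_reader = [reader_metrics(labels, calls) for calls in calls_per_reader]
    return {name: fmean(m[name] for m in per_reader) for name in per_reader[0]}


# Invented example: two readers, two melanomas followed by two benign lesions.
truth = [1, 1, 0, 0]
reader_a = [1, 1, 1, 0]  # catches both melanomas, one false alarm
reader_b = [1, 0, 0, 0]  # misses one melanoma, no false alarms

panel = average_readers(truth, [reader_a, reader_b])
print(panel)
```

The panel's average sensitivity then serves as the operating point at which the automated system's threshold is set, so that specificity and accuracy are compared on equal terms.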

Discussion

The paper underscores several insights:

Evaluation Metric Considerations: The authors advocate reevaluating the metrics used to assess clinical effectiveness, as average precision alone does not fully capture system performance.
Value of Ensembles: Multi-technique ensembles, combining deep learning with traditional computer vision, demonstrate notable improvements over deep learning alone.
Data Augmentation: Dynamic, non-linear data augmentation makes it feasible to train large neural networks on small datasets.
Transfer Learning: Utilizing deep networks pre-trained on non-medical datasets (e.g., ImageNet) offers valuable additional features for medical image classification, corroborating prior findings.
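
The idea behind dynamic augmentation can be sketched as drawing a fresh random transform every time an image is sampled, so the network rarely sees the same input twice. The example below uses only 90-degree rotations and mirroring on a tiny grid. This is a deliberate simplification: it does not reproduce the non-linear distortions the paper refers to, and all names in it are hypothetical.

```python
import random


def rotate90(image):
    """Rotate a row-major 2D image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a row-major 2D image left to right."""
    return [row[::-1] for row in image]


def augment(image, rng):
    """Return a randomly rotated and possibly mirrored copy of the image."""
    out = [list(row) for row in image]
    # Apply zero to three quarter turns, chosen uniformly at random.
    for _ in range(rng.randrange(4)):
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out


# A tiny stand-in for a dermoscopic image; real inputs would be pixel arrays.
image = [[1, 2],
         [3, 4]]
rng = random.Random(0)  # seeded so runs are repeatable

# "Dynamic" here means a new variant is generated on every draw.
variants = [augment(image, rng) for _ in range(5)]
print(variants)
```

Because skin lesions have no canonical orientation, rotations and flips preserve the label while multiplying the effective size of a 900-image training set.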

Future Work

Future work might expand the ISIC dataset to cover more dermoscopic patterns, enrich the classification models with semantic descriptors, and adopt more complex ensemble learning methods as datasets grow. Integrating patient metadata and contextual lesion analysis could further improve the system's decisions.

In conclusion, the paper advances automated melanoma detection and lays a foundation for potentially life-saving tools in areas with limited access to dermatologists. Its ensemble framework shows the value of combining classical and contemporary machine learning methods for medical image analysis.
