BCN20000: Dermoscopic Lesions in the Wild (1908.02288v2)

Published 6 Aug 2019 in eess.IV and cs.CV

Abstract: This article summarizes the BCN20000 dataset, composed of 19424 dermoscopic images of skin lesions captured from 2010 to 2016 in the facilities of the Hospital Clínic in Barcelona. With this dataset, we aim to study the problem of unconstrained classification of dermoscopic images of skin cancer, including lesions found in hard-to-diagnose locations (nails and mucosa), large lesions which do not fit in the aperture of the dermoscopy device, and hypo-pigmented lesions. The BCN20000 will be provided to the participants of the ISIC Challenge 2019, where they will be asked to train algorithms to classify dermoscopic images of skin cancer automatically.

Citations (377)

Summary

  • The paper introduces BCN20000, a dataset of 19,424 high-quality dermoscopic images aimed at enhancing automated skin lesion classification.
  • It details a rigorous curation process at the Hospital Clínic de Barcelona, capturing diverse and challenging lesion types: lesions in hard-to-diagnose locations (nails and mucosa), large lesions that do not fit in the dermoscope aperture, and hypo-pigmented lesions.
  • The dataset supports ISIC challenge participation and future research, promoting the improvement of deep learning models for robust dermatological diagnostics.

An Overview of the BCN20000 Dataset for Dermoscopic Image Classification

The paper "BCN20000: Dermoscopic Lesions in the Wild" presents a curated dataset intended to advance automated classification of dermoscopic images for skin cancer diagnosis. BCN20000 comprises 19,424 dermoscopic images of skin lesions collected from 2010 to 2016 at the Hospital Clínic in Barcelona.

Purpose and Scope

The primary aim of the BCN20000 dataset is to support the study of unconstrained classification of dermoscopic images, a critical task in dermatological diagnostics. The dataset deliberately includes difficult cases: lesions in hard-to-diagnose locations such as nails and mucosa, large lesions that exceed the field of view of the dermoscopy device, and hypo-pigmented lesions. This diversity reflects the complex scenarios dermatologists face in clinical practice.

Methodological Approach

The creation of the BCN20000 dataset involved an extensive data collection and curation process. Over 16 years, the dermatology department at the Hospital Clínic de Barcelona systematically amassed dermoscopic images using high-resolution cameras equipped with dermoscopic attachments. For this dataset, images taken between 2010 and 2016 were organized, filtered with computer vision algorithms, linked to diagnostic data, and checked for diagnostic plausibility by multiple expert readers. The resulting dataset covers 5,583 skin lesions and was compiled with institutional ethics approval.
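Because 19,424 images cover only 5,583 lesions, many lesions appear in more than one image. One practical consequence, sketched below, is that a train/validation split is usually made per lesion rather than per image, so that views of the same lesion do not end up on both sides. This is an illustrative sketch, not a procedure described in the paper: the function name, the `(image_id, lesion_id)` record layout, and the toy identifiers are all assumptions.

```python
import random
from collections import defaultdict


def split_by_lesion(records, val_fraction=0.2, seed=0):
    """Split (image_id, lesion_id) pairs so no lesion spans both subsets.

    The record layout is a hypothetical one chosen for this sketch.
    """
    # Group image identifiers under the lesion they depict.
    by_lesion = defaultdict(list)
    for image_id, lesion_id in records:
        by_lesion[lesion_id].append(image_id)

    # Shuffle lesions (not images) deterministically.
    lesion_ids = sorted(by_lesion)
    random.Random(seed).shuffle(lesion_ids)

    # Hold out a fraction of the lesions, at least one.
    n_val = max(1, round(len(lesion_ids) * val_fraction))
    val_lesions = set(lesion_ids[:n_val])

    train = [img for les in lesion_ids[n_val:] for img in by_lesion[les]]
    val = [img for les in lesion_ids[:n_val] for img in by_lesion[les]]
    return train, val, val_lesions


# Toy records: hypothetical identifiers, not real BCN20000 entries.
records = [
    ("img_001", "lesion_A"), ("img_002", "lesion_A"),
    ("img_003", "lesion_B"), ("img_004", "lesion_C"),
    ("img_005", "lesion_C"), ("img_006", "lesion_D"),
    ("img_007", "lesion_E"),
]
train_ids, val_ids, val_lesions = split_by_lesion(records, val_fraction=0.4)
```

Splitting on lesions trades exact control over the image-level ratio for a guarantee that validation scores are not inflated by near-duplicate views.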

Dataset Characteristics and Usage

The dataset's images are categorized into several dermatological conditions, including nevus, melanoma, basal cell carcinoma, and seborrheic keratosis, among others. Each image is accompanied by metadata on the lesion's anatomical location and the patient's demographics (age and sex). This contextual information makes it possible to develop algorithms that use the same cues dermatologists rely on when forming a diagnosis.
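To make the per-image metadata concrete, the following minimal sketch parses a small CSV into typed records. The column names (`image_id`, `diagnosis`, `anatomical_site`, `age`, `sex`) and the sample rows are assumptions invented for illustration; the actual files distributed through the ISIC Archive may use different names and encodings, and fields may be missing.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class LesionImage:
    image_id: str
    diagnosis: str
    anatomical_site: Optional[str]  # None when not recorded
    age: Optional[int]              # None when not recorded
    sex: Optional[str]              # None when not recorded


def parse_metadata(text):
    """Parse CSV text into LesionImage records (hypothetical column names)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append(LesionImage(
            image_id=row["image_id"],
            diagnosis=row["diagnosis"],
            anatomical_site=row["anatomical_site"] or None,
            age=int(row["age"]) if row["age"] else None,
            sex=row["sex"] or None,
        ))
    return rows


# Made-up sample rows, not real dataset entries.
SAMPLE = """image_id,diagnosis,anatomical_site,age,sex
img_001,nevus,anterior torso,45,female
img_002,melanoma,,70,male
img_003,basal cell carcinoma,head/neck,,
"""

images = parse_metadata(SAMPLE)
diagnosis_counts = Counter(img.diagnosis for img in images)
```

Representing missing values as `None` rather than empty strings keeps downstream code from silently treating an unrecorded age or site as a valid category.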

BCN20000 contributes to the ISIC 2019 Challenge, in which researchers develop algorithms that classify dermoscopic images into multiple diagnostic categories. Participants are also encouraged to design systems that recognize out-of-distribution inputs, that is, images that belong to none of the training categories, which improves reliability on unseen data. The dataset is also accessible through the ISIC Archive as a resource for ongoing research and algorithm development.
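One simple way to flag out-of-distribution inputs is to reject predictions whose maximum softmax probability falls below a threshold. The sketch below illustrates that common baseline only; it is not the method prescribed by the paper or the challenge, and the class list, logits, and threshold value are assumptions chosen for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(logits, class_names, threshold=0.5):
    """Return (label, confidence); label is 'unknown' below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return class_names[best], probs[best]


# Illustrative subset of categories and made-up classifier outputs.
CLASSES = ["nevus", "melanoma", "basal cell carcinoma", "seborrheic keratosis"]
confident = predict_with_rejection([4.0, 0.5, 0.2, 0.1], CLASSES)
uncertain = predict_with_rejection([1.0, 0.9, 1.1, 1.0], CLASSES)
```

In practice the threshold would be tuned on held-out data, and softmax confidence is known to be overconfident on some unfamiliar inputs, so this should be read as a starting point rather than a reliable detector.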

Implications for Future Research

The BCN20000 dataset is a significant contribution to automated skin cancer diagnostics. By providing dermoscopic images that span a wide range of challenging cases encountered in real-world settings, it lays a foundation for improving algorithm robustness and accuracy. Future research can use the dataset to refine convolutional neural network architectures and transfer learning approaches for complex image data, and to explore more advanced machine learning techniques that narrow the gap between human expert performance and automated systems in dermatology.

As dermoscopic images become increasingly prevalent in clinical settings, the development of accurate and reliable automated classification systems is paramount. The BCN20000 dataset offers a crucial platform for driving such innovations, ultimately aiming to enhance diagnostic accuracy and improve patient outcomes in dermatology.