Non-Melanoma Skin Cancer Segmentation Dataset

Updated 14 December 2025

The dataset is a collection of whole-slide images with pixel-level annotations for non-melanoma skin cancers, built for semantic segmentation research.
Patches are extracted at 2×, 5×, and 10× magnifications, supporting detailed tissue analysis and algorithm benchmarking with metrics such as the Dice score and mean IoU.
Widely used in computational pathology, it supports the development of CNN, GNN, and context-aware neural architectures for improved tissue delineation and automated histological analysis.

The Histopathology Non-Melanoma Skin Cancer Segmentation Dataset is a benchmark resource for quantitative tissue analysis in computational pathology. Built for semantic segmentation of histology images containing non-melanoma skin cancers, it enables the development and standardized assessment of automated methods for tissue-type delineation. It provides pixel-level annotations across multiple cancer types, supports patch-based machine learning workflows, and is widely referenced in the design of neural architectures that model spatial and biological context. The sections below cover the dataset's composition, technical protocols, evaluation standards, and downstream uses.

1. Provenance, Scope, and Intended Use

The dataset, formally named "Histopathology Non-Melanoma Skin Cancer Segmentation Dataset" (Thomas et al., University of Queensland, 2021; DOI: 10.14264/8be4bd0), was released by S. Thomas, N. Hamilton, and S. Thomas. It is intended for semantic segmentation tasks within histological images of non-melanoma skin cancers and facilitates automated tissue-type annotation for research and algorithmic benchmarking (Venkatraman et al., 7 Dec 2025). Covered cancer types include Basal Cell Carcinoma (BCC), Squamous Cell Carcinoma (SCC), and Intraepithelial Carcinoma (IEC).

The dataset offers whole-slide images (WSIs) sourced from clinical specimens, designed to support both conventional convolutional approaches and advanced tissue-relational architectures. Its establishment fills a previous gap for standardized and openly accessible skin cancer histopathology segmentation benchmarks, enabling reproducible model evaluation and development.

2. Dataset Composition: Image Counts, Magnifications, and Patch Extraction

The dataset comprises 290 WSIs:

140 BCC
60 SCC
90 IEC

WSIs are processed through overlapping patch extraction at pixel dimensions of 256 × 256, drawn at multiple magnifications: 2×, 5×, and 10×. The 10× magnification is the principal analysis level in published experiments, selected for its optimal balance between spatial resolution and computational overhead.

Patch extraction supports dense sampling for machine learning applications: multiple scales and substantial overlap between adjacent tiles expose models to a wide range of tissue appearance and spatial context.
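Overlapping 256 × 256 patch extraction can be sketched as a sliding window. This is a minimal illustration, not the dataset's own tooling: the function name `extract_patches`, the stride of 128 pixels (50% overlap), and the synthetic input array are all assumptions, since the source does not state the overlap used.

```python
import numpy as np

def extract_patches(image, patch_size=256, stride=128):
    """Yield (row, col, patch) for overlapping square patches.

    A stride smaller than patch_size produces overlap between adjacent
    tiles. stride=128 is an assumed value, not one documented for the dataset.
    Border regions that do not fit a full patch are skipped.
    """
    height, width = image.shape[:2]
    for row in range(0, height - patch_size + 1, stride):
        for col in range(0, width - patch_size + 1, stride):
            yield row, col, image[row:row + patch_size, col:col + patch_size]

# Synthetic stand-in for an RGB slide region at a single magnification.
region = np.zeros((512, 768, 3), dtype=np.uint8)
patches = list(extract_patches(region))
print(len(patches))  # 3 row offsets x 5 column offsets = 15 patches
```

In practice the slide would be read at the chosen magnification (for example 10×) with a whole-slide image reader before tiling; that step is omitted here.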

3. Annotation Protocols and Labeling

No explicit annotation protocol details appear in the cited manuscript (Venkatraman et al., 7 Dec 2025). The dataset's documentation does not specify:

The number or expertise of annotators (e.g., board-certified pathologists)
Annotation software or tooling
Quality-control procedures (such as consensus grading or reconciliatory review)
Inter-annotator agreement measures

Thus, although the dataset is widely used, its annotation reliability and workflow can only be inferred from its role as an established benchmark in published segmentation research. Downstream comparisons therefore implicitly assume that label quality is consistent with standard digital pathology practice, and reported results should be read with that caveat.

4. Class Definitions and Tissue-Type Hierarchies

Pixel-wise labels are assigned across $K$ tissue-type classes. Representative annotated classes include:

BKG (Background)
KER (Keratin)
INF (Inflammation)
RET (Reticular Dermis)
HYP (Hypodermis)
FOL (Hair Follicles)
Tumor compartments (epithelial basal layers in BCC/SCC/IEC)

Annotation granularity is restricted to single-class pixel assignments; no high-level hierarchy or structural grouping (e.g. tumor/stroma/epidermis) is documented. Each pixel is uniquely assigned a class label, supporting multi-class semantic segmentation model training.
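Because each pixel carries exactly one class label, a label mask can be converted to a one-hot array for multi-class training. The sketch below assumes a hypothetical index order and collapses the tumor compartments into a single placeholder name `TUMOR`; the dataset's actual label encoding may differ.

```python
import numpy as np

# Hypothetical index order for illustration only; not the dataset's encoding.
CLASS_NAMES = ["BKG", "KER", "INF", "RET", "HYP", "FOL", "TUMOR"]

def one_hot(mask, num_classes):
    """Convert an (H, W) integer label mask to an (H, W, K) one-hot array."""
    return np.eye(num_classes, dtype=np.float32)[mask]

mask = np.array([[0, 1],
                 [6, 3]])  # a tiny 2 x 2 label mask
encoded = one_hot(mask, len(CLASS_NAMES))
print(encoded.shape)  # (2, 2, 7)
```

Every pixel's one-hot vector sums to 1, which is the single-class assignment described above.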

5. Dataset Splits, Access, and Licensing

All benchmarked models in major studies are retrained with identical data splits and training schedules for fairness, though specific split counts (training, validation, test) and stratification rules are not reported (Venkatraman et al., 7 Dec 2025). The dataset is disseminated via the University of Queensland data repository (https://doi.org/10.14264/8be4bd0). The licensing regime is not detailed in the referenced manuscript, so users must consult the dataset’s repository for restrictions governing academic, non-commercial, or clinical use.

6. Benchmark Evaluation Metrics and Results

Core evaluation metrics supplied with the dataset and in published benchmarks include:

Pixel-wise Accuracy: Fraction of correctly classified pixels.
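Mean Intersection-over-Union (IoU): For $K$ classes, $\text{mIoU} = \frac{1}{K} \sum_{c=1}^{K} \frac{TP_c}{TP_c + FP_c + FN_c}$, where $TP_c$, $FP_c$, and $FN_c$ are the true-positive, false-positive, and false-negative pixel counts for class $c$.
Dice Similarity Coefficient: Commonly called the Dice score; for class $c$ it measures overlap as $\text{Dice}_c = \frac{2\,TP_c}{2\,TP_c + FP_c + FN_c}$.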
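The three metrics can all be derived from a single confusion matrix. The following is a minimal sketch under stated assumptions: the function names are illustrative, Dice is macro-averaged over classes, and classes absent from both prediction and ground truth are skipped in the averages. The source does not say how the published benchmarks handle either choice.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def segmentation_metrics(y_true, y_pred, num_classes):
    """Return (pixel accuracy, mean IoU, mean Dice) for integer label maps."""
    cm = confusion_matrix(y_true, y_pred, num_classes).astype(np.float64)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as c but actually another class
    fn = cm.sum(axis=1) - tp  # actually c but predicted as another class
    accuracy = tp.sum() / cm.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / (tp + fp + fn)
        dice = 2 * tp / (2 * tp + fp + fn)
    # Assumption: classes with no pixels at all (0/0) are left out of the mean.
    return accuracy, np.nanmean(iou), np.nanmean(dice)

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
acc, miou, mdice = segmentation_metrics(y_true, y_pred, num_classes=2)
print(acc, miou, mdice)
```

For this toy input the per-class IoU values are 1/2 and 2/3 and the per-class Dice values are 2/3 and 4/5, so Dice is never lower than IoU for the same class.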

The NTRM architecture (Venkatraman et al., 7 Dec 2025) is trained with a composite loss that sums cross-entropy terms on its final and initial segmentation outputs: $\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{ce}}(\hat y_{\text{final}}, y) + \lambda\,\mathcal{L}_{\text{ce}}(\hat y_{\text{init}}, y),\quad \lambda=0.4$. Each term is a class-weighted, pixel-wise cross-entropy: $\mathcal{L}_{\text{ce}}(\hat y, y) = -\frac{1}{N}\sum_{n=1}^N\sum_{c=1}^K w_c\, y_{n,c}\,\log(\hat y_{n,c})$. Here $N$ is the pixel count, $K$ is the number of classes, $w_c$ is the dynamic weight of class $c$, $y_{n,c}$ and $\hat y_{n,c}$ are the ground-truth indicator and predicted probability of class $c$ at pixel $n$, and $\lambda$ weights the initial-output term relative to the final-output term.
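The composite loss can be written out directly from the two formulas. This NumPy sketch is illustrative only and is not the NTRM implementation: the function names, the small `eps` added for numerical stability, and the fixed uniform class weights are assumptions, since the source does not describe how the dynamic weights $w_c$ are computed.

```python
import numpy as np

def weighted_cross_entropy(probs, onehot, class_weights, eps=1e-12):
    """Class-weighted cross-entropy averaged over N pixels.

    probs, onehot: arrays of shape (N, K); class_weights: shape (K,).
    eps guards against log(0) and is an implementation assumption.
    """
    per_pixel = np.sum(class_weights * onehot * np.log(probs + eps), axis=1)
    return -np.mean(per_pixel)

def total_loss(probs_final, probs_init, onehot, class_weights, lam=0.4):
    """L_total = L_ce(final) + lam * L_ce(init), with lam = 0.4 as in the text."""
    return (weighted_cross_entropy(probs_final, onehot, class_weights)
            + lam * weighted_cross_entropy(probs_init, onehot, class_weights))

# Two pixels, two classes, uniform predictions, uniform (placeholder) weights.
onehot = np.array([[1.0, 0.0], [0.0, 1.0]])
uniform = np.full((2, 2), 0.5)
weights = np.ones(2)
loss = total_loss(uniform, uniform, onehot, weights)
print(loss)  # approximately 1.4 * ln(2)
```

With uniform predictions each cross-entropy term equals ln 2, so the total is (1 + 0.4) ln 2, which makes the role of $\lambda$ easy to check by hand.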

Quantitative performance comparison (at 10× magnification):

Method	Accuracy	Mean IoU	Dice
DeepLabV3+	0.5061	0.4191	0.5038
UNet_VGG	0.6708	0.6002	0.7051
Attention UNet	0.7326	0.6326	0.7438
UNet_ResNet	0.7368	0.6763	0.7674
MiT	0.8310	0.6530	—
NTRM	0.8106	0.7288	0.8163

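On Dice, NTRM exceeds the other architectures with reported Dice values by 4.9 to 31.25 percentage points (0.8163 versus 0.7674 for UNet_ResNet and 0.5038 for DeepLabV3+), and it also attains the highest mean IoU. MiT reports higher pixel accuracy (0.8310 versus 0.8106) but no Dice value. These results support NTRM's relational modeling design for context-aware segmentation.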
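The reported Dice range can be reproduced from the table as absolute differences in percentage points. The short check below uses only the table's values; the relative gains are computed in addition, for comparison.

```python
# Dice values copied from the comparison table (MiT omitted: no Dice reported).
dice = {
    "DeepLabV3+": 0.5038,
    "UNet_VGG": 0.7051,
    "Attention UNet": 0.7438,
    "UNet_ResNet": 0.7674,
}
ntrm = 0.8163

# Absolute gap in percentage points and relative gain in percent.
gaps = {name: round((ntrm - score) * 100, 2) for name, score in dice.items()}
relative = {name: round((ntrm / score - 1) * 100, 2) for name, score in dice.items()}

print(min(gaps.values()), max(gaps.values()))  # 4.89 31.25
```

The smallest gap (UNet_ResNet) and the largest gap (DeepLabV3+) match the 4.9 to 31.25 range, so that range is an absolute difference in Dice; the corresponding relative gains are larger.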

7. Downstream Applications, Standardization, and Extensions

The dataset is central to the development and objective benchmarking of tissue segmentation algorithms, especially those integrating contextual tissue relationships. It underpins:

Training and evaluation of CNN and GNN-based segmenters
Transfer learning for related histopathology tasks
Morphometric and feature-based tumor microenvironment analysis

It also informs resource design for further large-scale annotation initiatives, such as the Histo-Miner NucSeg and TumSeg datasets for cutaneous squamous cell carcinoma (cSCC) (Sancéré et al., 7 May 2025), which offer expert-curated nucleus and tumor region annotations with additional granularity and feature extraction capabilities. The Queensland dataset's whole-slide, multi-cancer coverage and public availability establish it as a reference point for tasks involving tissue delineation, boundary detection, and biomarker discovery in skin oncology.

A plausible implication is that as relational graph analysis and context-aware deep learning become more prominent, future datasets may expand annotation granularity, include hierarchical and interaction labels, and formalize annotation protocols to further support reproducibility and clinical translation.