Predicting breast tumor proliferation from whole-slide images: the TUPAC16 challenge (1807.08284v2)

Published 22 Jul 2018 in cs.CV

Abstract: Tumor proliferation is an important biomarker indicative of the prognosis of breast cancer patients. Assessment of tumor proliferation in a clinical setting is a highly subjective and labor-intensive task. Previous efforts to automate tumor proliferation assessment by image analysis only focused on mitosis detection in predefined tumor regions. However, in a real-world scenario, automatic mitosis detection should be performed in whole-slide images (WSIs) and an automatic method should be able to produce a tumor proliferation score given a WSI as input. To address this, we organized the TUmor Proliferation Assessment Challenge 2016 (TUPAC16) on prediction of tumor proliferation scores from WSIs. The challenge dataset consisted of 500 training and 321 testing breast cancer histopathology WSIs. In order to ensure fair and independent evaluation, only the ground truth for the training dataset was provided to the challenge participants. The first task of the challenge was to predict mitotic scores, i.e., to reproduce the manual method of assessing tumor proliferation by a pathologist. The second task was to predict the gene-expression-based PAM50 proliferation scores from the WSI. The best performing automatic method for the first task achieved a quadratic-weighted Cohen's kappa score of $\kappa$ = 0.567, 95% CI [0.464, 0.671] between the predicted scores and the ground truth. For the second task, the predictions of the top method had a Spearman's correlation coefficient of r = 0.617, 95% CI [0.581, 0.651] with the ground truth. This was the first study that investigated tumor proliferation assessment from WSIs. The achieved results are promising given the difficulty of the tasks and the weakly-labelled nature of the ground truth. However, further research is needed to improve the practical utility of image analysis methods for this task.

Summary

The paper shows that deep learning methods can predict tumor proliferation scores from whole-slide images with moderate agreement, addressing the subjectivity and labor cost of manual assessment.
Participating teams used diverse CNN-based methods, including ROI identification, and ensembling of top methods was also evaluated. Key metrics include a quadratic weighted kappa of 0.567 for the best automatic mitotic score prediction and a Spearman correlation of 0.710 for a semi-automatic approach to PAM50 score prediction.
The study highlights the potential of integrating imaging and genomic data to enhance prognostication, paving the way for improved diagnostic workflows in breast cancer.

Tumor Proliferation Prediction from Whole-Slide Images: Insights from the TUPAC16 Challenge

The 2016 TUmor Proliferation Assessment Challenge (TUPAC16) marks a significant step towards automating tumor proliferation scoring from whole-slide images (WSIs) in breast cancer histopathology. The task matters because tumor proliferation is a key biomarker for the prognosis and treatment of breast cancer. Traditional assessment by pathologists is subjective and labor-intensive, which drives the need for computational pathology solutions.

Challenge Structure and Tasks

The TUPAC16 challenge was divided into multiple tasks, with the primary ones being the prediction of mitotic scores and of gene-expression-based PAM50 proliferation scores from WSIs. A dataset comprising 821 WSIs from The Cancer Genome Atlas served as the basis for this endeavor, with 500 images allocated to training and 321 to testing. Notably, the ground truth for the testing dataset was withheld from participants to ensure objective evaluation.

Methods and Approaches

The methodologies employed by participating teams predominantly featured deep convolutional neural networks (CNNs) as a fundamental component. Broadly, methodologies bifurcated into two paradigms: a traditional two-step approach involving region of interest (ROI) identification followed by mitosis detection, and a direct approach relying on region-level features without explicit mitosis detection.

Pre-processing steps commonly involved staining normalization to address variability in WSI appearance. ROI detection strategies varied across teams, with some leveraging convolutional autoencoders or supervised classification models trained on manually annotated ROIs. For mitosis detection, methods ranged from shallow CNNs to deeper architectures like ResNet, with additional techniques such as hard negative mining employed to counter data imbalance.
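The hard negative mining step mentioned above can be illustrated with a minimal sketch. It assumes a mitosis detector has already scored a set of candidate patches, and it selects the non-mitosis candidates that the current model scores most confidently as mitoses, so that they can be added to the next training round. The function name, the 0.5 threshold, and the `max_count` cap are hypothetical choices for illustration and are not taken from any participating team's pipeline.

```python
def mine_hard_negatives(scores, labels, threshold=0.5, max_count=None):
    """Return indices of negative samples that the current model gets wrong.

    scores: predicted mitosis probabilities, one per candidate patch.
    labels: ground-truth labels (1 = mitosis, 0 = non-mitosis).
    threshold: hypothetical decision threshold for calling a detection.
    max_count: optional cap on how many hard negatives to keep.
    """
    # A hard negative is a true non-mitosis that the model scores as a mitosis.
    hard = [
        i for i, (score, label) in enumerate(zip(scores, labels))
        if label == 0 and score >= threshold
    ]
    # Most confident false positives first.
    hard.sort(key=lambda i: scores[i], reverse=True)
    return hard if max_count is None else hard[:max_count]


if __name__ == "__main__":
    scores = [0.9, 0.2, 0.7, 0.95, 0.6]
    labels = [0, 0, 1, 0, 0]
    print(mine_hard_negatives(scores, labels, max_count=2))
```

Retraining on the selected patches focuses the detector on the mitosis look-alikes that dominate the false positives, which is one common way to handle the heavy class imbalance between mitotic and non-mitotic candidates.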

For score prediction, feature extraction from ROIs was a prevalent strategy, often coupled with classifiers or heuristic methods for slide-level score determination. This diversity in methodologies underscores the experimental nature of the challenge, reflecting the ongoing exploration of optimal strategies in computational pathology.

Results and Evaluation

In the mitotic score prediction task, the leading automatic approach, LUNIT, achieved a quadratic weighted kappa of κ = 0.567, indicating moderate agreement with the manual pathologist assessments. Methodological variations, such as staining normalization and different neural network architectures, underpinned these results. Meanwhile, in the gene expression-based PAM50 score prediction, the semi-automatic MICROSOFT approach demonstrated the highest Spearman correlation (r = 0.710), highlighting the potential advantage of hybrid models that integrate automatic algorithms with manual oversight.
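For readers unfamiliar with the two evaluation metrics, the following sketch shows how they are commonly computed: quadratic weighted Cohen's kappa for ordinal scores, and Spearman's correlation as Pearson's correlation on tie-averaged ranks. This is a plain-Python illustration of the standard definitions, not the challenge's own evaluation code, and the function names are chosen here for clarity.

```python
def quadratic_weighted_kappa(y_true, y_pred, labels):
    """Cohen's kappa with quadratic disagreement weights over ordered labels."""
    k = len(labels)
    index = {label: i for i, label in enumerate(labels)}
    n = len(y_true)
    # Observed confusion matrix: rows are true scores, columns are predictions.
    observed = [[0.0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            num += weight * observed[i][j]
            # Expected count under chance agreement, from the marginals.
            den += weight * row[i] * col[j] / n
    return 1.0 - num / den


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

The quadratic weights penalize a prediction that is two score levels away from the ground truth four times as heavily as one that is a single level away, which suits an ordinal target such as the mitotic score.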

Ensembling experiments revealed potential performance enhancements, with the averaged predictions of top methods yielding improvements in kappa statistics and correlation coefficients. This suggests that ensemble techniques may counterbalance individual model biases, a conclusion of practical import for ongoing research.
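Averaging the predictions of several methods can be sketched as follows. The averaging itself follows the description above; mapping the averaged value back to a discrete score by rounding and clipping is an assumption made here for illustration, as are the function names and the default score range of 1 to 3.

```python
def average_predictions(per_model_predictions):
    """Average per-slide predictions across models.

    per_model_predictions: one list of per-slide predictions per model,
    all in the same slide order.
    """
    n_models = len(per_model_predictions)
    return [sum(values) / n_models for values in zip(*per_model_predictions)]


def to_discrete_score(value, low=1, high=3):
    """Round an averaged score to the nearest integer and clip to [low, high].

    Assumption: round-half-up on positive values; the score range is a
    hypothetical default, not taken from the source.
    """
    return max(low, min(high, int(value + 0.5)))


if __name__ == "__main__":
    predictions = [
        [1, 2, 3],  # model A
        [3, 2, 3],  # model B
        [2, 2, 3],  # model C
    ]
    averaged = average_predictions(predictions)
    print(averaged, [to_discrete_score(v) for v in averaged])
```

For a continuous target such as the PAM50 proliferation score, the averaged values can be used directly without the rounding step.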

Implications and Future Directions

TUPAC16's results suggest that automatic models, while promising, have not yet reached the reliability needed to replace manual assessment or to serve as a conclusive second-opinion tool. The domain shift inherent in the dataset, coupled with the challenge of predicting multi-scale pathological phenomena from global slide features, underscores the complexity of the task.

These results point to the need for standardized, modular submissions in future challenges, so that the contribution of each pipeline component can be assessed separately. Domain adaptation remains an open challenge and calls for methods that generalize across varied staining and imaging conditions.

The observed success in predicting molecular scores directly from WSIs suggests a latent morphological signature that correlates with molecular alterations, warranting further exploration into the convergence of genomic and imaging data.

In sum, TUPAC16 has set a foundational benchmark in this domain and invites further research into enhanced image analysis pipelines, domain adaptation techniques, and the integration of molecular data for robust cancer prognostication. As breast cancer remains a significant clinical challenge, such automated tools could streamline and improve diagnostic workflows in the future.