Weakly supervised multiple instance learning histopathological tumor segmentation (2004.05024v4)

Published 10 Apr 2020 in eess.IV, cs.CV, and cs.LG

Abstract: Histopathological image segmentation is a challenging and important topic in medical imaging with tremendous potential impact in clinical practice. State of the art methods rely on hand-crafted annotations which hinder clinical translation since histology suffers from significant variations between cancer phenotypes. In this paper, we propose a weakly supervised framework for whole slide imaging segmentation that relies on standard clinical annotations, available in most medical systems. In particular, we exploit a multiple instance learning scheme for training models. The proposed framework has been evaluated on multi-locations and multi-centric public data from The Cancer Genome Atlas and the PatchCamelyon dataset. Promising results when compared with experts' annotations demonstrate the potentials of the presented approach. The complete framework, including $6481$ generated tumor maps and data processing, is available at https://github.com/marvinler/tcga_segmentation.

Citations (84)

Summary

The paper proposes a weakly supervised multiple instance learning framework for automated histopathological tumor segmentation using only slide-level labels.
The model achieved robust performance on in-distribution and out-of-distribution data, with an AUC of 0.804, demonstrating its generalization capability.
The research offers a scalable approach to reduce annotation costs in digital pathology, enhancing diagnostic efficiency and clinical adoption.

Weakly Supervised Multiple Instance Learning for Histopathological Tumor Segmentation

The paper "Weakly supervised multiple instance learning histopathological tumor segmentation" presents an approach to automatic segmentation of histopathological images using weakly supervised learning. It addresses two obstacles to fully supervised methods in this setting: the substantial variability in tissue phenotype and the high cost of the detailed annotations those methods require.

Key Contributions

The paper introduces a framework that applies multiple instance learning (MIL) to train segmentation models from coarse, readily available clinical annotations. This sidesteps the need for exhaustive pixel-level annotations, which are costly and time-consuming to produce. Instead, the framework uses slide-level labels, which mark each slide as containing tumor tissue or not, to train models that make instance-level predictions. According to the summary, the approach is computationally efficient and could support more scalable and adaptable tumor screening in digital pathology workflows.

Methodology

Multiple Instance Learning Approach: The authors use a MIL framework suited to histopathological analysis, in which each slide is treated as a 'bag' of patches, or 'instances'. The deep learning model is trained to classify bags by aggregating instance-level embeddings into a slide-level prediction.
Weak Supervision: The central idea is that the training signal comes only from binary slide-level labels indicating the presence or absence of tumor tissue. The method generates proxy labels for the patches within each slide, using parameters that set the minimum proportion of patches labeled as tumor and the minimum proportion labeled as normal.
Training and Evaluation: The model architecture is based on ResNet50, a widely used convolutional backbone for image models. The framework was tested on multi-location data from The Cancer Genome Atlas (TCGA) and on the PatchCamelyon dataset, and several settings of the labeling thresholds were compared to find the configuration with the most accurate segmentation.
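The proxy-labeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `proxy_labels`, the parameter names `min_tumor_frac` and `min_normal_frac`, and the choice to threshold the remaining patches at 0.5 are all assumptions made for illustration. The only part taken from the summary is the idea that, on a tumor slide, a minimum proportion of patches is forced to each class.

```python
import math


def proxy_labels(probs, slide_is_tumor, min_tumor_frac=0.25, min_normal_frac=0.25):
    """Assign 0/1 proxy labels to the patches of one slide (one MIL bag).

    probs: current predicted tumor probability for each patch.
    slide_is_tumor: the slide-level (bag-level) label.
    min_tumor_frac / min_normal_frac: hypothetical names for the minimum
    proportions of patches forced to the tumor and normal classes.
    """
    n = len(probs)
    # A normal slide contains no tumor, so every patch is labeled normal.
    if not slide_is_tumor:
        return [0] * n

    # Assumption: patches not forced to a class are thresholded at 0.5.
    labels = [1 if p >= 0.5 else 0 for p in probs]

    # Patch indices sorted from lowest to highest predicted probability.
    order = sorted(range(n), key=lambda i: probs[i])
    n_tumor = math.ceil(min_tumor_frac * n)
    n_normal = math.ceil(min_normal_frac * n)

    # Force the lowest-scoring patches to normal.
    for i in order[:n_normal]:
        labels[i] = 0
    # Force the highest-scoring patches to tumor. This runs last, so it
    # takes precedence if the two forced sets overlap on a tiny bag.
    for i in order[n - n_tumor:]:
        labels[i] = 1
    return labels


if __name__ == "__main__":
    # Early in training the model may score every patch below 0.5;
    # the top 25% are still labeled tumor because the slide is positive.
    probs = [0.3, 0.2, 0.4, 0.45, 0.1, 0.25, 0.35, 0.05]
    print(proxy_labels(probs, slide_is_tumor=True))
```

In this sketch the forced minimums keep a tumor slide from collapsing to all-normal proxy labels, which is one plausible reading of why such parameters would balance the training signal.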

Results and Implications

The results show robust performance on both in-distribution data and more challenging out-of-distribution test sets. The best-performing configuration achieved an AUC of 0.804 against expert-provided ground truth labels. The model also performs reasonably across the datasets on which it was evaluated, which supports its ability to generalize.
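As a reading aid for the AUC figure, the metric can be interpreted as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It assumes binary expert labels and model scores are available per example; the function name `roc_auc` and the toy data are hypothetical, and the summary does not state at which granularity the paper computes its AUC.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 0/1 ground-truth labels (1 = tumor).
    scores: predicted tumor scores, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not results from the paper.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of examples, so it is only suitable for small checks; a rank-based implementation would be the usual choice at slide scale.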

The implications of this research are manifold. Practically, the use of weakly supervised MIL models could substantially reduce the time and cost associated with manual tissue annotations, facilitating broader adoption of digital pathology tools in clinical settings. The segmentation maps produced can enhance diagnostic efficiency and support the development of new predictive biomarkers, potentially informing treatment options based on individual tumor invasiveness patterns.

Future Prospects

This research opens the way for further work on weakly supervised learning in medical imaging, especially in fields that rely heavily on histological data. Future work could extend these models by training on more diverse datasets or by combining MIL with attention-based or transformer-based architectures to obtain finer-grained segmentation. Integrating domain knowledge into the network architecture and training process could also improve model interpretability and performance.

In conclusion, this paper illustrates the considerable potential for weakly supervised learning to impact the field of digital histopathology significantly, addressing key barriers to clinical translation while providing a flexible and scalable framework for automatic tumor segmentation.

GitHub

GitHub - MarvinLer/tcga_segmentation: Whole Slide Image segmentation with weakly supervised multiple instance learning on TCGA | MICCAI2020 https://arxiv.org/abs/2004.05024 (124 stars)