TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification (2106.00908v2)

Published 2 Jun 2021 in cs.CV

Abstract: Multiple instance learning (MIL) is a powerful tool to solve the weakly supervised classification in whole slide image (WSI) based pathology diagnosis. However, the current MIL methods are usually based on independent and identical distribution hypothesis, thus neglect the correlation among different instances. To address this problem, we proposed a new framework, called correlated MIL, and provided a proof for convergence. Based on this framework, we devised a Transformer based MIL (TransMIL), which explored both morphological and spatial information. The proposed TransMIL can effectively deal with unbalanced/balanced and binary/multiple classification with great visualization and interpretability. We conducted various experiments for three different computational pathology problems and achieved better performance and faster convergence compared with state-of-the-art methods. The test AUC for the binary tumor classification can be up to 93.09% over CAMELYON16 dataset. And the AUC over the cancer subtypes classification can be up to 96.03% and 98.82% over TCGA-NSCLC dataset and TCGA-RCC dataset, respectively. Implementation is available at: https://github.com/szc19990412/TransMIL.

Transformer-Based Correlated Multiple Instance Learning for Whole Slide Image Classification

The paper "TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification" proposes an approach to whole slide image (WSI) classification that uses a Transformer within a correlated Multiple Instance Learning (MIL) framework. The method aims to address challenges inherent in digital pathology, such as the very large size of WSIs and the absence of pixel-level annotations.

Methodology

The researchers propose a novel framework called correlated MIL, which challenges the conventional independent and identical distribution (i.i.d.) assumption implicit in many MIL methods. By identifying and exploiting correlations between different instances within a bag, the method aligns with the practices of pathologists, who utilize both contextual and correlation information in diagnosis.

Central to the approach is the Transformer-based MIL model, TransMIL, which incorporates both morphological and spatial information across instances. Self-attention lets every instance attend to every other instance in the bag, so TransMIL can model correlations between instances. This addresses a limitation of conventional attention-based pooling, which scores instances independently and tends to concentrate on a few high-scoring ones.
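To make the idea of instance-to-instance attention concrete, here is a minimal sketch of single-head scaled dot-product self-attention over a bag of instance embeddings. It is not the authors' implementation: the function names are hypothetical, the learned query/key/value projections are replaced by the identity, and the bag-level classification head is left out.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention over a bag of instance embeddings.

    Assumption: identity Q/K/V projections (no learned weights), so this
    only illustrates how each instance mixes in information from the others.
    Returns (updated_tokens, attention_weights).
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    updated, weights = [], []
    for query in tokens:
        # Similarity of this instance to every instance in the bag.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # New embedding = attention-weighted average of all instances.
        updated.append(
            [sum(row[j] * tokens[j][c] for j in range(len(tokens))) for c in range(dim)]
        )
    return updated, weights


bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy patch embeddings
new_bag, attn = self_attention(bag)
```

Each row of `attn` sums to 1 and describes how strongly one instance draws on the others, which is the pairwise information that independent per-instance scoring does not capture.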

Furthermore, the paper introduces a Pyramid Position Encoding Generator (PPEG). This component enhances the model's ability to encode spatial information, allowing the exploration of positional relationships between instances at multiple granularities.
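The summary does not describe PPEG's internals, so the following is only a sketch of the general idea of position encoding at several granularities. It assumes the instance tokens can be laid out on a square grid, and it swaps in fixed mean filters of sizes 3, 5, and 7 where a trained model would presumably use learned convolutions. The function names and kernel sizes are illustrative assumptions, not taken from the text above.

```python
def box_filter(grid, k):
    """Zero-padded k x k mean filter on a 2D grid of floats.

    Stand-in (assumption) for a learned per-channel convolution.
    """
    height, width = len(grid), len(grid[0])
    radius = k // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < height and 0 <= x < width:
                        acc += grid[y][x]
            out[i][j] = acc / (k * k)
    return out


def pyramid_position_encoding(tokens, side, kernel_sizes=(3, 5, 7)):
    """Add multi-scale neighbourhood context to tokens on a side x side grid.

    tokens: list of side*side feature vectors in row-major order.
    Output = identity branch + one filtered branch per kernel size.
    """
    if len(tokens) != side * side:
        raise ValueError("expected side*side tokens")
    dim = len(tokens[0])
    out = [list(t) for t in tokens]  # identity branch
    for c in range(dim):
        # View one feature channel as a 2D map over the grid.
        grid = [[tokens[i * side + j][c] for j in range(side)] for i in range(side)]
        for k in kernel_sizes:
            filtered = box_filter(grid, k)
            for i in range(side):
                for j in range(side):
                    out[i * side + j][c] += filtered[i][j]
    return out


# A single active token in the middle of a 3 x 3 grid.
tokens = [[0.0] for _ in range(9)]
tokens[4] = [1.0]
encoded = pyramid_position_encoding(tokens, side=3)
```

After encoding, the corner tokens carry a nonzero value because each filter spreads the centre token's feature to its neighbours. Small kernels capture local context and large kernels capture wider context, which is one way to read "multiple granularities".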

Experimental Findings

Experiments were conducted on three datasets: CAMELYON16, TCGA-NSCLC, and TCGA-RCC. TransMIL outperformed the traditional and state-of-the-art methods it was compared against, and the authors also report faster convergence. Reported test AUCs reach up to 93.09% for binary tumor classification on CAMELYON16, and up to 96.03% and 98.82% for cancer subtype classification on TCGA-NSCLC and TCGA-RCC, respectively. These results support the value of modeling correlations between instances within the MIL framework.
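For readers less familiar with the metric, AUC is the probability that a randomly chosen positive slide receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are made up for illustration and are unrelated to the reported results.

```python
def roc_auc(labels, scores):
    """Binary ROC AUC via pairwise comparison; ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This quadratic version is fine for a handful of slides; for large evaluation sets a rank-based computation is the usual choice.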

Implications and Future Directions

The research has implications for both theory and practice in digital pathology. By improving interpretability and performance in WSI classification, TransMIL presents a compelling alternative to existing methods. As larger WSIs become more common, future work may need to address computational efficiency further, particularly memory usage for very large slides scanned at high magnification.

The correlated MIL framework also opens avenues for applying this approach to other domains where instances within a bag might exhibit complex interdependencies. This could include applications in areas like video analysis and other sequence-oriented deep learning tasks.

In conclusion, TransMIL represents a substantial contribution to the use of Transformers within MIL contexts, effectively bridging a gap between pathology practices and machine learning techniques. Its success suggests a promising direction for future research and deployment in real-world diagnostic applications.

Authors (7)

Zhuchen Shao
Hao Bian
Yang Chen
Yifeng Wang
Jian Zhang
Xiangyang Ji
Yongbing Zhang
