AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images (2303.00865v2)
Abstract: Processing giga-pixel whole slide histopathology images (WSIs) is computationally expensive. Multiple instance learning (MIL) has become the conventional approach to processing WSIs: each image is split into smaller patches that are processed separately. However, MIL-based techniques ignore explicit information about the individual cells within a patch. In this paper, by defining the novel concept of shared-context processing, we design a multi-modal Graph Transformer (AMIGO) that uses the cellular graph within the tissue to provide a single representation for a patient while taking advantage of the hierarchical structure of the tissue, enabling a dynamic focus between cell-level and tissue-level information. We benchmark our model against multiple state-of-the-art methods in survival prediction and show that it significantly outperforms all of them, including the hierarchical Vision Transformer (ViT). More importantly, we show that our model is robust to missing information, to the extent that it achieves the same performance with as little as 20% of the data. Finally, on two different cancer datasets, we demonstrate that our model stratifies patients into low-risk and high-risk groups, while other state-of-the-art methods fail to do so. We also publish a large dataset of immunohistochemistry images (InUIT) containing 1,600 tissue microarray (TMA) cores from 188 patients along with their survival information, making it one of the largest publicly available datasets in this context.
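To make the evaluation setting concrete, the following is a minimal sketch of how survival predictions are commonly scored and how patients can be split into risk groups. The abstract does not specify the metric or the cutoff, so both choices here are assumptions: Harrell's concordance index as the score and a median split of the predicted risk as the stratification rule. The function names and the toy data are hypothetical and are not taken from the AMIGO code.

```python
from statistics import median


def concordance_index(times, events, risks):
    """Harrell's C-index (assumed metric, not confirmed by the abstract).

    A pair (i, j) is comparable when patient i has the shorter follow-up
    time and an observed event (events[i] == 1). The pair is concordant
    when patient i also has the higher predicted risk. Ties in predicted
    risk count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def stratify_by_median(risks):
    """Label each patient 'high' or 'low' risk using a median cutoff
    (assumed rule, chosen only for illustration)."""
    cutoff = median(risks)
    return ["high" if r > cutoff else "low" for r in risks]


# Toy data (fabricated for illustration only, not from any dataset).
times = [5, 10, 12, 20]        # follow-up time per patient
events = [1, 1, 0, 1]          # 1 = event observed, 0 = censored
risks = [0.9, 0.7, 0.4, 0.1]   # hypothetical model risk scores

c_index = concordance_index(times, events, risks)
groups = stratify_by_median(risks)
print(c_index, groups)
```

In practice, whether the two resulting groups have significantly different survival would be checked with a statistical test such as the logrank test, which is outside the scope of this sketch.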