CellViT: Vision Transformers for Precise Cell Segmentation and Classification (2306.15350v2)
Abstract: Nuclei detection and segmentation in hematoxylin and eosin-stained (H&E) tissue images are important clinical tasks and crucial for a wide range of applications. However, the task is challenging because nuclei vary in staining and size, have overlapping boundaries, and form clusters. While convolutional neural networks have been extensively used for this task, we explore the potential of Transformer-based networks in this domain. We therefore introduce CellViT, a new method for automated instance segmentation of cell nuclei in digitized tissue samples that uses a deep learning architecture based on the Vision Transformer. CellViT is trained and evaluated on the PanNuke dataset, one of the most challenging nuclei instance segmentation datasets, which consists of nearly 200,000 annotated nuclei categorized into 5 clinically important classes across 19 tissue types. We demonstrate the superiority of large-scale in-domain and out-of-domain pre-trained Vision Transformers by leveraging the recently published Segment Anything Model and a ViT encoder pre-trained on 104 million histological image patches, achieving state-of-the-art nuclei detection and instance segmentation performance on the PanNuke dataset with a mean panoptic quality of 0.50 and an F1-detection score of 0.83. The code is publicly available at https://github.com/TIO-IKIM/CellViT
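The abstract reports results as panoptic quality (PQ). As a rough illustration of what that number measures, the sketch below computes PQ for a single pair of instance maps using the standard definition: the summed IoU of matched instances divided by TP + 0.5 FP + 0.5 FN, where a ground-truth and a predicted instance match when their IoU exceeds 0.5. The function name `panoptic_quality`, the label-map convention (0 = background, positive integers = instance ids), and the toy arrays are assumptions made for this example; this is not the CellViT evaluation code, and the F1-detection score quoted in the abstract may rely on a different matching rule than the IoU-based one used here.

```python
import numpy as np


def panoptic_quality(gt, pred, iou_threshold=0.5):
    """Compute PQ and its two factors for one pair of instance label maps.

    gt, pred: integer arrays of equal shape; 0 is background and every
    positive value is one instance id (a convention assumed for this sketch).
    """
    gt_ids = [int(i) for i in np.unique(gt) if i != 0]
    pred_ids = [int(i) for i in np.unique(pred) if i != 0]

    matched_ious = []
    matched_pred = set()
    for g in gt_ids:
        g_mask = gt == g
        # Only predictions that overlap this ground-truth instance can match it.
        for p in np.unique(pred[g_mask]):
            p = int(p)
            if p == 0 or p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou = inter / union
            # With a threshold of 0.5 or more, at most one prediction can match.
            if iou > iou_threshold:
                matched_ious.append(float(iou))
                matched_pred.add(p)
                break

    tp = len(matched_ious)
    fp = len(pred_ids) - tp  # predictions without a ground-truth match
    fn = len(gt_ids) - tp    # ground-truth instances that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    if denom == 0:
        return {"pq": 0.0, "dq": 0.0, "sq": 0.0}

    dq = tp / denom                              # detection quality (F1-style)
    sq = sum(matched_ious) / tp if tp else 0.0   # mean IoU of matched pairs
    return {"pq": dq * sq, "dq": dq, "sq": sq}


# Toy example: one nucleus found exactly, one missed, one spurious detection.
gt = np.zeros((4, 4), dtype=int)
gt[0:2, 0:2] = 1
gt[2:4, 2:4] = 2
pred = np.zeros((4, 4), dtype=int)
pred[0:2, 0:2] = 1
pred[3, 0] = 2
scores = panoptic_quality(gt, pred)
```

PQ factors into a detection term (how many instances were found) and a segmentation term (how well the found instances overlap), so a model can lose PQ either by missing nuclei or by drawing imprecise boundaries around the ones it finds.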