Towards Large-Scale Training of Pathology Foundation Models (2404.15217v1)
Abstract: Driven by recent advances in deep learning methods and, in particular, by the development of modern self-supervised learning algorithms, increasing interest and effort have been devoted to building foundation models (FMs) for medical images. In this work, we present our scalable training pipeline for large pathology imaging data, and a comprehensive analysis of various hyperparameter choices and training techniques for building pathology FMs. We release and make publicly available the first batch of our pathology FMs (https://github.com/kaiko-ai/towards_large_pathology_fms) trained on open-access TCGA whole slide images, a commonly used collection of pathology images. The experimental evaluation shows that our models reach state-of-the-art performance on various patch-level downstream tasks, ranging from breast cancer subtyping to colorectal nuclear segmentation. Finally, to unify the evaluation approaches used in the field and to simplify future comparisons of different FMs, we present an open-source framework (https://github.com/kaiko-ai/eva) designed for the consistent evaluation of pathology FMs across various downstream tasks.
Authors:
- kaiko.ai
- Nanne Aben
- Edwin D. de Jong
- Ioannis Gatopoulos
- Nicolas Känzig
- Mikhail Karasikov
- Axel Lagré
- Roman Moser
- Joost van Doorn
- Fei Tang