
Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology (2408.00738v3)

Published 1 Aug 2024 in cs.CV

Abstract: Foundation models are rapidly being developed for computational pathology applications. However, it remains an open question which factors are most important for downstream performance with data scale and diversity, model size, and training algorithm all playing a role. In this work, we propose algorithmic modifications, tailored for pathology, and we present the result of scaling both data and model size, surpassing previous studies in both dimensions. We introduce three new models: Virchow2, a 632 million parameter vision transformer, Virchow2G, a 1.9 billion parameter vision transformer, and Virchow2G Mini, a 22 million parameter distillation of Virchow2G, each trained with 3.1 million histopathology whole slide images, with diverse tissues, originating institutions, and stains. We achieve state of the art performance on 12 tile-level tasks, as compared to the top performing competing models. Our results suggest that data diversity and domain-specific methods can outperform models that only scale in the number of parameters, but, on average, performance benefits from the combination of domain-specific methods, data scale, and model scale.

Authors (14)
  1. Eric Zimmermann
  2. Eugene Vorontsov
  3. Julian Viret
  4. Adam Casson
  5. Michal Zelechowski
  6. George Shaikovski
  7. Neil Tenenholtz
  8. James Hall
  9. Thomas Fuchs
  10. Nicolo Fusi
  11. Siqi Liu
  12. Kristen Severson
  13. David Klimstra
  14. Razik Yousfi
Citations (9)

Summary

Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology

Eric Zimmermann et al. present an advance in computational pathology with the Virchow2 family of foundation models: Virchow2, Virchow2G, and a distilled Virchow2G Mini. The work combines increases in data scale and model size with domain-specific training adaptations within a self-supervised learning framework.

Overview of the Models and Methodologies

Virchow2 and Virchow2G are vision transformers (ViTs) tailored for computational pathology. Virchow2, with 632 million parameters (ViT-H), and Virchow2G, extending to 1.9 billion parameters (ViT-G), were trained on an expansive dataset of 3.1 million whole slide images (WSIs). This dataset is notable for its scale and diversity, spanning tissue types and contributing institutions worldwide and covering multiple staining techniques, including hematoxylin and eosin (H&E) and immunohistochemistry (IHC).
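
As a usage sketch: assuming the weights are distributed on the Hugging Face Hub (the hub path, SwiGLU configuration, and token layout below follow the pattern of prior Virchow releases and should be checked against the official model card), tile embeddings can be extracted with timm:

```python
import timm
import torch
from PIL import Image
from timm.data import create_transform, resolve_data_config
from timm.layers import SwiGLUPacked

# Hypothetical hub path shown for illustration; check the official
# release for the exact model identifier and token layout.
model = timm.create_model(
    "hf-hub:paige-ai/Virchow2",
    pretrained=True,
    mlp_layer=SwiGLUPacked,   # SwiGLU MLP blocks, as in prior Virchow releases
    act_layer=torch.nn.SiLU,
).eval()

transform = create_transform(**resolve_data_config({}, model=model))
tile = transform(Image.open("tile.png").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    tokens = model.forward_features(tile)  # 1 x N x 1280, class token first

cls_token = tokens[:, 0]
patch_mean = tokens[:, 1:].mean(dim=1)  # if register tokens are present,
                                        # exclude them from this mean
embedding = torch.cat([cls_token, patch_mean], dim=-1)  # 1 x 2560 tile embedding
```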

Key Contributions and Domain-Specific Modifications

To train models at this scale efficiently, the authors propose domain-specific modifications to the DINOv2 training algorithm, emphasizing pathology-specific data augmentations and regularization techniques. These adjustments, each illustrated with a code sketch after the list, include:

  • Extended-Context Translation (ECT): A geometric augmentation that preserves cellular morphology by translating a fixed-size crop window within a larger context region, avoiding the scale distortions introduced by traditional crop-and-resize.
  • Kernel Density Estimator (KDE) Regularization: Replacing the KoLeo regularizer with a KDE-based term that encourages feature diversity without the instability KoLeo exhibits when features are nearly identical (its log nearest-neighbor distance diverges as feature distances approach zero).
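
The ECT idea can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: a fixed-size view is sampled by translation inside a larger extended-context tile, so no rescaling is applied and apparent magnification is preserved.

```python
import torch

def extended_context_translation(tile: torch.Tensor, view_size: int) -> torch.Tensor:
    """Sample a view_size x view_size crop from a larger tile by translation only.

    Unlike random-resized-crop, no rescaling is applied, so cell size
    (apparent magnification) in the view matches the source tile.
    tile: C x H x W image tensor with H, W >= view_size.
    """
    _, h, w = tile.shape
    top = int(torch.randint(0, h - view_size + 1, (1,)))
    left = int(torch.randint(0, w - view_size + 1, (1,)))
    return tile[:, top:top + view_size, left:left + view_size]

# e.g., global and local views cut from one extended-context tile,
# all at the same physical scale (sizes here are illustrative):
# global_view = extended_context_translation(tile, 224)
# local_view = extended_context_translation(tile, 98)
```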

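A hedged sketch of a KDE-style diversity regularizer follows; the Gaussian kernel and bandwidth are assumptions consistent with the paper's description rather than its exact formulation. It estimates the density of the L2-normalized embeddings and penalizes high average log-density, which spreads features apart without KoLeo's nearest-neighbor singularity.

```python
import math
import torch
import torch.nn.functional as F

def kde_regularizer(z: torch.Tensor, bandwidth: float = 0.5) -> torch.Tensor:
    """KDE-based diversity penalty on a batch of embeddings (n x d).

    Returns the mean log kernel density of the L2-normalized features;
    minimizing it pushes features apart. Unlike KoLeo's log nearest-neighbor
    distance, it stays finite and smooth even when two features coincide.
    """
    z = F.normalize(z, dim=-1)
    sq_dists = torch.cdist(z, z).pow(2)            # n x n pairwise squared distances
    log_kernel = -sq_dists / (2 * bandwidth ** 2)  # log Gaussian kernel values
    # Log of the mean kernel density at each point; the self term on the
    # diagonal is a constant and contributes no gradient between points.
    log_density = torch.logsumexp(log_kernel, dim=1) - math.log(z.shape[0])
    return log_density.mean()

# Added to the self-supervised objective with a small weight, e.g.
# loss = dino_loss + 0.1 * kde_regularizer(student_features)
```
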
Evaluations and Performance Metrics

The models underwent rigorous evaluation on twelve tile-level tasks, both in-domain and out-of-domain, achieving state-of-the-art performance. Particularly strong results were seen in tasks such as PanMSK (at multiple magnifications), PCam, MHIST, CRC, and MIDOG, where the Virchow2 models consistently outperformed competing models.
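
Benchmarks of this kind are typically run by fitting a lightweight classifier on frozen tile embeddings. Below is a minimal linear-probing sketch, assuming a logistic-regression probe rather than the paper's exact protocol:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def linear_probe_weighted_f1(train_emb, train_y, test_emb, test_y):
    """Fit a linear classifier on frozen tile embeddings; report weighted F1."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(train_emb, train_y)
    preds = clf.predict(test_emb)
    # 'weighted' averages per-class F1 by class support, matching the
    # average weighted F1 metric reported for the tile-level benchmarks.
    return f1_score(test_y, preds, average="weighted")
```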

In-Distribution Tile-Level Benchmarks:

  • Virchow2 significantly improved the average weighted F1 score from 0.944 (Virchow) to 0.966.
  • Virchow2G further enhanced this average to 0.971, highlighting the scalability benefits of increasing the model parameters.

Out-of-Distribution Tile-Level Benchmarks:

  • Virchow2 increased the average weighted F1 score from 0.877 (Virchow) to 0.885.
  • Virchow2G further improved the score to 0.894, demonstrating robust generalization capabilities across various tasks.

Implications and Future Directions

These results underscore the potential of scaling both data and model size in computational pathology. The authors highlight the critical role of domain-specific adaptations, which can yield substantial performance gains even at smaller scales. The success of ECT and KDE regularization in particular shows promise for future applications of self-supervised learning in similar high-dimensional medical imaging domains.

Theoretical implications include reinforcing the importance of tailored augmentation strategies and regularization techniques in self-supervised learning, particularly in domains with inherently high redundancy and unique morphological features. Practically, the advancements made by Virchow models could pave the way for more robust and accurate diagnostic tools in pathology, potentially aiding in tasks such as disease subtyping, biomarker quantification, and survival prediction.

Moving forward, further exploration into model architectures and training methodologies tailored to specific pathology subdomains could yield even more refined models. Additionally, expanding the training dataset to include a broader range of tissue types and staining techniques could further enhance the generalizability and performance of such foundation models.

Conclusion

Eric Zimmermann and colleagues have made significant strides in computational pathology with Virchow2 and Virchow2G. By scaling data and model parameters and introducing pathology-specific training adaptations, they have set a new standard for tile-level task performance. These advancements underscore the ongoing potential of model and data scaling to further the efficacy and application range of foundation models in pathology.