
SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images (2505.06710v1)

Published 10 May 2025 in cs.CV

Abstract: Various multi-instance learning (MIL) based approaches have been developed and successfully applied to whole-slide pathological images (WSI). Existing MIL methods emphasize the importance of feature aggregators but largely neglect instance-level representation learning. They assume that a pre-trained feature extractor is available to be directly utilized or fine-tuned, which is not always the case. This paper proposes to pre-train the feature extractor for MIL via a weakly supervised scheme, i.e., propagating the weak bag-level labels to the corresponding instances for supervised learning. To learn effective features for MIL, we further delve into several key components, including strong data augmentation, a non-linear prediction head, and a robust loss function. We conduct experiments on common large-scale WSI datasets and find that it achieves better performance than other pre-training schemes (e.g., ImageNet pre-training and self-supervised learning) on different downstream tasks. We further show the compatibility and scalability of the proposed scheme by deploying it to fine-tune pathology-specific models and to pre-train on multiple merged datasets. To our knowledge, this is the first work focusing on representation learning for MIL.

Summary


The paper presents SimMIL, a universal framework aimed at enhancing the representation learning of multi-instance learning (MIL) in the analysis of whole-slide pathological images (WSI). Traditional approaches in MIL focus on feature aggregators and insufficiently address representation learning at the instance level, relying heavily on previously trained feature extractors. SimMIL proposes a novel weakly supervised pre-training methodology that assigns bag-level labels to instances, facilitating supervised learning and addressing the inadequacies of existing frameworks.
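The label-propagation step at the core of this scheme can be sketched as follows: every patch (instance) simply inherits the weak label of the slide (bag) it came from, yielding a flat instance-level dataset for ordinary supervised training. The function name and data layout below are illustrative, not the paper's actual code.

```python
def propagate_bag_labels(bags):
    """bags: list of (bag_label, [instance, ...]) pairs.
    Returns a flat list of (instance, label) pairs for supervised training."""
    instance_dataset = []
    for bag_label, instances in bags:
        for inst in instances:
            # Weak supervision: each instance is assumed to share its bag's
            # label. This is noisy (e.g. benign patches inside a malignant
            # slide), which is why a robust loss is needed downstream.
            instance_dataset.append((inst, bag_label))
    return instance_dataset

# Usage: two toy "slides", each a bag of patch identifiers.
bags = [(1, ["patch_a", "patch_b"]), (0, ["patch_c"])]
flat = propagate_bag_labels(bags)
```

Note that the resulting labels are deliberately noisy; the framework's remaining components exist largely to make supervised learning tolerate that noise.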

SimMIL introduces several components to improve MIL's representation quality: strong data augmentation, a non-linear prediction head, and a robust loss function. Experiments on standard large-scale WSI datasets show that SimMIL pre-training outperforms existing schemes, such as ImageNet pre-training and self-supervised learning, across multiple downstream MIL tasks. The paper validates the scheme's compatibility and scalability by fine-tuning pathology-specific models and pre-training on merged datasets, underscoring the value of pre-training schemes tailored to MIL.
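Two of those components can be sketched concretely. Below, the non-linear head is a minimal Linear-ReLU-Linear stack, and the robust loss is the generalized cross-entropy (GCE), L_q = (1 - p_y^q) / q, a standard noise-tolerant choice that interpolates between cross-entropy (q → 0) and MAE (q = 1). Both are assumptions for illustration; the paper's exact head architecture and loss may differ.

```python
import numpy as np

def mlp_head(features, w1, w2):
    """Non-linear prediction head: Linear -> ReLU -> Linear.
    Weight shapes are illustrative; returns class logits."""
    h = np.maximum(features @ w1, 0.0)  # ReLU
    return h @ w2

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def generalized_cross_entropy(probs, labels, q=0.7):
    """GCE loss, robust to the label noise that bag-to-instance
    propagation introduces. probs: (N, C) softmax outputs;
    labels: (N,) integer class indices."""
    p_y = probs[np.arange(len(labels)), labels]
    return float(np.mean((1.0 - p_y ** q) / q))
```

With q = 1 the loss reduces to 1 - p_y, so confidently mislabeled instances contribute a bounded penalty instead of the unbounded one that plain cross-entropy would assign.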

Key results demonstrate that SimMIL pre-training improves bag-level classification in both benign-malignant classification and cancer-subtyping tasks, surpassing traditional pre-training models. This is backed by numerical results showing superior accuracy and AUC scores across various aggregation networks in downstream MIL tasks. The paper also explores survival prediction, further indicating the broad applicability and efficacy of the SimMIL framework.
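The aggregation networks mentioned above turn per-instance features into a single bag-level prediction. One widely used family is attention-based pooling (ABMIL-style), sketched below; whether it is among the aggregators the paper benchmarks is an assumption here, and the weight shapes are illustrative.

```python
import numpy as np

def attention_pool(instance_feats, w_att, v_att):
    """ABMIL-style attention pooling: score each instance with a small
    tanh attention network, softmax the scores over the bag, and return
    the attention-weighted average as the bag-level representation.
    instance_feats: (n_instances, feat_dim)."""
    scores = np.tanh(instance_feats @ w_att) @ v_att  # (n_instances,)
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()                 # softmax over the bag
    return weights @ instance_feats                   # (feat_dim,)
```

A linear classifier on top of this pooled vector then yields the bag-level prediction; the quality of `instance_feats`, which is exactly what SimMIL's pre-training targets, determines how well any such aggregator can do.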

Moreover, SimMIL is compatible with existing self-supervised learning methods: further fine-tuning self-supervised models with SimMIL improves performance, suggesting it can expedite computational WSI analysis. Its scalability is underscored by experiments showing that larger pre-training corpora built by merging datasets from different sources improve downstream performance, indicating a promising avenue for universal application across pathology subtypes.

The SimMIL framework's weak-supervision scheme, paired with its robust loss, copes with the label noise that bag-to-instance propagation introduces and with the challenges inherent to the high resolution and complex nature of WSIs, potentially enabling more robust and efficient computational pathology solutions. Future research could implement additional task-specific pretext tasks within the SimMIL framework to further broaden its practical applications in WSI analysis.

In summary, the paper significantly contributes to advancing the field of computational pathology by proposing and validating a reproducible framework that enriches instance-level representation learning in MIL. The implications of SimMIL are substantial and invite further exploration into its integration with various machine learning paradigms and broader applications in medical imaging.