- The paper introduces a Variational Information Bottleneck (VIB) framework for efficient, task-specific fine-tuning of models on gigapixel pathology Whole Slide Images (WSIs) using weak supervision.
- A key methodology involves learning a Bernoulli latent variable mask to reduce the number of WSI instances from over 10,000 to approximately 1,000, significantly lowering computational costs.
- Evaluated on multiple datasets, the proposed method improved accuracy and generalization compared to traditional MIL methods, demonstrating better integration with self-supervised learning and reduced annotation requirements.
Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification
The paper "Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification" addresses the central challenges of pathology WSI classification: computational efficiency and representation quality. It introduces a fine-tuning framework built on Variational Information Bottleneck (VIB) theory that learns task-specific representations of WSIs, whose gigapixel resolution otherwise makes end-to-end training computationally prohibitive.
Overview
Whole Slide Images (WSIs) in digital pathology are difficult to compute over and annotate because of their massive size. Traditional Multiple Instance Learning (MIL) approaches manage these issues by treating each slide as a bag of patch instances and training against weak, slide-level labels. However, existing methods typically rely on feature extractors pretrained on natural images such as ImageNet, which may not suit the WSI domain, losing crucial information and generalizing suboptimally.
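As context for the MIL aggregation step, here is a minimal NumPy sketch of attention-based pooling (in the style of Ilse et al.'s ABMIL, one common MIL aggregator; the projection matrices here are random stand-ins, not trained weights from this paper):

```python
import numpy as np

def attention_mil_pool(features, V, w):
    """Aggregate N instance features into one bag embedding via
    attention pooling. features: (N, D) patch embeddings;
    V: (D, H) attention projection; w: (H,) attention vector."""
    scores = np.tanh(features @ V) @ w       # (N,) unnormalized attention
    alpha = np.exp(scores - scores.max())    # stable softmax over instances
    alpha = alpha / alpha.sum()
    return alpha @ features, alpha           # (D,) bag embedding, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(10000, 384))        # e.g. ~10k patches in one WSI
V = rng.normal(size=(384, 128)) * 0.01       # illustrative random parameters
w = rng.normal(size=(128,))
bag, alpha = attention_mil_pool(feats, V, w)
```

Only the slide-level label supervises the bag embedding; the attention weights `alpha` are what a selection mechanism like the paper's IB module can sharpen.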
The authors propose a fine-tuning approach based on the VIB principle to address these issues, deriving task-specific representations while preserving computational efficiency. The VIB framework selectively compresses WSI data to distill task-relevant features, making end-to-end training of the model feasible within practical resource budgets.
Methodology
- Information Bottleneck Framework: The method applies VIB theory to retain information predictive of the slide label while filtering out task-irrelevant content, yielding an efficient fine-tuning process that sharpens task-specific feature extraction.
- Instance Masking: WSI instances are distilled through a learned Bernoulli latent-variable mask that sparsifies the input based on feature relevance, reducing the number of instances from over 10,000 to approximately 1,000 per WSI and drastically lowering the computational load.
- Pipeline Configuration: The method involves a three-stage process:
- Stage 1 utilizes the IB module to identify and retain pertinent instances.
- Stage 2 performs end-to-end fine-tuning on the identified sets, further refining the specificity of the representations.
- Stage 3 combines all fine-tuned features to train the classification model.
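The Information Bottleneck bullet above follows the standard variational IB objective (Alemi et al., 2017); the notation below is adapted for this summary and is not copied from the paper:

```latex
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{x,y}\,\mathbb{E}_{z \sim p_\theta(z \mid x)}
      \big[ -\log q_\phi(y \mid z) \big]
  + \beta \, \mathrm{KL}\big( p_\theta(z \mid x) \,\|\, r(z) \big)
```

The first term keeps the compressed representation $z$ predictive of the label $y$; the KL term compresses $z$ toward a prior $r(z)$, with $\beta$ trading prediction against compression.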
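Learning a discrete Bernoulli mask requires a differentiable surrogate; a common choice is the binary-Concrete (Gumbel-sigmoid) relaxation sketched below. This is an illustrative forward pass assuming that relaxation, not the paper's exact parameterization:

```python
import numpy as np

def sample_relaxed_bernoulli(logits, temperature=0.5, rng=None):
    """Binary-Concrete relaxation of a Bernoulli mask (Maddison et al., 2017).
    logits: (N,) per-instance keep logits; returns a soft mask in [0, 1]
    that hardens toward {0, 1} as temperature decreases."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-8, 1 - 1e-8, size=logits.shape)
    g = np.log(u) - np.log(1 - u)            # Logistic(0, 1) noise
    return 1.0 / (1.0 + np.exp(-(logits + g) / temperature))

rng = np.random.default_rng(0)
logits = rng.normal(size=10000)              # hypothetical per-patch relevance
mask = sample_relaxed_bernoulli(logits, rng=rng)
keep = np.argsort(mask)[-1000:]              # retain ~1,000 of 10,000 instances
```

At training time gradients flow through the soft mask; at inference the mask can be thresholded or top-k selected, as in the `keep` line.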
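The three stages above can be sketched end to end with toy stand-ins (all functions, shapes, and the recentering "fine-tuning" step are illustrative, not from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def score_patches(feats, w):
    """Stage 1 stand-in: per-instance relevance scores (real method: IB module)."""
    return feats @ w

def finetune(feats, lr=0.1):
    """Stage 2 stand-in: a placeholder refinement of the kept features
    (real method: end-to-end fine-tuning of the patch encoder)."""
    return feats - lr * feats.mean(axis=0)

feats = rng.normal(size=(10000, 64))     # all patch features of one WSI
w = rng.normal(size=64)                  # hypothetical scoring weights

kept = feats[np.argsort(score_patches(feats, w))[-1000:]]  # Stage 1: keep top-1k
refined = finetune(kept)                                   # Stage 2: refine
bag_embedding = refined.mean(axis=0)                       # Stage 3: pool, then classify
```

The point of the structure is that Stage 2 only ever touches the ~1,000 kept instances, which is what makes end-to-end fine-tuning tractable.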
Experimental Results
The proposed model was evaluated on five WSI classification settings drawn from three datasets, Camelyon-16, TCGA-BRCA, and LBP-CECA, plus two domain-shifted variations to test robustness. Across all settings, the framework improved both accuracy and generalization. The fine-tuned model also integrated well with Self-Supervised Learning (SSL) pretraining, surpassing standard MIL architectures while requiring substantially fewer annotations.
Implications and Future Work
This work significantly advances computational pathology by presenting a scalable solution that optimizes fine-tuning on WSIs, suggesting broad applicability across various domains requiring nuanced image analysis. The combination of VIB with SSL indicates a promising direction for further exploration in task-oriented feature optimization. Future research could delve into the broader application of this method across other large-scale data tasks and the potential integration with emerging deep learning technologies.
In conclusion, the paper provides a solid foundation for enhancing pathology image analysis through task-specific fine-tuning, paving the way for more efficient and accurate computational diagnostic methods. The conceptual and technical insights from this research will likely influence future developments in AI-driven pathology.