Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification (1504.07947v5)

Published 29 Apr 2015 in cs.CV

Abstract: Convolutional Neural Networks (CNN) are state-of-the-art models for many image classification tasks. However, to recognize cancer subtypes automatically, training a CNN on gigapixel resolution Whole Slide Tissue Images (WSI) is currently computationally impossible. The differentiation of cancer subtypes is based on cellular-level visual features observed on image patch scale. Therefore, we argue that in this situation, training a patch-level classifier on image patches will perform better than or similar to an image-level classifier. The challenge becomes how to intelligently combine patch-level classification results and model the fact that not all patches will be discriminative. We propose to train a decision fusion model to aggregate patch-level predictions given by patch-level CNNs, which to the best of our knowledge has not been shown before. Furthermore, we formulate a novel Expectation-Maximization (EM) based method that automatically locates discriminative patches robustly by utilizing the spatial relationships of patches. We apply our method to the classification of glioma and non-small-cell lung carcinoma cases into subtypes. The classification accuracy of our method is similar to the inter-observer agreement between pathologists. Although it is impossible to train CNNs on WSIs, we experimentally demonstrate using a comparable non-cancer dataset of smaller images that a patch-based CNN can outperform an image-based CNN.

Citations (745)

Summary

  • The paper introduces a two-level patch-based CNN approach that uses an EM algorithm to identify discriminative tissue patches and a decision fusion model to aggregate their predictions.
  • It achieves high classification accuracy, notably 97% for distinguishing glioblastoma from low-grade gliomas and promising results on NSCLC tasks.
  • The approach offers a scalable, efficient solution for gigapixel WSI classification, potentially enhancing automated pathological diagnostics.

Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification

Overview

The paper by Hou et al. presents a novel methodology for classifying Whole Slide Tissue Images (WSIs) utilizing a patch-based Convolutional Neural Network (CNN) approach. Given the computational constraints associated with applying a CNN directly to gigapixel resolution WSIs, the authors advocate for an innovative two-level model. This approach first applies CNNs to high-resolution patches of the WSI and subsequently aggregates patch-level predictions using a decision fusion model.
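
To make the two-level structure concrete, here is a minimal inference sketch. All of the names (`extract_patches`, `patch_cnn`, `aggregate`, `fusion_model`) are hypothetical placeholders for components the paper describes, not the authors' code.

```python
def classify_slide(wsi, extract_patches, patch_cnn, aggregate, fusion_model):
    """Two-level inference sketch: a patch-level CNN scores every tile, then a
    decision fusion model maps the aggregated patch predictions to a
    slide-level label."""
    patches = extract_patches(wsi)                   # tile the gigapixel slide
    patch_preds = [patch_cnn(p) for p in patches]    # per-patch class probabilities
    slide_features = aggregate(patch_preds)          # e.g. a histogram of predicted patch classes
    return fusion_model.predict([slide_features])[0]
```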

Methodology

The authors highlight two challenges in applying CNNs directly to WSIs: the aggressive downsampling needed to fit a gigapixel image into a CNN discards the cellular-level detail that distinguishes cancer subtypes, and a single slide can contain multiple discriminative patterns that are hard to learn from one image-level label. Their solution is an Expectation-Maximization (EM) based method that identifies discriminative patches and exploits their spatial relationships within the WSI.
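
The EM-style alternation can be sketched roughly as follows. The interfaces here (`cnn.fit`, `cnn.predict_proba`, `spatial_smooth`, the 70% keep fraction) are assumptions for illustration, not the paper's implementation; the spatial smoothing step is sketched in the next subsection.

```python
import numpy as np

def em_train_patch_cnn(patches, slide_labels, cnn, spatial_smooth,
                       n_iters=3, keep_fraction=0.7):
    """Sketch of the EM loop: each patch inherits its slide's label, and a
    boolean mask tracks which patches are currently believed discriminative."""
    discriminative = np.ones(len(patches), dtype=bool)   # start by trusting every patch
    for _ in range(n_iters):
        # M-step: (re)train the patch CNN on the currently discriminative patches
        cnn.fit([p for p, d in zip(patches, discriminative) if d],
                [y for y, d in zip(slide_labels, discriminative) if d])
        # E-step: score each patch by the CNN's confidence in its slide label,
        # smooth the scores spatially, and keep the most confident fraction
        scores = np.array([cnn.predict_proba(p)[y]
                           for p, y in zip(patches, slide_labels)])
        scores = spatial_smooth(scores)                  # neighbouring patches inform each other
        discriminative = scores >= np.quantile(scores, 1.0 - keep_fraction)
    return cnn
```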

Patch-level CNN Training

The method first trains a patch-level CNN, treating each patch as an independent training instance labeled with its slide's label. A hidden binary variable associated with each patch indicates whether that patch is discriminative for the slide-level class. The EM algorithm alternates between training the CNN on the patches currently estimated to be discriminative and re-estimating the hidden variables from the CNN's predictions, with spatial smoothing applied so that neighboring patches refine each other's estimates.
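
One plausible form of the E-step is shown below, assuming the per-patch confidences are laid out on the slide's 2-D patch grid; Gaussian filtering stands in for the spatial smoothing the paper describes, and the keep fraction is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_discriminative_mask(prob_map, sigma=1.0, keep_fraction=0.7):
    """prob_map[i, j] = CNN confidence in the slide label for the patch at
    grid position (i, j). Smooth the map so isolated high/low values are
    regularised by their neighbours, then keep the most confident fraction."""
    smoothed = gaussian_filter(prob_map, sigma=sigma)        # let neighbours vote
    threshold = np.quantile(smoothed, 1.0 - keep_fraction)   # keep the top fraction
    return smoothed >= threshold                             # boolean mask of discriminative patches
```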

Decision Fusion Model

The second level involves aggregating patch-level predictions to make a slide-level classification. This is achieved using either a multi-class logistic regression or a Support Vector Machine (SVM). The authors argue that this decision-level fusion model is more robust than traditional max-pooling or voting methods, especially in capturing the heterogeneity inherent in complex cancer subtypes.
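
A hedged sketch of this fusion level, assuming each slide is summarised by a histogram of its patch-level predicted classes (one natural reading of the decision fusion described above); the function name and normalisation choice are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_fusion_model(patch_probs_per_slide, slide_labels, n_classes):
    """Fit a multi-class logistic regression on per-slide histograms of
    patch-level predictions. An SVM (sklearn.svm.SVC) could be dropped in
    the same way."""
    X = []
    for probs in patch_probs_per_slide:                      # probs: (n_patches, n_classes)
        hist = np.bincount(probs.argmax(axis=1), minlength=n_classes)
        X.append(hist / max(len(probs), 1))                  # normalise by patch count
    fusion = LogisticRegression(max_iter=1000)               # handles multi-class by default
    return fusion.fit(np.asarray(X), slide_labels)
```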

Numerical Results

The proposed model was applied to classify glioma and non-small-cell lung carcinoma (NSCLC) subtypes using the TCGA dataset. In the glioma classification task, the model achieved a classification accuracy of 77%, which is close to the inter-observer agreement of pathologists on similar tasks, reported at around 80%. The model was especially effective in distinguishing glioblastoma (GBM) from low-grade gliomas (LGG), with a notable accuracy of 97%.

For NSCLC classification, the proposed model achieved accuracy comparable to the agreement rates between pulmonary pathology experts (κ = 0.75 for squamous cell carcinoma vs. non-squamous and κ = 0.60 for adenocarcinoma vs. non-adenocarcinoma). Classification of the mixed adenocarcinoma subtype (ADC-mix) further demonstrated the model's capacity to handle heterogeneous cases, where it outperformed the baseline methods.
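
For context on the κ values quoted above, Cohen's kappa measures inter-rater agreement corrected for chance (1.0 is perfect agreement, 0.0 is chance level). The toy example below uses made-up labels, not the paper's data, purely to show how the statistic is computed.

```python
from sklearn.metrics import cohen_kappa_score

# Two hypothetical raters labelling eight NSCLC cases.
# Observed agreement is 6/8 = 0.75, chance agreement is 0.5,
# so kappa = (0.75 - 0.5) / (1 - 0.5) = 0.5.
rater_a = ["SCC", "ADC", "ADC", "SCC", "ADC", "SCC", "ADC", "ADC"]
rater_b = ["SCC", "ADC", "SCC", "SCC", "ADC", "SCC", "ADC", "SCC"]
print(cohen_kappa_score(rater_a, rater_b))  # prints 0.5
```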

To further underscore the efficacy of the patch-based approach, the authors also evaluated a non-cancer task, rail surface defect severity classification, on which the patch-based CNN methods significantly outperformed conventional image-based CNNs.

Implications and Future Work

The findings of this paper have significant implications for the field of computational pathology. Importantly, the use of patch-based CNNs and a structured decision fusion model offers a scalable and effective solution to WSI classification, addressing the computational challenges posed by gigapixel images. The proposed method also demonstrates versatility, being effective in both cancer-related and unrelated image classification tasks.

Theoretically, this research opens new avenues for employing multi-level learning frameworks in image classification, particularly in domains where discriminative information is dispersed across localized regions within extremely high-resolution images. Practically, the potential for automated and accurate classification of WSIs could markedly enhance pathological assessments, facilitating timely and precise diagnostics.

Future developments could involve leveraging non-discriminative patches in the EM formulation's data likelihood to capture more nuanced spatial relationships. Additionally, optimizing CNN-training protocols to manage larger datasets could further enhance the applicability of this method as digital pathology datasets continue to expand.

Conclusion

The paper by Hou et al. contributes a robust and innovative method for WSI classification, circumventing computational limitations through a hierarchical learning approach. The results underscore the promise of patch-based CNNs combined with a decision fusion model for accurate pathology image classification, showcasing performance that rivals that of human experts. This work provides a foundation for further advancements in automated pathology and multi-instance learning frameworks.