Patch-based Convolutional Neural Network for Whole Slide Tissue Image Classification (1504.07947v5)

Published 29 Apr 2015 in cs.CV

Abstract: Convolutional Neural Networks (CNN) are state-of-the-art models for many image classification tasks. However, to recognize cancer subtypes automatically, training a CNN on gigapixel resolution Whole Slide Tissue Images (WSI) is currently computationally impossible. The differentiation of cancer subtypes is based on cellular-level visual features observed on image patch scale. Therefore, we argue that in this situation, training a patch-level classifier on image patches will perform better than or similar to an image-level classifier. The challenge becomes how to intelligently combine patch-level classification results and model the fact that not all patches will be discriminative. We propose to train a decision fusion model to aggregate patch-level predictions given by patch-level CNNs, which to the best of our knowledge has not been shown before. Furthermore, we formulate a novel Expectation-Maximization (EM) based method that automatically locates discriminative patches robustly by utilizing the spatial relationships of patches. We apply our method to the classification of glioma and non-small-cell lung carcinoma cases into subtypes. The classification accuracy of our method is similar to the inter-observer agreement between pathologists. Although it is impossible to train CNNs on WSIs, we experimentally demonstrate using a comparable non-cancer dataset of smaller images that a patch-based CNN can outperform an image-based CNN.

Authors (6)

Le Hou (36 papers)
Dimitris Samaras (125 papers)
Tahsin M. Kurc (10 papers)
Yi Gao (77 papers)
James E. Davis (1 paper)
Joel H. Saltz (20 papers)

Citations (745)

Summary

The paper introduces a two-level patch-based CNN approach using an EM algorithm to identify and aggregate discriminative tissue patches.
It achieves high classification accuracy, notably 97% for distinguishing glioblastoma from low-grade gliomas and promising results on NSCLC tasks.
The approach offers a scalable, efficient solution for gigapixel WSI classification, potentially enhancing automated pathological diagnostics.

Overview

The paper by Hou et al. presents a method for classifying Whole Slide Tissue Images (WSIs) with a patch-based Convolutional Neural Network (CNN). Because training a CNN directly on gigapixel-resolution WSIs is computationally infeasible, the authors propose a two-level model: CNNs are first applied to high-resolution patches of the WSI, and the patch-level predictions are then aggregated by a decision fusion model.
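
As a rough illustration of the first level, the sketch below enumerates patch locations on a slide. The function name `patch_grid`, the slide dimensions, the patch size, and the stride are all hypothetical values chosen for the example; they are not taken from the paper.

```python
def patch_grid(width, height, patch_size, stride):
    """Return the top-left (x, y) corner of every full patch that fits in the slide.

    All parameters are illustrative assumptions, not the paper's settings.
    """
    xs = range(0, width - patch_size + 1, stride)
    ys = range(0, height - patch_size + 1, stride)
    # Row-major order: sweep x within each row of patches.
    return [(x, y) for y in ys for x in xs]


# A toy "slide" of 1000 x 800 pixels tiled into non-overlapping 256 x 256 patches.
corners = patch_grid(width=1000, height=800, patch_size=256, stride=256)
print(len(corners), corners[0], corners[-1])
```

Each corner would index one patch passed to the patch-level CNN; a real gigapixel slide yields far more patches than this toy grid.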

Methodology

The authors highlight two problems with applying CNNs directly to WSIs: the heavy downsampling required can destroy discriminative detail, and a single WSI may contain several distinct discriminative patterns that an image-level model learns inefficiently. Their solution is an Expectation-Maximization (EM) based method that locates discriminative patches by exploiting the spatial relationships among patches within the WSI.

Patch-level CNN Training

Training begins with a patch-level CNN that treats each patch as an independent instance. Each patch carries a hidden variable indicating whether it is discriminative for the label of its WSI. The EM algorithm refines these variables iteratively: it trains the CNN on the patches currently deemed discriminative, then applies spatial smoothing to update the set of discriminative patches.
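
The alternation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a simple box filter stands in for the spatial smoothing, a fixed `threshold` decides which patches count as discriminative, and `train_fn` and `predict_fn` are hypothetical callables standing in for CNN training and per-patch inference.

```python
def smooth(prob_map, radius=1):
    """Average each cell over its (2*radius+1)^2 neighbourhood, clipped at the borders.

    Assumption: a box filter is used here purely for illustration.
    """
    rows, cols = len(prob_map), len(prob_map[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [prob_map[i][j]
                    for i in range(max(0, r - radius), min(rows, r + radius + 1))
                    for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out


def e_step(prob_map, threshold=0.5, radius=1):
    """Mark a patch discriminative if its smoothed true-class probability exceeds threshold.

    The fixed threshold is an assumption made for this sketch.
    """
    return [[p > threshold for p in row] for row in smooth(prob_map, radius)]


def run_em(train_fn, predict_fn, n_rows, n_cols, n_iters=3):
    """Alternate training (M-step) and patch re-selection (E-step).

    train_fn(mask) -> model and predict_fn(model) -> prob_map are hypothetical hooks.
    """
    mask = [[True] * n_cols for _ in range(n_rows)]  # start with every patch selected
    model = None
    for _ in range(n_iters):
        model = train_fn(mask)        # M-step: fit on currently selected patches
        prob_map = predict_fn(model)  # per-patch probability of the slide's true class
        mask = e_step(prob_map)       # E-step: smooth, then re-select patches
    return model, mask


# One weak patch surrounded by confident neighbours, and the reverse case.
mostly_high = [[0.9, 0.9, 0.9], [0.9, 0.2, 0.9], [0.9, 0.9, 0.9]]
mostly_low = [[0.1, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.1]]
keep_all = e_step(mostly_high)
drop_all = e_step(mostly_low)
```

In this toy case, smoothing keeps the weak patch that sits inside a confident region and drops the isolated confident patch, which is the kind of robustness that using spatial relationships is meant to provide.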

Decision Fusion Model

The second level aggregates patch-level predictions into a slide-level classification, using either multi-class logistic regression or a Support Vector Machine (SVM). The authors argue that this learned decision fusion is more robust than max-pooling or voting, particularly for heterogeneous cancer subtypes.
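
To make the contrast with voting concrete, the sketch below compares a majority vote with a linear classifier applied to a histogram of patch predictions. Several things here are assumptions for illustration: the histogram feature, the three-class setup in which class 2 is a "mixed" subtype, and the hand-set `WEIGHTS` and `BIASES`, which in practice would be fitted by logistic regression or an SVM.

```python
from collections import Counter


def prediction_histogram(patch_labels, n_classes):
    """Fraction of patches assigned to each class (assumed fusion-model input)."""
    counts = Counter(patch_labels)
    total = len(patch_labels)
    return [counts.get(k, 0) / total for k in range(n_classes)]


def majority_vote(patch_labels):
    """Baseline: the slide label is the most frequent patch label."""
    return Counter(patch_labels).most_common(1)[0][0]


def linear_fusion(hist, weights, biases):
    """Slide label = argmax of linear scores over the histogram."""
    scores = [sum(w * h for w, h in zip(row, hist)) + b
              for row, b in zip(weights, biases)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical, hand-set parameters; a trained model would learn these.
WEIGHTS = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
BIASES = [0.0, 0.0, 2.8]

mixed_slide = [0] * 6 + [1] * 4  # substantial amounts of both class 0 and class 1
pure_slide = [0] * 9 + [1] * 1   # almost entirely class 0

vote_mixed = majority_vote(mixed_slide)
fused_mixed = linear_fusion(prediction_histogram(mixed_slide, 3), WEIGHTS, BIASES)
fused_pure = linear_fusion(prediction_histogram(pure_slide, 3), WEIGHTS, BIASES)
```

Voting can only return a label that some patch received, so it labels the mixed slide as class 0. The fusion model can map a balanced histogram to the mixed class even though no individual patch was predicted as such.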

Numerical Results

The proposed model was applied to classify glioma and non-small-cell lung carcinoma (NSCLC) subtypes using the TCGA dataset. In the glioma classification task, the model achieved a classification accuracy of 77%, which is close to the inter-observer agreement of pathologists on similar tasks, reported at around 80%. The model was especially effective in distinguishing glioblastoma (GBM) from low-grade gliomas (LGG), with a notable accuracy of 97%.

For NSCLC classification, the model achieved an accuracy comparable to the agreement between pulmonary pathology experts ($\kappa = 0.75$ for squamous cell carcinoma vs. non-squamous and $\kappa = 0.60$ for adenocarcinoma vs. non-adenocarcinoma). On the mixed adenocarcinoma subtype (ADC-mix), the model outperformed the baseline methods, further demonstrating its capacity to handle heterogeneous cases.
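
For readers unfamiliar with the $\kappa$ statistic, the sketch below computes Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. It assumes the quoted values are Cohen's kappa between two raters; the rater labels are invented toy data, not figures from the paper.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Toy data: the raters agree on 8 of 10 cases, with balanced marginals.
rater_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(rater_a, rater_b)
```

Here $p_o = 0.8$ and $p_e = 0.5$, so $\kappa = 0.6$: raw agreement of 80% corresponds to a noticeably lower kappa once chance agreement is discounted.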

To test the patch-based approach beyond pathology, the authors also evaluated a non-cancer task, rail surface defect severity classification. There, the patch-based CNN methods significantly outperformed conventional image-based methods.

Implications and Future Work

For computational pathology, the main implication is that patch-based CNNs combined with a learned decision fusion model offer a scalable way to classify WSIs despite the computational cost of gigapixel images. The method is also not specific to cancer: it was effective on the unrelated rail surface defect task as well.

Theoretically, this research opens new avenues for employing multi-level learning frameworks in image classification, particularly in domains where discriminative information is dispersed across localized regions within extremely high-resolution images. Practically, the potential for automated and accurate classification of WSIs could markedly enhance pathological assessments, facilitating timely and precise diagnostics.

Future developments could involve leveraging non-discriminative patches in the EM formulation's data likelihood to capture more nuanced spatial relationships. Additionally, optimizing CNN-training protocols to manage larger datasets could further enhance the applicability of this method as digital pathology datasets continue to expand.

Conclusion

The paper by Hou et al. contributes a method for WSI classification that sidesteps the computational limits of gigapixel images through a two-level learning approach. The results show that patch-based CNNs combined with a decision fusion model can classify pathology images with accuracy close to the agreement between human experts. This work provides a foundation for further advances in automated pathology and multi-instance learning frameworks.