
Deep Learning-based Computational Pathology Predicts Origins for Cancers of Unknown Primary (2006.13932v2)

Published 24 Jun 2020 in q-bio.TO, cs.LG, and q-bio.QM

Abstract: Cancer of unknown primary (CUP) is an enigmatic group of diagnoses where the primary anatomical site of tumor origin cannot be determined. This poses a significant challenge since modern therapeutics such as chemotherapy regimen and immune checkpoint inhibitors are specific to the primary tumor. Recent work has focused on using genomics and transcriptomics for identification of tumor origins. However, genomic testing is not conducted for every patient and lacks clinical penetration in low resource settings. Herein, to overcome these challenges, we present a deep learning-based computational pathology algorithm, TOAD, that can provide a differential diagnosis for CUP using routinely acquired histology slides. We used 17,486 gigapixel whole slide images with known primaries spread over 18 common origins to train a multi-task deep model to simultaneously identify the tumor as primary or metastatic and predict its site of origin. We tested our model on an internal test set of 4,932 cases with known primaries and achieved a top-1 accuracy of 0.84, a top-3 accuracy of 0.94 while on our external test set of 662 cases from 202 different hospitals, it achieved a top-1 and top-3 accuracy of 0.79 and 0.93 respectively. We further curated a dataset of 717 CUP cases from 151 different medical centers and identified a subset of 290 cases for which a differential diagnosis was assigned. Our model predictions resulted in concordance for 50% of cases (κ = 0.4 when adjusted for agreement by chance) and a top-3 agreement of 75%. Our proposed method can be used as an assistive tool to assign differential diagnosis to complicated metastatic and CUP cases and could be used in conjunction with or in lieu of immunohistochemical analysis and extensive diagnostic work-ups to reduce the occurrence of CUP.

Citations (431)

Summary

  • The paper introduces the TOAD algorithm, a deep learning model that predicts primary tumor origins with 84% top-1 and 94% top-3 accuracy on its internal test set.
  • It employs an attention-based multiple instance learning framework with deep residual CNNs, effectively handling 17,486 whole-slide images without manual annotations.
  • The study demonstrates clinical potential by attaining 75% top-3 agreement on 290 CUP cases with an assigned differential diagnosis, drawn from 717 CUP cases collected at 151 medical centers, suggesting a role as a diagnostic aid.

Analysis of Deep Learning-based Computational Pathology for Predicting Tumor Origins

This paper applies deep learning (DL) to the diagnosis of cancers of unknown primary (CUP) from histopathology whole-slide images (WSIs), which are acquired routinely in pathology. The authors introduce the Tumor Origin Assessment via Deep-learning (TOAD) algorithm and position it as a complement to, or possible substitute for, immunohistochemistry and the other extensive diagnostic procedures typically needed to determine the primary tumor origin in CUP cases.

The authors trained TOAD on 17,486 gigapixel WSIs with known primaries, covering 18 common cancer origins. The resulting multi-task DL model simultaneously classifies a tumor as primary or metastatic and predicts its site of origin. On the internal test set of 4,932 cases, the paper reports a top-1 accuracy of 84% and a top-3 accuracy of 94%; on the external test set of 662 cases, it reports a top-1 accuracy of 79% and a top-3 accuracy of 93%. Because the external cases come from 202 different hospitals, the modest drop in accuracy suggests that the model generalizes across diverse clinical settings.
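To make the top-1 and top-3 figures concrete, the following is a minimal sketch of how top-k accuracy is usually computed from per-class scores. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of cases whose true label is among the k highest-scoring classes."""
    hits = 0
    for case_scores, true_label in zip(scores, labels):
        # Rank class indices by descending score and keep the first k.
        ranked = sorted(range(len(case_scores)),
                        key=lambda c: case_scores[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 4 cases, 3 hypothetical origin classes (all values are made up).
scores = [
    [0.7, 0.2, 0.1],  # true class 0: correct at top-1
    [0.3, 0.5, 0.2],  # true class 0: correct only at top-2
    [0.1, 0.3, 0.6],  # true class 2: correct at top-1
    [0.5, 0.4, 0.1],  # true class 2: missed even at top-2
]
labels = [0, 0, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

With 18 candidate origins, a top-3 hit means the true origin appears in a shortlist of three, which is closer to how a differential diagnosis is used in practice than a single best guess.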

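The paper further evaluates TOAD on a curated dataset of 717 CUP cases from 151 medical centers. For the subset of 290 cases that had been assigned a differential diagnosis, the model's top prediction was concordant in 50% of cases (κ = 0.4 after adjusting for chance agreement), and its top-3 predictions agreed in 75% of cases. A κ of 0.4 sits at the boundary between "fair" and "moderate" agreement on the commonly used Landis and Koch scale, so the top-1 result should be read cautiously, while the top-3 agreement is a promising result for CUP diagnosis.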
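The relationship between raw concordance and the chance-adjusted κ can be sketched as follows, assuming κ here is Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is agreement expected by chance. The helper names and toy labels are hypothetical, and because the reported figures are rounded, the back-solved chance agreement is only approximate.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def implied_chance_agreement(p_o, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for p_e."""
    return (p_o - kappa) / (1 - kappa)


# Toy labels (made up) for a model and a reference diagnosis.
model = ["lung", "lung", "breast", "breast"]
reference = ["lung", "breast", "breast", "breast"]
toy_kappa = cohens_kappa(model, reference)

# Chance agreement implied by the reported 50% concordance and kappa of 0.4.
p_e = implied_chance_agreement(0.50, 0.40)
print(toy_kappa, round(p_e, 3))
```

Under these assumptions the implied chance agreement is roughly one in six, which is why a 50% raw concordance still corresponds to a κ well above zero.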

Technical Aspects and Methodology

The novel aspect of this research is its application of an attention-based multiple instance learning (MIL) framework within a weakly-supervised multi-task learning setting. Each slide is treated as a bag of image patches that carries only a slide-level label, so the model learns directly from those labels and bypasses manual region-level annotation, which is both labor-intensive and time-consuming. The attention mechanism assigns each patch a weight, letting the model focus on diagnostically relevant regions when forming its slide-level prediction.
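As a rough illustration of attention-based MIL pooling, the sketch below scores each patch feature with a small tanh layer, normalizes the scores with a softmax, and returns the attention-weighted average as the slide-level feature. This is the generic formulation of attention pooling and is an assumption here; the text above does not specify TOAD's exact attention architecture, and the names `attention_pool`, `V`, and `w` and all numeric values are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(patch_features, V, w):
    """Pool a bag of patch features into one slide-level feature.

    patch_features: list of d-dimensional feature vectors, one per patch.
    V: hidden-layer weights, a list of rows each of length d.
    w: one weight per row of V.
    Returns (slide_feature, attention_weights).
    """
    # Unnormalized score per patch: w . tanh(V h_k).
    raw_scores = []
    for h_k in patch_features:
        hidden = [math.tanh(dot(row, h_k)) for row in V]
        raw_scores.append(dot(w, hidden))
    # Softmax over patches so the attention weights sum to 1.
    attn = softmax(raw_scores)
    # Slide feature is the attention-weighted sum of patch features.
    d = len(patch_features[0])
    slide = [sum(a * h_k[j] for a, h_k in zip(attn, patch_features))
             for j in range(d)]
    return slide, attn


# Toy bag of three 2-dimensional patch features (made-up numbers).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [2.0, -1.0]

slide_feature, attn = attention_pool(patches, V, w)
print(slide_feature, attn)
```

The attention weights double as a per-patch relevance score, which is what makes heatmap-style visualization of the model's focus possible.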

Transfer learning keeps training tractable at this scale: patch features are extracted with a pretrained deep residual CNN and then aggregated, so the model does not need to learn low-level image features from scratch. The model also takes patient sex as a covariate, which is intended to improve predictions, including the distinction between primary and metastatic tumors.
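One simple way to combine a patient-level covariate with a multi-task objective is sketched below: the covariate is appended to the slide-level feature, two linear heads produce origin and primary-versus-metastatic predictions, and the training loss is a weighted sum of the two cross-entropies. The 0/1 encoding of the covariate, the head shapes, the zero-initialized weights, and the equal loss weighting are all assumptions made for illustration, not details confirmed by the text above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(probs, label):
    return -math.log(probs[label])


def multitask_loss(slide_feature, covariate, heads,
                   origin_label, status_label, origin_weight=0.5):
    """Joint loss for origin prediction and primary/metastatic prediction."""
    # Append the patient-level covariate (assumed 0/1 encoded) to the slide feature.
    x = list(slide_feature) + [float(covariate)]
    origin_probs = softmax(linear(heads["origin_W"], heads["origin_b"], x))
    status_probs = softmax(linear(heads["status_W"], heads["status_b"], x))
    # Weighted sum of the two task losses (the weighting is an assumption).
    loss = (origin_weight * cross_entropy(origin_probs, origin_label)
            + (1.0 - origin_weight) * cross_entropy(status_probs, status_label))
    return loss, origin_probs, status_probs


# Toy setup: 2-dim slide feature + 1 covariate, 3 origins, 2 status classes.
heads = {
    "origin_W": [[0.0] * 3 for _ in range(3)], "origin_b": [0.0] * 3,
    "status_W": [[0.0] * 3 for _ in range(2)], "status_b": [0.0] * 2,
}
loss, origin_probs, status_probs = multitask_loss(
    [0.4, -1.2], 1, heads, origin_label=2, status_label=0)
print(loss)
```

With all-zero weights both heads output uniform probabilities, so the printed loss is simply the average of log 3 and log 2; training would adjust the head weights to reduce it.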

Implications and Future Directions

The implications of this paper for clinical practice are substantial. TOAD shows potential not only as an assistive diagnostic tool for pathologists but also as a standalone system that might reduce the reliance on more resource-intensive diagnostic procedures. Its adaptability across different healthcare systems, hinted at by the external validation results, points towards broader applicability, particularly in lower-resource settings where genomic testing penetration is limited.

Theoretically, the paper advances the implementation of attention-based DL frameworks in computational pathology. The model's ability to interpret outcomes through attention heatmaps enhances its utility, providing pathologists with visual interpretations aligned with model predictions and enabling further validation.

Future research could involve enhancing the model’s interpretability and integration with genomic data to improve prediction accuracy further. Exploring the model's applicability across other cancer subtypes and metastatic localizations might also uncover additional clinical utilities. As computational pathology continues to integrate with AI technologies, the methodologies developed in this paper could serve as a foundational blueprint for similar innovations.

In summary, this paper exemplifies the integration of DL in pathology for CUP diagnosis—a step forward in personalized cancer treatment and an intriguing subject for ongoing AI research in healthcare.