
Zero-Shot Disease Classification

Updated 6 February 2026
  • Zero-shot disease classification is a clinical AI method that diagnoses diseases without prior labeled examples by leveraging semantic and multimodal representations.
  • It utilizes dual encoder architectures, prompt engineering, and generative feature synthesis to bridge gaps between seen and unseen disease categories.
  • Evaluation metrics like AUROC and F1 assess performance while addressing challenges such as class imbalance, calibration, and domain shifts in medical imaging.

Zero-shot disease classification is a subfield of clinical artificial intelligence in which a system identifies, classifies, or diagnoses diseases for which it has never seen labeled training examples. Addressing this problem is critical for medical imaging, electronic health records, and other biomedical applications, where the label space is vast and real-world disease distributions are highly long-tailed. Traditional supervised models are inherently limited in their generalization: they can only predict classes present in the training data, whereas zero-shot learning (ZSL) approaches leverage auxiliary semantic knowledge, modality alignment, or generative mechanisms to enable inference on previously unseen disease categories.

1. Conceptual Foundations and Task Setting

Zero-shot disease classification seeks to bridge the gap between supervised deep learning and the open-world challenge of unknown or rare disease entities. Given an input (e.g., a chest X-ray, a pathology whole-slide image, or clinical text), a zero-shot classifier must predict the most relevant disease label(s) from an expanded label set that includes both "seen" classes (those present in training) and "unseen" classes (those reserved for zero-shot evaluation) (Hayat et al., 2021, Lin et al., 9 Jun 2025).

Key ingredients enabling zero-shot performance include semantic representations of disease classes (e.g., vector embeddings derived from medical text corpora by BioBERT, GPT-4, or curated clinical attributes), architectural designs for cross-modality alignment (e.g., aligning visual features with semantic features), and learning objectives that encourage a model to generalize beyond the closed set. Formal evaluation typically distinguishes between "conventional ZSL" (testing only on unseen classes) and the more clinically realistic "generalized ZSL" (testing on both seen and unseen classes, with the trade-off between the two summarized by the harmonic mean of seen-class and unseen-class performance).
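As a minimal sketch of the generalized ZSL summary metric, the harmonic mean H of a seen-class score S and an unseen-class score U is H = 2SU / (S + U). The function name and the numeric values below are illustrative assumptions, not results from any cited paper.

```python
def harmonic_mean(seen_score: float, unseen_score: float) -> float:
    """Harmonic mean H = 2*S*U / (S + U) of seen and unseen performance."""
    if seen_score + unseen_score == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * seen_score * unseen_score / (seen_score + unseen_score)


# Illustrative values only (assumed, not taken from the literature).
balanced = harmonic_mean(0.80, 0.60)   # moderate on both splits
lopsided = harmonic_mean(0.95, 0.10)   # strong on seen, weak on unseen

print(f"balanced H = {balanced:.3f}")
print(f"lopsided H = {lopsided:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model that does well only on seen classes receives a low H, which is why this metric is preferred over a plain average in the generalized setting.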

2. Architectures and Methodological Innovations

A variety of architectures underpin state-of-the-art zero-shot disease classification systems across modalities.

Vision-Language Alignment: The dominant paradigm for image-based ZSL uses dual encoders to map images and disease descriptions into a shared latent space, with predictions made by measuring similarity (e.g., cosine) between visual and textual embeddings. Notable instantiations include:

  • CXR-ML-GZSL maps DenseNet-121 visual features and BioBERT disease embeddings into a 128-dimensional joint space, training with ranking, alignment, and semantic-consistency losses (Hayat et al., 2021).
  • CLIP-based frameworks employ large-scale pre-trained vision-language backbones (a transformer encoder for each modality) and zero-shot prompt engineering to match images to prompt-encoded diseases (Benabbas et al., 24 Nov 2025, Liu et al., 2023).
  • CARZero enhances image-text alignment by replacing pooled vector similarity with cross-attention between local image patches and word/sentence-level text features, improving fine-grained matching for rare diseases (Lai et al., 2024).
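The dual-encoder inference step described above can be sketched as follows. This is a toy illustration, assuming the image and class-description embeddings have already been produced by some pair of encoders; the three-dimensional vectors, the label names, and the `zero_shot_predict` helper are hypothetical and do not correspond to any cited system.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def zero_shot_predict(image_embedding, class_embeddings):
    """Score every class by cosine similarity and return the best label."""
    scores = {
        label: cosine(image_embedding, text_embedding)
        for label, text_embedding in class_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical embeddings in a shared 3-dimensional latent space.
class_embeddings = {
    "pneumonia": [0.9, 0.1, 0.0],
    "cardiomegaly": [0.1, 0.9, 0.1],
    "pneumothorax": [0.0, 0.2, 0.9],
}
image_embedding = [0.2, 0.8, 0.1]

label, scores = zero_shot_predict(image_embedding, class_embeddings)
print(label, {k: round(v, 3) for k, v in scores.items()})
```

Adding an unseen disease at inference time only requires adding its text embedding to `class_embeddings`; no retraining of the image encoder is implied by this scoring rule.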

Prompt Engineering: Effective performance depends on constructing semantically rich, clinically relevant prompts for disease labels. Empirical studies demonstrate that LLM-generated or human-curated prompts that encapsulate specific visual/semantic features outperform naive class names, especially for rare diseases or in long-tailed distributions (Liu et al., 2023, Lin et al., 9 Jun 2025).
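A minimal sketch of descriptor-enriched prompt construction, assuming a chest X-ray setting. The templates, the descriptor strings, and both helper functions are illustrative assumptions rather than prompts used in the cited studies; in practice the descriptors would come from an LLM or a clinician, and each prompt would be scored by a text encoder that is not shown here.

```python
def build_prompts(label, descriptors, templates=None):
    """Expand a bare class name into several descriptive prompts."""
    templates = templates or (
        "a chest X-ray showing {label}",
        "radiograph with findings of {label}",
    )
    prompts = [t.format(label=label) for t in templates]
    # One extra prompt per visual descriptor of the disease.
    prompts += [
        f"a chest X-ray showing {label}, characterized by {d}"
        for d in descriptors
    ]
    return prompts


def aggregate_scores(prompt_scores):
    """Average the image-prompt similarities of one class into one score."""
    return sum(prompt_scores) / len(prompt_scores)


# Hypothetical descriptors for one class.
prompts = build_prompts(
    "pneumothorax",
    ["a visible pleural line", "absent lung markings beyond the pleural line"],
)
for p in prompts:
    print(p)

# Placeholder similarities, one per prompt (assumed values).
class_score = aggregate_scores([0.21, 0.25, 0.34, 0.32])
print(round(class_score, 3))
```

Averaging over several prompts is one simple aggregation choice; taking the maximum or learning per-prompt weights are alternatives.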

Patch-based & Hybrid Models: Histopathology and volumetric imaging modalities require multi-resolution and multi-instance approaches (e.g., hybrid fusion of global and local features, attention-weighted patch embeddings, montage construction) to capture diagnostic context critical for zero-shot generalization (Rahaman et al., 13 Mar 2025, Uden et al., 2023).
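One way to picture attention-weighted patch embeddings is softmax pooling, where each patch is weighted by its similarity to a query vector. The sketch below assumes the query is a class prompt embedding and uses two-dimensional toy vectors; it is a generic illustration, not the pooling scheme of any specific cited model.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_embeddings, query):
    """Weight patches by dot-product relevance to the query, then average."""
    relevance = [
        sum(p_i * q_i for p_i, q_i in zip(patch, query))
        for patch in patch_embeddings
    ]
    weights = softmax(relevance)
    dim = len(query)
    pooled = [
        sum(w * patch[d] for w, patch in zip(weights, patch_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


# Three hypothetical patch embeddings and a hypothetical prompt embedding.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [2.0, 0.0]

pooled, weights = attention_pool(patches, query)
print([round(w, 3) for w in weights])
print([round(x, 3) for x in pooled])
```

Patches that resemble the query dominate the pooled slide-level embedding, so a small diagnostic region is not averaged away by a large amount of irrelevant tissue.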

Generative Feature Synthesis: Some frameworks address the inability to observe unseen-class samples by synthesizing latent features conditioned on semantic descriptors, either via conditional Wasserstein GANs with auxiliary losses (attribute consistency, hierarchy, or keyword reconstruction) or adversarial learning (Song et al., 2019, Mahapatra, 2022).

Similarity Retrieval and Clustering: Hybrid systems (e.g., RURA-Net) use Siamese networks for disease similarity retrieval, lesion segmentation with U-Nets, and unsupervised clustering of deep features for pseudo-label assignment, achieving ZSL in the absence of explicit class supervision (Su et al., 26 Feb 2025).

3. Loss Functions, Training, and Latent Space Regularization

Zero-shot generalization is sensitive to the geometry and structure of the learned latent space. Key loss and training constructs include:

  • Ranking Losses ($L_{\mathrm{rank}}$): Ensure positive disease classes receive higher scores than negatives for multi-label settings (Hayat et al., 2021).
  • Alignment and Consistency Losses: Encourage image embeddings to be closely aligned (cosine) with semantic class prototypes; force projected class embeddings to preserve their relative structure (Hayat et al., 2021, Lin et al., 9 Jun 2025).
  • Contrastive Losses (InfoNCE): Jointly maximize similarity between matched image and text pairs while minimizing for mismatched pairs (Benabbas et al., 24 Nov 2025, Uden et al., 2023).
  • Class-weighting and Clustering: Gaussian Mixture Model and Student’s t-distribution clustering, followed by triplet loss and class-weighted objectives, considerably improve performance for rare, long-tailed classes (Madhipati et al., 25 Jul 2025).
  • Generative/Adversarial Losses: Conditional WGAN-GP, cycle consistency, and keyword-based reconstruction for synthesizing unseen-class representations (Song et al., 2019, Mahapatra, 2022).
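Two of the objectives listed above can be sketched in a few lines. The ranking loss here is a generic pairwise hinge formulation and the contrastive loss is a one-directional (image-to-text) InfoNCE; both are written under stated assumptions and are not claimed to match the exact losses of the cited papers. The margin, the temperature, and the toy similarity matrices are assumed values.

```python
import math


def pairwise_ranking_loss(scores, positives, margin=1.0):
    """Hinge loss pushing every positive class score above every negative."""
    pos = [s for i, s in enumerate(scores) if i in positives]
    neg = [s for i, s in enumerate(scores) if i not in positives]
    if not pos or not neg:
        return 0.0
    total = sum(max(0.0, margin - p + n) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def info_nce(sim, temperature=0.07):
    """Image-to-text InfoNCE; sim[i][j] is similarity of image i and text j.

    Matched pairs are assumed to lie on the diagonal of the square matrix.
    """
    n = len(sim)
    loss = 0.0
    for i in range(n):
        logits = [s / temperature for s in sim[i]]
        m = max(logits)
        # Log-sum-exp with max subtraction for numerical stability.
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        loss += log_denominator - logits[i]
    return loss / n


# Multi-label example: class 0 is the only positive label.
satisfied = pairwise_ranking_loss([2.0, -1.0, 0.5], positives={0})
violated = pairwise_ranking_loss([0.0, 0.5], positives={0})

aligned = info_nce([[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce([[0.0, 1.0], [1.0, 0.0]])
print(satisfied, violated, round(aligned, 6), round(misaligned, 3))
```

CLIP-style training usually averages this loss with its text-to-image counterpart (the same computation on the transposed matrix); only one direction is shown to keep the sketch short.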

Empirically, inclusion of alignment and semantic consistency (beyond vanilla ranking/contrastive objectives) robustly boosts unseen-class recall without sacrificing performance on seen classes (Hayat et al., 2021, Madhipati et al., 25 Jul 2025). Fine-tuning both visual and text encoders or leveraging domain-adaptive pretraining can further improve domain transfer for scarce or emerging diseases (Uden et al., 2023).

4. Evaluation Protocols and Benchmark Results

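Evaluation protocols for zero-shot disease classification are built to measure both overall classification accuracy and the model’s ability to maintain sensitivity on rare/unseen disease classes. Commonly reported metrics include AUROC (AUC), F1, and mean average precision (mAP), with the harmonic mean of seen-class and unseen-class scores used in generalized ZSL settings.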

Experimental results highlight that vision-language ZSL models typically achieve substantial gains over both naive baselines and few-shot approaches on rare classes, although the absolute mAP or AUROC on strictly "zero-shot" test classes remains lower than for seen classes (e.g., CXR-LT 2024 Task 3, mAP$_5$ of 0.129–0.13 vs. a long-tail supervised baseline at 0.136) (Lin et al., 9 Jun 2025). Across diverse domains (chest X-ray, fundus, pathology, plant disease, EHRs), techniques that align embeddings through attention, cluster embeddings, or aggregate multiple prompt cues are reported to achieve state-of-the-art zero-shot classification, especially under domain shift (Rahaman et al., 13 Mar 2025, Benabbas et al., 24 Nov 2025, Liu et al., 2023).
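For reference, AUROC can be computed directly from its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The sketch below is a small, quadratic-time illustration on made-up labels and scores; the class names and values are assumptions, and production code would normally use an established metrics library.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def macro_auroc(per_class):
    """Unweighted mean of per-class AUROC values."""
    values = [auroc(y, s) for y, s in per_class.values()]
    return sum(values) / len(values)


# Made-up predictions for two hypothetical unseen classes.
per_class = {
    "class_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "class_b": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}
print(auroc(*per_class["class_a"]))
print(macro_auroc(per_class))
```

Reporting the macro average separately over seen and unseen classes gives the two numbers that the generalized ZSL harmonic mean combines.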

5. Applications, Impact, and Extensibility

Zero-shot disease classification is applicable in multiple healthcare contexts:

  • Medical imaging triage and discovery: Rapid adaptation to emergent diseases, rare condition detection, and open-world triage without retraining (Lin et al., 9 Jun 2025, Rahman et al., 2024).
  • Computational pathology: Multi-resolution prompt-guided ZSL approaches enable histological subtyping and tumor vs. benign discrimination without manual annotations (Rahaman et al., 13 Mar 2025).
  • Electronic health records (EHR) phenotyping: Frameworks such as LLM-based MapReduce pipelines support cohort discovery and rare disease case finding, outperforming hand-crafted rules (Thompson et al., 2023).
  • Plant and agricultural diagnostics: CLIP-based ZSL closes the gap between curated datasets and field deployment for plant disease classification under domain shift (Benabbas et al., 24 Nov 2025).
  • Clinical text and code assignment: Generalized ZSL generators for ICD or MeSH coding enhance recognition of rare or never-before-annotated diagnosis codes (Song et al., 2019, Lupart et al., 2022).

Extensions to multi-modal settings (e.g., combining imaging, text reports, and structured labs) and refinements for interpretability through attention or saliency mapping have been demonstrated, with the aim of improving clinical robustness and transparency (Liu et al., 2023, Lai et al., 2024).

6. Limitations, Challenges, and Open Research Directions

Despite progress, challenges in zero-shot disease classification persist:

  • Extreme class imbalance and calibration: Extremely low prevalence of some diseases leads to miscalibration and high false positive rates, especially where positive samples are rare even at test time (Lin et al., 9 Jun 2025, Madhipati et al., 25 Jul 2025).
  • Semantic and visual domain shift: Variability in disease descriptions, synonym usage, and differing imaging artifacts can impact robustness; crafting generalized prompts or learning robust mappings remains nontrivial (Lin et al., 9 Jun 2025).
  • Interpretability and reliability: Although attention or feature alignment can support explainability, wide adoption in clinical workflows requires further validation and integration with radiologist review or human-in-the-loop systems (Liu et al., 2023, Lai et al., 2024).
  • Scalability and annotation efficiency: Many methods (e.g., RURA-Net, generative ZSL) still require segmentation datasets, curated prompts, or keyword mining, which can introduce additional annotation or domain adaptation burdens (Su et al., 26 Feb 2025, Song et al., 2019).
  • Generalization to new organs and modalities: Cross-disease transferability has been demonstrated primarily within similar organs (e.g., lung X-rays); robust generalization to other organs, multi-class tasks, or non-imaging data remains a frontier (Rahman et al., 2024).

Recommendations from leading challenges and ablation studies suggest incorporating external knowledge graphs, region-proposal supervision, and adaptive per-class decision thresholds, and integrating dense retrieval or chain-of-thought LLM prompting strategies to further improve ZSL performance.
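Adaptive per-class decision thresholds can be illustrated with a simple validation-set search that picks, for each class, the score cutoff maximizing F1. This is one possible selection rule under stated assumptions; the helper names, the tie-breaking choice, and the validation data are hypothetical.

```python
def f1_at_threshold(labels, scores, threshold):
    """F1 when every score >= threshold is predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(labels, scores):
    """Pick the observed score that maximizes F1 (higher cutoff wins ties)."""
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda t: (f1_at_threshold(labels, scores, t), t),
    )


# Hypothetical validation labels and similarity scores per class.
validation = {
    "rare_class": ([0, 0, 1, 0, 1], [0.10, 0.30, 0.35, 0.40, 0.80]),
    "common_class": ([1, 1, 0, 1, 0], [0.70, 0.60, 0.20, 0.90, 0.10]),
}
thresholds = {
    name: best_threshold(y, s) for name, (y, s) in validation.items()
}
print(thresholds)
```

A single global cutoff would be a poor fit when score distributions differ across classes, which is common for low-prevalence diseases; per-class cutoffs address that at the cost of needing some labeled validation cases for each class.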

7. Representative Frameworks and Comparative Analysis

Framework | Modality | Core Mechanism | Highlighted Metric | Unseen Class Performance
CXR-ML-GZSL (Hayat et al., 2021) | Chest X-ray | Visual-semantic joint latent mapping | AUROC (U=0.66, H=0.72) | +22% AUROC over baseline
CARZero (Lai et al., 2024) | Chest X-ray | Cross-attention alignment | AUC on rare diseases (+0.10) | 0.837 on PadChest20
MR-PHE (Rahaman et al., 13 Mar 2025) | Histopathology | Multi-res patch, hybrid fusion, prompt enrichment | Delta F1 = +2–17 pts | Outperforms fully sup. baseline
CXR-CML (Madhipati et al., 25 Jul 2025) | Chest X-ray | Weighted contrastive loss, GMM+t clustering | AUC=0.720 (rare) | +0.089 over CheXzero
RURA-Net (Su et al., 26 Feb 2025) | Ophthalmic (CFP) | Siamese retrieval, U-Net, clustering | F1=0.83, AUC=0.92 | Beats most few-shot/one-shot
Plant CLIP (Benabbas et al., 24 Nov 2025) | Plant leaf | CLIP zero-shot, symptom prompt | Macro F1=66.3% | Robust to field domain shift
EHR LLM-RAG (Thompson et al., 2023) | Clinical text (EHR) | Retrieval-Augmented Generation, LLM MapReduce | F1=0.75 (zero-shot PH) | +21% over rule-based

Each method demonstrates that explicitly leveraging semantic alignment, clustering, domain-adaptive pretraining, or prompt engineering is critical to achieving clinically meaningful zero-shot disease classification performance while highlighting the remaining challenges for open-world diagnostic systems.
