DeepRare Models: Visual Saliency & Disease Diagnosis
- DeepRare models denote two AI contributions: unsupervised visual saliency mapping and agentic rare disease diagnosis, characterized by transparency and modularity.
- The visual attention model DR21 uses a six-stage unsupervised pipeline, running from deep feature extraction through rarity thresholding, fusion, and post-processing, to generate traceable saliency maps with state-of-the-art or near-best performance.
- The DeepRare agentic system integrates LLMs with multi-agent orchestration for rare disease diagnosis, providing transparent, evidence-backed reasoning and validated clinical results.
DeepRare models denote two distinct research contributions in artificial intelligence: (1) the DeepRare pipeline for unsupervised visual attention modeling, and (2) the DeepRare agentic system for rare disease diagnosis with traceable reasoning. Both model families share a focus on transparency, modularity, and leveraging deep learning in non-standard, low-prevalence or surprise-driven tasks, but are otherwise unrelated in methodology or domain.
1. DeepRare for Visual Attention Modeling
The DeepRare family (notably DeepRare2019 [DR19] and DeepRare2021 [DR21]) targets the computational modeling of human visual attention, specifically the detection of rare, surprising, or unusual patterns in images. Human attention exhibits a bias towards visually salient, infrequent, or high-contrast features—phenomena that traditional deep neural networks (DNNs) generally fail to model, due to their tendency to focus on frequent, top-down targets such as faces or text (Kong et al., 2021).
1.1 DR21 Pipeline Structure
DR21 implements a six-stage process [(Kong et al., 2021), Sec. 2, Fig. 1]:
- Deep Feature Extraction: Images are processed by a pretrained CNN (e.g., VGG16, VGG19, MobileNetV2), omitting pooling and fully connected layers, to generate multi-scale, multi-channel feature maps without any fine-tuning [Sec. 2.1].
- Rarity Map Computation: For each feature map, a histogram over its activations is constructed. The rarity of each bin $b$ is defined as $R(b) = -\log p(b)$, with $p(b)$ the probability of bin $b$ [Sec. 2.2]. The rarity value is spatially back-projected to the pixels whose activations fall in $b$.
- Rarity Thresholding: DR21 applies per-feature rarity thresholds (e.g., keeping the top 10% rarest activations) to generate binary masks, a major improvement over DR19, which lacked thresholding [Sec. 2.3].
- Data Fusion: Rarities are combined per layer using Itti & Koch's global normalization weighting. Layer maps are regrouped into five deep-group conspicuity maps (DGCMs), which are finally summed with an optional face map to form the raw saliency map [Sec. 2.5].
- Post-processing: The raw saliency map is Gaussian-smoothed and squared to sharpen saliency peaks, yielding the final saliency map [Sec. 2.6].
- Interpretability: By varying the rarity threshold and grouping layer-wise maps, practitioners obtain insight into which scales and features drive saliency responses in each image [Sec. 2.3, Sec. 4].
DR21's design is fully unsupervised: it requires no retraining and uses only off-the-shelf ImageNet weights.
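The rarity and thresholding stages described above can be illustrated with a small, dependency-free sketch. It is not the DR21 code: the function names, the bin count `n_bins`, and the use of plain nested lists in place of CNN feature maps are assumptions made for illustration, and rarity is taken as the negative log of each bin's probability.

```python
import math
from collections import Counter


def rarity_map(feature_map, n_bins=8):
    """Histogram-based rarity: -log(p(bin)), back-projected to every pixel."""
    values = [v for row in feature_map for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on constant maps

    def bin_of(v):
        # Clamp so that the maximum activation falls in the last bin.
        return min(int((v - lo) / span * n_bins), n_bins - 1)

    counts = Counter(bin_of(v) for v in values)
    total = len(values)
    rarity = {b: -math.log(c / total) for b, c in counts.items()}
    return [[rarity[bin_of(v)] for v in row] for row in feature_map]


def threshold_top_fraction(rmap, fraction=0.10):
    """Binary mask of the rarest pixels; ties at the cutoff are all kept."""
    flat = sorted((v for row in rmap for v in row), reverse=True)
    k = max(1, int(fraction * len(flat)))
    cutoff = flat[k - 1]
    return [[1 if v >= cutoff else 0 for v in row] for row in rmap]


# Toy feature map: uniform activations with one unusual value.
fmap = [[0.0] * 4 for _ in range(4)]
fmap[2][1] = 1.0
mask = threshold_top_fraction(rarity_map(fmap))
```

In this toy map the single unusual activation occupies a bin of its own, so it receives the highest rarity and is the only pixel kept by the mask.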
1.2 Quantitative Performance and Implementation
DR21 variants achieve state-of-the-art or near-best performance across four heterogeneous eye-tracking and psychophysical datasets, with metrics including AUC-Judd, CC, KL, NSS, and SIM:
| Dataset (Backbone) | Notable Metrics | Benchmark Comparison |
|---|---|---|
| MIT1003 (VGG19) | AUCJ=0.86, AUCB=0.85, CC=0.56, KL=0.88 | Outperforms classical and deep-feature models, matches/bests MLNet |
| OSIE (VGG16) | AUCJ=0.87, CC=0.59, NSS=2.06, SIM=0.52 | Surpassed only by SAM-ResNet and DR21+TD, rivals FAPTTX |
| O³ (VGG16) | MSR_t=1.45, MSR_b=1.01 | Beats SAM-ResNet, BMS, CVS for target/distractor discrimination |
| P³ (VGG16) | Avg. #fix=13.5, %found=89%, GSIavg=0.59 | Fastest & most biologically plausible Global Saliency Index curves |
Inference is efficient (<1 s/image) on CPUs. The DR21 codebase is implemented in Python + Keras and supports backbone modularity for extension to other CNNs [(Kong et al., 2021), Abstract].
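For readers unfamiliar with the metrics in the table, the following sketch shows how two of them, NSS and CC, are commonly computed. It follows the standard definitions rather than the authors' evaluation scripts; the function names and the choice of population standard deviation are assumptions, and benchmark toolkits may differ in such details.

```python
import math


def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels."""
    flat = [v for row in saliency for v in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
    return sum((saliency[y][x] - mean) / std for y, x in fixations) / len(fixations)


def cc(map_a, map_b):
    """Pearson linear correlation between two same-sized maps."""
    fa = [v for row in map_a for v in row]
    fb = [v for row in map_b for v in row]
    mean_a, mean_b = sum(fa) / len(fa), sum(fb) / len(fb)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(fa, fb))
    var_a = sum((a - mean_a) ** 2 for a in fa)
    var_b = sum((b - mean_b) ** 2 for b in fb)
    return cov / math.sqrt(var_a * var_b)


# Toy example: one salient pixel that was also fixated, given as (row, col).
saliency = [[0.0, 0.0], [0.0, 1.0]]
score = nss(saliency, fixations=[(1, 1)])
```

Higher NSS means fixations land on pixels that the model rates well above its own average; CC instead compares the whole saliency map against a fixation density map.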
1.3 Transparency and Limitations
Multi-threshold rarity mapping provides explainable saliency cues traceable to semantic scales; failure cases can be diagnosed by inspecting per-layer conspicuity maps. This level of introspection is not attainable with purely end-to-end supervised DNN saliency models [Sec. 2.3, 4]. A plausible implication is that DR21 is preferable in circumstances demanding transparency and diagnostic insight.
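The kind of per-layer inspection described here can be sketched as a weighted fusion that also returns the weight given to each map. The weight used below, the squared difference between the maximum and the mean of a map rescaled to [0, 1], is a simplified stand-in for Itti-style global normalization and is an assumption, as are the function names and the toy map labels.

```python
def global_normalization_weight(conspicuity_map):
    """Simplified weight: (max - mean)^2 of the map rescaled to [0, 1]."""
    flat = [v for row in conspicuity_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return 0.0  # a constant map carries no saliency information
    norm = [(v - lo) / (hi - lo) for v in flat]
    return (max(norm) - sum(norm) / len(norm)) ** 2


def fuse(maps):
    """Weighted sum of same-sized maps; also returns the per-map weights."""
    weights = {name: global_normalization_weight(m) for name, m in maps.items()}
    first = next(iter(maps.values()))
    height, width = len(first), len(first[0])
    fused = [
        [sum(weights[n] * maps[n][y][x] for n in maps) for x in range(width)]
        for y in range(height)
    ]
    return fused, weights


# Hypothetical maps: one with a single sharp peak, one that is active almost everywhere.
maps = {
    "sparse": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    "dense": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
}
fused, weights = fuse(maps)
```

Inspecting `weights` shows that the map with one isolated peak dominates the fusion, which is the sort of diagnostic a practitioner can use to trace a saliency response back to a particular layer group.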
2. DeepRare Agentic System for Rare Disease Diagnosis
DeepRare (Zhao et al., 25 Jun 2025) refers to an agentic platform employing LLMs and multi-agent orchestration for rare disease diagnosis from heterogeneous clinical inputs. The system is engineered for transparent, traceable reasoning that links all diagnostic hypotheses to verifiable medical evidence.
2.1 System Architecture
DeepRare utilizes a three-tier architecture:
- Tier 1: Central Host + Long-Term Memory
  - An LLM-based orchestrator manages the workflow, storing raw inputs and evidence in a memory bank.
- Tier 2: Specialized Agent Servers
  - Agent modules perform domain-specific analysis: phenotype extraction, literature search, case retrieval, genotype analysis.
- Tier 3: External Data Sources
  - Interfaces to PubMed, Crossref, OMIM, Orphanet, HPO, curated case databases, and web-scale resources [(Zhao et al., 25 Jun 2025), Sec. 1].
The Central Host performs structured query aggregation and controls agent interactions, iteratively accumulating evidence.
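A minimal sketch of this host-and-agents pattern is given below. It is not the DeepRare implementation: the class and function names, the keyword matching that stands in for phenotype extraction, and the stubbed literature search are all hypothetical, and no external data source is contacted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MemoryBank:
    """Tier 1 long-term memory: the raw input plus accumulated evidence."""
    raw_input: str
    evidence: List[dict] = field(default_factory=list)


# Tier 2 agents are modeled as callables: memory -> list of new evidence items.
Agent = Callable[[MemoryBank], List[dict]]


def phenotype_agent(memory: MemoryBank) -> List[dict]:
    # Hypothetical stand-in: a real agent would call phenotype extraction tools.
    known = {"seizures", "ataxia"}
    terms = [w for w in memory.raw_input.lower().split() if w in known]
    return [{"source": "phenotype", "finding": t} for t in terms]


def literature_agent(memory: MemoryBank) -> List[dict]:
    # Hypothetical stand-in: a real agent would query Tier 3 sources such as PubMed.
    found = [e["finding"] for e in memory.evidence if e["source"] == "phenotype"]
    return [{"source": "literature", "finding": f"articles about {f}"} for f in found]


class CentralHost:
    """Orchestrator that calls each agent in turn and accumulates evidence."""

    def __init__(self, agents: Dict[str, Agent]):
        self.agents = agents

    def run(self, clinical_text: str) -> MemoryBank:
        memory = MemoryBank(raw_input=clinical_text)
        for agent in self.agents.values():
            memory.evidence.extend(agent(memory))
        return memory


host = CentralHost({"phenotype": phenotype_agent, "literature": literature_agent})
memory = host.run("Patient with seizures and ataxia")
```

Keeping agents as plain callables over a shared memory object is one simple way to get the plug-and-play behavior described in the text: adding a tool means registering one more function.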
2.2 Diagnostic Reasoning Pipeline
DeepRare alternates between two phases:
- Information Collection: Maps clinical text to HPO terms, retrieves relevant literature and cases, and applies phenotype and genotype analyses to generate preliminary disease hypotheses.
- Self-Reflective Diagnosis: Standardizes disease candidates to ontological IDs, performs further literature retrieval, self-reflects to prune unsupported hypotheses, and generates transparent reasoning chains for each final diagnosis [Sec. 2].
The system's reasoning is formalized as a score-weighted ensemble over candidate diagnoses, with constraints on the ensemble weights. Candidates supported by fewer than a minimum number of citations in the collected evidence are removed [Sec. 3].
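One way to picture a score-weighted ensemble with citation-based pruning is sketched below. The specific constraints enforced here (non-negative weights that sum to one), the function name, the `min_citations` parameter, and all scores and reference labels are assumptions for illustration and are not taken from the paper.

```python
def ensemble_rank(scores_by_source, weights, citations, min_citations=1):
    """Rank candidates by a weighted sum of per-source scores.

    Candidates with fewer than `min_citations` supporting references are dropped.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    candidates = {d for scores in scores_by_source.values() for d in scores}
    combined = {}
    for disease in candidates:
        if len(citations.get(disease, [])) < min_citations:
            continue  # prune hypotheses that lack supporting evidence
        combined[disease] = sum(
            w * scores_by_source[source].get(disease, 0.0)
            for source, w in weights.items()
        )
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy inputs: disease labels, scores, and reference IDs are placeholders.
scores = {
    "phenotype": {"disease_A": 0.8, "disease_B": 0.6, "disease_C": 0.9},
    "literature": {"disease_A": 0.4, "disease_B": 0.9},
}
weights = {"phenotype": 0.5, "literature": 0.5}
citations = {"disease_A": ["ref-1"], "disease_B": ["ref-2", "ref-3"]}
ranking = ensemble_rank(scores, weights, citations)
```

In this toy run `disease_C` has the highest phenotype score but is removed because nothing cites it, which mirrors the idea that every surviving hypothesis must be traceable to evidence.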
2.3 Tool Integration and Knowledge Management
DeepRare integrates over 40 tools, including phenotype normalizers (PhenoBrain, PubCaseFinder, BioLORD), genotype pipelines (Exomiser, SIFT, PolyPhen, MutationTaster), search engines (web and specialized), and knowledge bases (OMIM, Orphanet, HPO). The ensemble process orchestrated by the Central Host allows up-to-date, verifiable insights grounded in both contemporary and archival evidence [Sec. 4].
2.4 Performance and Validation
System performance is validated across eight datasets (6,401 cases, 2,919 diseases), with Recall@1, @3, and @5 as key metrics:
| Method | HPO-Only Recall@1 (%) | Multi-Modal Recall@1 (%) |
|---|---|---|
| Top Traditional | 26.8 | - |
| Best General LLM | 34.8 | - |
| Best Reasoning LLM | 33.4 | - |
| DeepRare (GPT-4o) | 55.6 | - |
| DeepRare (DeepSeek) | 57.2 | 70.6 (vs Exomiser: 53.2) |
LLM-based ranking was validated against eight senior physicians using Pearson correlation. Reasoning chain accuracy was confirmed by clinical experts at 95.4% agreement [Sec. 5].
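Recall@k, the headline metric in the table above, counts a case as a hit when the true diagnosis appears among the top k ranked predictions. A minimal sketch follows; the function name and the toy labels are illustrative and unrelated to the benchmark data.

```python
def recall_at_k(ranked_predictions, ground_truth, k):
    """Fraction of cases whose true diagnosis is among the top-k predictions."""
    hits = sum(
        1
        for predictions, truth in zip(ranked_predictions, ground_truth)
        if truth in predictions[:k]
    )
    return hits / len(ground_truth)


# Four toy cases, each with a ranked list of candidate labels.
predictions = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["C", "B", "A"],
    ["D", "E", "F"],
]
truth = ["A", "A", "A", "A"]
r1 = recall_at_k(predictions, truth, k=1)
r3 = recall_at_k(predictions, truth, k=3)
```

Recall@k can only stay the same or grow as k increases, which is why Recall@1 is the strictest of the three reported values.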
2.5 Web Application and Scalability
The DeepRare web platform (http://raredx.cn/doctor) offers a five-stage guided diagnostic workflow, supporting asynchronous, parallelized case analysis and sub-minute turnaround. Data locality guarantees privacy compliance for in-house cohorts [Sec. 6].
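Asynchronous, parallelized case analysis of this kind is often built on a bounded pool of concurrent tasks. The sketch below uses Python's `asyncio` to illustrate the pattern only; `analyze_case`, its simulated delay, and the `max_concurrent` limit are hypothetical and say nothing about how the actual platform is implemented.

```python
import asyncio


async def analyze_case(case_id: str, delay: float = 0.01) -> dict:
    # Hypothetical placeholder for one full diagnostic run on a single case.
    await asyncio.sleep(delay)
    return {"case_id": case_id, "status": "done"}


async def analyze_batch(case_ids, max_concurrent: int = 4):
    """Run case analyses concurrently, capping the number in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(case_id):
        async with semaphore:
            return await analyze_case(case_id)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(c) for c in case_ids))


results = asyncio.run(analyze_batch(["case-1", "case-2", "case-3"]))
```

Capping concurrency with a semaphore keeps a large batch from overwhelming rate-limited external services while still overlapping the waiting time of individual cases.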
3. Comparative Insights and Context
The two DeepRare models occupy disparate scientific domains: attention modeling and clinical decision support, respectively. Both emphasize:
- Modularity: Both model families allow for plug-and-play specialization (CNN backbones in DR21; tool agent addition in diagnostic DeepRare).
- Transparency: Rarity-thresholding in DR21 provides saliency map introspection; citation-linked reasoning chains in diagnostic DeepRare ensure traceability.
- Unsupervised or Non-task-specific Use of Pretraining: DR21 relies on fixed ImageNet weights; diagnostic DeepRare is powered by pretrained LLMs augmented with real-time data retrieval.
A plausible implication is that the "DeepRare" label has become associated with architectures prioritizing transparency and modular evidence fusion in domains where rare events are critical.
4. Implementation and Open Resources
DR21 is available in Python + Keras and supports rapid (<1 s/image) inference on CPU-only infrastructure [(Kong et al., 2021), Abstract]. DeepRare for rare disease diagnosis is accessible as a web deployment, with core system components outlined in (Zhao et al., 25 Jun 2025). Both codebases and APIs are positioned for extension and integration with alternative model backbones or data sources.
5. Limitations and Future Directions
DR21, despite its explainability, remains limited by the feature diversity present in its CNN backbone and by its histogram-based rarity estimation. Diagnostic DeepRare's reliance on LLMs and external APIs makes it sensitive to the coverage of the indexed knowledge sources and to the recall of the underlying LLMs.
Potential extensions for visual attention include backbone expansion to modern architectures (ResNet, EfficientNet) and dynamic rarity binning. For rare disease diagnosis, expansion to broader multi-modalities and continual learning from fresh clinical cases represent natural avenues. These directions suggest a continued emphasis on modularity, explainability, and the principled integration of external knowledge sources for rare-event modeling (Kong et al., 2021, Zhao et al., 25 Jun 2025).