CTRATE-IR: Anatomy-Aware CT Retrieval Dataset
- CTRATE-IR is a multi-granularity dataset constructed by mining radiology reports to annotate chest CT images with detailed, anatomy-specific descriptors.
- It uses RadGraph-XL entity extraction and rule-based sentence linking to extract, link, and hierarchically aggregate regional findings from 25,692 CT volume–report pairs, producing over 1.32×10^11 similarity scores.
- The framework supports both global and anatomy-conditioned retrieval tasks, enhancing diagnostic comparability and providing robust benchmarks for image similarity evaluation.
CTRATE-IR refers principally to an anatomy-aware, large-scale dataset for conditional medical image retrieval based on chest computed tomography (CT), constructed via automatic radiology report mining as described in "RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining" (Zhang et al., 6 Mar 2025). The acronym also appears in an astronomical context as shorthand for "cosmic star-formation and black-hole accretion histories" measured in the infrared, as introduced in the SPICA study of cosmic evolution (Spinoglio et al., 2017). Both uses denote comprehensive, structure-conditioned measurements or annotations: in biomedical informatics for medical image–image similarity, and in extragalactic astrophysics for the obscured history of key cosmic processes. This entry treats the principal data structure and methodology of CTRATE-IR in the biomedical domain, with cross-references to the astronomical term.
1. Construction of the CTRATE-IR Dataset
CTRATE-IR is derived from the CT-RATE chest CT corpus, comprising 25,692 non-contrast chest CT volume–radiology report pairs (Zhang et al., 6 Mar 2025). Each report includes a "Findings" section written in free text that details observations on regional (anatomy-specific) pathology. Dataset construction combines linguistic and anatomical decomposition:
- Anatomical entity extraction: RadGraph-XL is used to identify 90 high-frequency anatomical entities (e.g., "lungs," "aorta"), including the resolution of synonymy and explicit encoding of parent–child hierarchies (e.g., "lungs" encompasses "left lung" and "right lung").
- Report–region linking: The Findings section is split at sentence boundaries, and each sentence is algorithmically linked to all anatomies mentioned therein via rule-based string matching.
- Regional aggregation: Substructure findings are recursively aggregated into their anatomical parents, producing, per CT volume, a set of regional textual descriptors for each anatomy Q.
This approach enables the construction of anatomy-conditioned, multi-granularity annotation for each image, ultimately permitting fine-grained relevance judgments across 1.32 × 10^11 image–image pairs.
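The three construction steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the tiny ontology (`PARENT`, `SYNONYMS`, `ANATOMIES`), the regex sentence splitter, and all function names are assumptions made for the example, whereas the source uses RadGraph-XL to obtain 90 entities.

```python
import re
from collections import defaultdict

# Hypothetical miniature ontology (the real dataset has 90 entities).
PARENT = {"left lung": "lungs", "right lung": "lungs"}      # child -> parent
SYNONYMS = {"lung": "lungs", "thoracic aorta": "aorta"}     # synonym -> canonical
ANATOMIES = {"lungs", "left lung", "right lung", "aorta"}   # canonical names


def split_sentences(findings):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", findings) if s.strip()]


def mentioned_anatomies(sentence):
    """Rule-based string matching against canonical names and synonyms."""
    text = sentence.lower()
    found = set()
    for term in ANATOMIES | set(SYNONYMS):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            found.add(SYNONYMS.get(term, term))
    return found


def regional_findings(findings):
    """Link each sentence to its anatomies, then propagate to all ancestors."""
    regions = defaultdict(list)
    for sentence in split_sentences(findings):
        for anatomy in mentioned_anatomies(sentence):
            node = anatomy
            while node is not None:
                if sentence not in regions[node]:
                    regions[node].append(sentence)
                node = PARENT.get(node)  # None once the root is reached
    return dict(regions)


report = "Nodule in the left lung. The aorta is normal in caliber. No pleural effusion."
regions = regional_findings(report)
```

In this sketch a sentence about the left lung is also filed under "lungs", which mirrors the parent–child aggregation described above; sentences that mention no known anatomy are simply not linked.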
2. Automatic Similarity Ordering and Proxy Labeling
The core innovation of CTRATE-IR is its scalable, fully automatic generation of multi-granularity similarity orderings for image retrieval, derived from dense natural-language annotations. For a given query image and anatomy Q, the system performs these steps:
- Regional finding extraction: Retrieve the finding snippets for Q from the query's report and from the report of each candidate image.
- Textual similarity computation: Compute the RaTEScore between the two snippets to serve as a proxy measure of anatomy-specific similarity between the query and the candidate.
- Consistency assumption: The image–image similarity ranking on anatomy Q is defined to be identical to the report–report similarity ranking on the same region.
- Global ranking (no anatomy condition) uses the full report.
- Region-specific ranking conditions explicitly on Q.
In this framework, "ground-truth" relevance for any image–image pair is induced from the associated regional text similarity.
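The proxy-labeling idea can be illustrated with a short sketch. The similarity function below is a token-overlap stand-in, not RaTEScore; the `GLOBAL` key, the data layout, and the choice to skip candidates that have no text for the requested anatomy are all assumptions made for this example.

```python
GLOBAL = "__global__"  # hypothetical key under which the full report is stored


def token_jaccard(a, b):
    """Stand-in text similarity in [0, 1]. This is NOT RaTEScore."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def rank_candidates(query_id, region_text, anatomy=GLOBAL, sim=token_jaccard):
    """Order candidate images by text similarity to the query on one anatomy.

    region_text maps image id -> {anatomy -> finding text}. The returned
    ordering is taken as the image-image ordering (consistency assumption).
    """
    query = region_text[query_id].get(anatomy)
    if query is None:
        return []
    scored = []
    for cand_id, regions in region_text.items():
        if cand_id == query_id or anatomy not in regions:
            continue  # assumption: candidates without this region are skipped
        scored.append((cand_id, sim(query, regions[anatomy])))
    # Highest similarity first; ties broken by id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


reports = {
    "ct_a": {GLOBAL: "nodule in left lung aorta normal", "lungs": "nodule in left lung"},
    "ct_b": {GLOBAL: "nodule in right lung aorta dilated", "lungs": "nodule in right lung"},
    "ct_c": {GLOBAL: "lungs clear aorta normal", "lungs": "lungs clear"},
}
lung_ranking = rank_candidates("ct_a", reports, anatomy="lungs")
global_ranking = rank_candidates("ct_a", reports)
```

Passing a different `sim` callable is the intended extension point: any report-level metric that returns a score per text pair could be substituted without changing the ranking logic.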
3. Dataset Statistics and Splits
| Parameter | Value | Notes |
|---|---|---|
| Total CT volumes | 25,692 | Split per official CT-RATE definitions |
| Anatomical entities | 90 | Hierarchized, synonym-resolved |
| Regional findings | 2,582,477 | Snippets linked to anatomical labels |
| Fine-grained similarity | ≈1.32 × 10^11 scores | Comprehensive pairwise annotation |
Each CT study is annotated at multiple anatomical levels, with train/val/test partitions following the CT-RATE release.
4. Retrieval Tasks and Evaluation Metrics
CTRATE-IR supports several retrieval workflows:
- Image→Image (global): Retrieve CT volumes based on full-image similarity to the query volume.
- Image→Report: Retrieve reports ranked by similarity to the query image.
- Anatomy-conditioned Image→Image: For a query image and anatomy Q, rank all volumes by similarity of region Q.
Evaluation metrics are standardized:
- Recall@K: Proportion of relevant items, as determined by the text-similarity proxy, among the top K.
- Mean Average Precision (mAP): Averaged over queries, AP combines precision at each cutoff, weighted by exact relevance.
- DCG@K / NDCG@K: Assessments accounting for the graded relevance of ranked items.
Conditional retrieval metrics use the region-level RaTEScore for anatomy Q as anatomy-specific ground truth; global metrics use the RaTEScore computed on the full report.
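For reference, the three metric families can be written compactly as below. These are the standard textbook definitions, assuming binary relevance for Recall@K and AP and graded gains (for example, proxy similarity scores) for DCG/NDCG; how the benchmark thresholds relevance is not specified here, so that choice is left to the caller.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant set that appears in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / len(relevant_ids)


def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@rank taken at each rank holding a relevant item."""
    if not relevant_ids:
        return 0.0
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


def dcg_at_k(gains, k):
    """Discounted cumulative gain; gains are listed in ranked order."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


ranked = ["b", "c", "d", "e"]          # system output for one query
relevant = {"b", "d"}                  # hypothetical relevant set
gains = [0.9, 0.1, 0.5]                # hypothetical graded relevance per rank
ap = average_precision(ranked, relevant)
```

mAP is then the mean of `average_precision` over all queries.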
5. Experimental Results and Comparative Performance
RadIR-ChestCT, a dual-stage retrieval architecture, was evaluated on CTRATE-IR. Stage 1 handles global retrieval using a ViT vision encoder and a BERT text encoder; stage 2 fuses the anatomy input for conditional retrieval. Both stages are trained with masked InfoNCE losses and RaTEScore-based targets.
Key results:
- Global Image→Image (RadIR-ChestCT): R@5 = 20.75%, R@100 = 72.80%, NDCG@5 = 74.60%.
- Global Image→Report: Significant gains over CT-CLIP; R@5 = 6.65%, R@100 = 52.91%.
- Anatomy-conditioned retrieval: RadIR-ChestCT outperforms baselines, with average R@3 = 55.23% (vs. 43.85% for CT-CLIP). Gains are more pronounced for rare anatomies, e.g., gallbladder (R@5: 42.70% vs. 25.84%).
The computational infrastructure includes on-the-fly regional similarity matrix computation and successive global-to-conditional training.
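The exact form of the masked InfoNCE loss is not given in this entry. The sketch below shows one plausible reading, stated as an assumption: a standard InfoNCE objective in which off-diagonal items whose report similarity to the anchor is high are masked out of the denominator, so that near-duplicates are not penalized as negatives. The function name, the `threshold` parameter, and the plain-list inputs are hypothetical.

```python
import math


def masked_info_nce(logits, text_sim, threshold=0.8):
    """Mean InfoNCE loss over a batch, ignoring likely false negatives.

    logits[i][j]   : scaled embedding similarity between anchor i and item j
    text_sim[i][j] : report-level similarity (a RaTEScore-like value)
    The positive for anchor i is item i. An off-diagonal item j is dropped
    from the denominator when text_sim[i][j] >= threshold (assumed rule).
    """
    n = len(logits)
    total = 0.0
    for i in range(n):
        kept = [j for j in range(n) if j == i or text_sim[i][j] < threshold]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits[i][j] for j in kept)
        log_denom = m + math.log(sum(math.exp(logits[i][j] - m) for j in kept))
        total += log_denom - logits[i][i]
    return total / n


uniform_logits = [[0.0, 0.0], [0.0, 0.0]]
loss_unmasked = masked_info_nce(uniform_logits, [[1.0, 0.0], [0.0, 1.0]])
loss_masked = masked_info_nce(uniform_logits, [[1.0, 1.0], [1.0, 1.0]])
```

With uniform logits and no masking, the loss reduces to the log of the batch size; masking every off-diagonal item drives it to zero, which makes the effect of the mask easy to check in isolation.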
6. Methodological Significance and Domain Context
CTRATE-IR represents a scalable methodology for structured medical image retrieval dataset construction, addressing the chronic shortage of high-quality image–image similarity datasets in radiology. By leveraging dense, semi-structured report data and establishing proxy similarity ground-truth at multiple anatomical scales, the dataset enables the benchmarking and advancement of anatomy-aware retrieval algorithms.
The consistent use of linguistically grounded, region-labeled relevance, without manual pairwise labeling, allows the definition of fine-grained retrieval tasks that align with clinical diagnostic patterns, such as comparing images for pathologies confined to distinct anatomical regions.
7. "CTRATE-IR" in the Extragalactic Context
The term "CTRATE-IR" is also used as an abbreviated reference to the cosmic star-formation and black-hole accretion histories as measured in the infrared, particularly in the context of planned SPICA mission spectroscopic surveys (Spinoglio et al., 2017). In this usage, CTRATE-IR designates the unbiased, extinction-free history of these astrophysical rates; it is distinguished from the biomedical dataset by context and scientific field.
References
- RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining (Zhang et al., 6 Mar 2025)
- Galaxy evolution studies with the SPace IR telescope for Cosmology and Astrophysics (SPICA): the power of IR spectroscopy (Spinoglio et al., 2017)