Open-domain CIR datasets with multiple ground-truth targets and reduced false negatives

Construct an open-domain composed image retrieval dataset that provides multiple ground-truth target images per query while mitigating false negatives, in order to better reflect real-world many-to-many query–target correspondences and support reliable evaluation of composed image retrieval models.

Background

Most existing composed image retrieval (CIR) datasets are limited to specific domains (e.g., fashion or birds) and typically annotate only one target image per query, despite the fact that multiple images can satisfy the same multimodal query. This one-to-one annotation strategy introduces false negatives that can mislead training and evaluation.

Although the CIRR dataset attempts to reduce false negatives via subset-based annotation, it still provides only a single ground-truth target per query. CIRCO includes multiple targets per query but is designed primarily for zero-shot evaluation and remains relatively limited in scope for broader supervised CIR benchmarking.

The authors identify a key unresolved challenge: building open-domain datasets that both offer multiple ground-truth targets per query and explicitly mitigate false negatives. Such datasets would enable more realistic and fair evaluation of CIR methods.
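To make the false-negative issue concrete, the following is a minimal sketch of how the same ranking can score differently under single-target and multi-target annotation. The image identifiers, the ranking, and the `recall_at_k` helper are hypothetical and chosen only for illustration. They are not taken from the survey or from any specific dataset.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any annotated target appears in the top-k results, else 0.0."""
    top_k = ranked_ids[:k]
    return 1.0 if any(img in relevant_ids for img in top_k) else 0.0


# Hypothetical model output for one composed query, best match first.
ranked = ["img_b", "img_c", "img_a", "img_d"]

# Assumption: both img_a and img_b truly satisfy the query,
# but a one-to-one dataset annotates only img_a.
single_target = {"img_a"}
multi_target = {"img_a", "img_b"}

single_score = recall_at_k(ranked, single_target, k=1)
multi_score = recall_at_k(ranked, multi_target, k=1)

print(single_score, multi_score)
```

Under the single-target annotation, the valid top result `img_b` is counted as a miss (a false negative), whereas the multi-target annotation credits it.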

References

Creating open-domain CIR datasets with multiple ground-truth target images while mitigating false negatives remains an open challenge requiring further exploration.

A Comprehensive Survey on Composed Image Retrieval (2502.18495 - Song et al., 19 Feb 2025) in Section 6.1, Supervised Composed Image Retrieval – Benchmark Dataset Construction