- The paper introduces SupRank, a surrogate loss method that robustly optimizes rank losses in deep image retrieval.
- It employs a decomposability objective to align batch and global rankings, enhancing AP and R@k metrics.
- Experiments on benchmarks like SOP and iNaturalist demonstrate state-of-the-art performance and improved hierarchical retrieval.
Optimization of Rank Losses for Image Retrieval
The paper "Optimization of Rank Losses for Image Retrieval" addresses two fundamental challenges in optimizing rank losses for deep neural networks: non-differentiability and non-decomposability. The authors introduce SupRank, a smooth surrogate for ranking operators that makes rank losses amenable to stochastic gradient descent (SGD). Because SupRank upper-bounds the rank losses it replaces, training remains robust despite the inherent non-differentiability of ranking metrics.
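To make the core idea concrete, here is a rough sketch of how a smooth rank surrogate enables gradient-based training: the hard step function in the rank is relaxed to a sigmoid with temperature `tau`. This is only an illustration of the general mechanism, not the authors' exact SupRank formulation (which is additionally constructed as an upper bound on the true loss); `smooth_step`, `tau`, and the toy scores are illustrative choices.

```python
import math

def heaviside(t: float) -> float:
    """Exact (non-differentiable) step used by the true rank."""
    return 1.0 if t > 0 else 0.0

def smooth_step(t: float, tau: float = 0.01) -> float:
    """Sigmoid relaxation of the step; a differentiable stand-in."""
    return 1.0 / (1.0 + math.exp(-t / tau))

def approx_rank(scores, i, step=smooth_step):
    """Approximate 1-based rank of item i among all items."""
    return 1.0 + sum(step(scores[j] - scores[i])
                     for j in range(len(scores)) if j != i)

def smooth_ap(scores, labels):
    """AP written in terms of (approximate) ranks:
    AP = (1/|P|) * sum over positives i of rank+(i) / rank(i)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    ap = 0.0
    for i in pos:
        rank_all = approx_rank(scores, i)
        rank_pos = 1.0 + sum(smooth_step(scores[j] - scores[i])
                             for j in pos if j != i)
        ap += rank_pos / rank_all
    return ap / len(pos)
```

With well-separated scores the relaxation closely matches the exact metric; with `heaviside` as the step, `approx_rank` recovers the true rank.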
Contributions and Methodology
The paper contributes a unified framework for optimizing rank losses in image retrieval, specifically targeting Average Precision (AP) and Recall at k (R@k). The approach mitigates the decomposability gap, i.e., the discrepancy between batch-wise and global rank loss computations. This gap is reduced through a decomposability objective that calibrates scores across batches, allowing the surrogate loss to approximate the true ranking metric over the entire dataset.
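The calibration idea behind a decomposability objective can be sketched as a hinge penalty that pins positive scores above one threshold and negative scores below another, so that rankings computed on any batch stay consistent with the global ranking. The thresholds `beta_pos` and `beta_neg` below are hypothetical values for illustration, not the paper's.

```python
def decomposability_loss(scores, labels, beta_pos=0.7, beta_neg=0.3):
    """Hinge-style calibration term: penalize positives below beta_pos
    and negatives above beta_neg, averaged over the batch."""
    loss = 0.0
    for s, y in zip(scores, labels):
        if y == 1:
            loss += max(0.0, beta_pos - s)   # positive not high enough
        else:
            loss += max(0.0, s - beta_neg)   # negative not low enough
    return loss / len(scores)
```

Once every positive scores above `beta_pos` and every negative below `beta_neg`, the penalty vanishes and batch-level orderings agree with the dataset-level ordering regardless of how items are grouped into batches.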
The authors apply their framework to both standard and hierarchical image retrieval tasks. In the hierarchical context, they propose a novel metric called Hierarchical Average Precision (HAP), extending the AP to accommodate non-binary relevance labels. This extension allows for a more nuanced evaluation of image retrieval systems, considering varying degrees of similarity between images beyond binary labels.
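In the spirit of this extension, a graded-relevance AP can be computed by replacing the binary count of higher-ranked positives with a relevance-weighted hierarchical rank. This is a sketch under assumed definitions (the paper's exact HAP normalization may differ); it reduces to standard AP when relevance is binary.

```python
def rank(scores, k):
    """Exact 1-based rank of item k by descending score."""
    return 1 + sum(1 for j in range(len(scores))
                   if j != k and scores[j] > scores[k])

def h_rank(scores, rel, k):
    """Hierarchical rank: each higher-scored item j contributes
    min(rel[j], rel[k]) instead of a binary 1."""
    return rel[k] + sum(min(rel[j], rel[k])
                        for j in range(len(scores))
                        if j != k and scores[j] > scores[k])

def hierarchical_ap(scores, rel):
    """Graded-relevance AP: averages h_rank/rank over items with
    rel > 0, normalized by total relevance."""
    pos = [k for k, r in enumerate(rel) if r > 0]
    total = sum(rel[k] for k in pos)
    return sum(h_rank(scores, rel, k) / rank(scores, k) for k in pos) / total
```

A ranking that orders items by decreasing relevance attains the maximum value of 1, while errors between items of similar relevance are penalized less than confusions across distant relevance levels.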
Key Numerical Results
The effectiveness of the proposed method is validated through extensive experiments on several benchmark datasets, including SOP, iNaturalist, and the newly introduced hierarchical landmark dataset, H-GLDv2. The framework achieves state-of-the-art performance on these datasets, demonstrating significant improvements in both fine-grained and hierarchical retrieval metrics.
For instance, ROADMAP, the framework's instantiation for AP optimization, outperforms existing AP surrogates such as Smooth-AP and Fast-AP by 1.0-1.5 points on retrieval metrics. In hierarchical retrieval, HAPPIER, the framework's hierarchical instantiation, improves significantly over existing hierarchical methods, including CSL, particularly on metrics sensitive to hierarchical errors.
Implications and Future Directions
The implications of this research are substantial for both practical applications and theoretical advancements in AI. The proposed framework offers a viable solution for training deep learning models on ranking tasks, reflecting the growing importance of such models in applications ranging from image retrieval to other domains requiring ordered outputs.
The introduction of a hierarchical landmark retrieval dataset, H-GLDv2, also opens avenues for future research. This dataset can facilitate further exploration into hierarchical learning methods, potentially benefiting a variety of image-based applications such as autonomous driving and cultural heritage digitization.
Moreover, the paper suggests that future developments could explore adaptive hierarchical relevance functions and further refinements in decomposability objectives. These advancements could lead to improved training efficiency and model robustness, enhancing the performance of image retrieval systems in increasingly complex and large-scale environments.
In summary, the paper provides a comprehensive approach to optimizing rank losses for image retrieval, addressing long-standing challenges in the field and paving the way for more nuanced and robust image retrieval systems.