Sampling Matters in Deep Embedding Learning (1706.07567v2)

Published 23 Jun 2017 in cs.CV

Abstract: Deep embeddings answer one simple question: How similar are two images? Learning these embeddings is the bedrock of verification, zero-shot learning, and visual search. The most prominent approaches optimize a deep convolutional network with a suitable loss function, such as contrastive loss or triplet loss. While a rich line of work focuses solely on the loss functions, we show in this paper that selecting training examples plays an equally important role. We propose distance weighted sampling, which selects more informative and stable examples than traditional approaches. In addition, we show that a simple margin based loss is sufficient to outperform all other loss functions. We evaluate our approach on the Stanford Online Products, CARS196, and the CUB200-2011 datasets for image retrieval and clustering, and on the LFW dataset for face verification. Our method achieves state-of-the-art performance on all of them.

Citations (898)

Summary

  • The paper demonstrates that sampling strategies, notably distance weighted sampling, play a critical role in enhancing deep embedding performance.
  • The paper introduces a margin-based loss that relaxes strict zero-distance requirements, ensuring robust separation of positive and negative samples.
  • The novel methods achieve state-of-the-art Recall@1 and accuracy on datasets such as Stanford Online Products and LFW, showcasing strong practical impact.

Sampling Matters in Deep Embedding Learning

This paper presents a thorough investigation into the impact of sampling strategies on deep embedding learning. Deep embeddings are fundamental in many computer vision tasks, including verification, zero-shot learning, and visual search. Traditionally, most research has focused on the development and refinement of loss functions, such as contrastive loss and triplet loss. However, this paper highlights that the selection of training examples—referred to as sampling—plays an equally critical role.

Main Contributions

  1. Distance Weighted Sampling (DWS):
    • The authors introduce a novel sampling strategy called Distance Weighted Sampling. Unlike traditional approaches that either randomly select samples or use semi-hard negative mining, DWS assigns selection probabilities based on the distances between examples. The objective is to ensure a more informative and diverse set of examples for training, thereby stabilizing and improving the learning process.
  2. Margin Based Loss:
    • A new margin-based loss function is proposed. This loss function relaxes the constraints imposed by the traditional contrastive loss, where all positive pairs are forced to have zero distance, by only requiring that positive samples be separated by a margin from negative samples. The margin-based loss depends on a learned boundary parameter, β, making it more robust and adaptable.
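The margin-based loss can be written as (alpha + y * (D - beta))_+, where y is +1 for a positive pair and -1 for a negative pair, D is the embedding distance, beta is the learned boundary, and alpha is the margin. A minimal Python sketch (the function name and default margin value are illustrative, not from the paper):

```python
def margin_loss(d, y, beta, alpha=0.2):
    """Margin-based loss: (alpha + y * (d - beta))_+.

    d     : Euclidean distance between a pair of embeddings
    y     : +1 for a positive pair, -1 for a negative pair
    beta  : learned boundary separating positive from negative distances
    alpha : margin enforced on either side of the boundary
    """
    return max(0.0, alpha + y * (d - beta))
```

A positive pair is penalized only when its distance exceeds beta - alpha, and a negative pair only when its distance falls below beta + alpha, so positive pairs are no longer driven toward zero distance.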

Theoretical Insights

The paper provides theoretical insights into why sampling strategies significantly impact the performance of embedding models. The core idea is that in high-dimensional spaces, random negative samples tend to be equidistant from the anchor sample. This results in many samples being non-informative for training. Distance Weighted Sampling overcomes this by ensuring a diverse spread of sampling distances, reducing the bias towards less informative samples, and ensuring lower variance in gradient updates.
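To make this concrete: for unit-norm embeddings in n dimensions, the paper notes that the pairwise-distance density behaves as q(d) proportional to d^(n-2) * (1 - d^2/4)^((n-3)/2), which concentrates near sqrt(2) in high dimensions. Distance weighted sampling draws negatives with probability proportional to the inverse of q(d). A NumPy sketch of the weighting step (the function name is illustrative, and for simplicity it clips distances for numerical stability rather than clipping the weights as the paper does):

```python
import numpy as np

def distance_weights(dists, dim, cutoff=0.5):
    """Selection probabilities proportional to q(d)^{-1} for negative examples.

    dists  : 1-D array of anchor-to-negative distances (unit-norm embeddings)
    dim    : embedding dimension n
    cutoff : lower clip on distances, avoiding huge weights for tiny d
    """
    d = np.clip(dists, cutoff, 1.999)  # distances on the unit sphere lie in (0, 2)
    # log q(d) up to an additive constant
    log_q = (dim - 2) * np.log(d) + ((dim - 3) / 2.0) * np.log(1.0 - d ** 2 / 4.0)
    w = np.exp(-log_q)                 # proportional to 1 / q(d)
    return w / w.sum()                 # normalized selection probabilities
```

A negative index can then be drawn with np.random.choice(len(dists), p=weights). Distances near sqrt(2), where q(d) peaks, are down-weighted, so the selected negatives cover a wide range of distances instead of clustering at the uninformative mode.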

Strong Numerical Results

The proposed methods are evaluated on four datasets: Stanford Online Products, CARS196, CUB200-2011, and LFW. The main findings are:

  • Stanford Online Products: The margin-based loss with distance-weighted sampling achieved the highest Recall@1 of 72.7%, significantly outperforming other state-of-the-art methods.
  • CARS196 and CUB200-2011: Consistent improvements were observed, with the margin-based loss achieving Recall@1 scores of 86.9% and 63.9%, respectively.
  • LFW: For face verification, the proposed method achieved an accuracy of 98.37%, surpassing other methods trained on the same CASIA-WebFace dataset.

Implications and Future Work

Practical Implications

Practically, the combined use of distance weighted sampling and margin-based loss can be applied to a wide range of image retrieval, clustering, and verification tasks. The improved performance metrics suggest that these methods make deep embedding models more robust and effective, especially in tasks with high intra-class variability.
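As a rough illustration of how the two pieces fit together, the following self-contained sketch scores one batch: for each anchor it draws one distance weighted negative and one random positive, then applies the margin loss to both pairs. All names, defaults, and the one-negative-per-anchor choice are illustrative simplifications, not the paper's exact training loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_margin_loss(embeddings, labels, beta=1.0, alpha=0.2, cutoff=0.5):
    """Average margin loss over a batch, with distance weighted negatives.

    embeddings : (N, dim) array, assumed L2-normalized
    labels     : (N,) integer class labels
    """
    n, dim = embeddings.shape
    # pairwise Euclidean distance matrix
    dists = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    total, pairs = 0.0, 0
    for a in range(n):
        pos = np.where((labels == labels[a]) & (np.arange(n) != a))[0]
        neg = np.where(labels != labels[a])[0]
        if pos.size == 0 or neg.size == 0:
            continue
        # distance weighted choice of one negative for this anchor
        d = np.clip(dists[a, neg], cutoff, 1.999)
        log_q = (dim - 2) * np.log(d) + ((dim - 3) / 2.0) * np.log(1.0 - d ** 2 / 4.0)
        w = np.exp(-log_q)
        j = neg[rng.choice(neg.size, p=w / w.sum())]
        i = pos[rng.choice(pos.size)]
        # margin loss on the positive pair (y = +1) and sampled negative pair (y = -1)
        total += max(0.0, alpha + (dists[a, i] - beta))
        total += max(0.0, alpha - (dists[a, j] - beta))
        pairs += 2
    return total / max(pairs, 1)
```

In a real training loop the embeddings would come from a convolutional network and this quantity would be backpropagated, with beta treated as a learnable parameter.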

Theoretical Implications

Theoretically, this work shifts some focus from solely improving loss functions to also considering how sample selection can alter learning dynamics. This opens up new avenues for research into other sampling methods and their theoretical underpinnings.

Possible Future Developments

Future research could explore:

  1. Generalizing distance weighted sampling and loss formulations to other modalities beyond images, such as text or multimodal data.
  2. Investigating the impact of different network architectures when combined with these sampling strategies.
  3. Extending the algorithm to handle out-of-distribution samples, ensuring better robustness in real-world applications.
  4. Exploring adaptive sampling strategies that dynamically adjust the sampling probability during training.

Conclusion

In conclusion, this paper convincingly argues that sampling strategies are as crucial as the choice of loss function in deep embedding learning. By introducing Distance Weighted Sampling and a robust margin-based loss function, the authors provide a methodological enhancement that outperforms existing state-of-the-art techniques across several benchmarks. This work not only offers practical tools for improving embedding models but also paves the way for a more nuanced understanding of the role of sampling in deep learning.