
Deep Metric Learning for Computer Vision: A Brief Overview

Published 1 Dec 2023 in cs.CV, cs.AI, cs.IR, and cs.LG | (2312.10046v1)

Abstract: Objective functions that optimize deep neural networks play a vital role in creating an enhanced feature representation of the input data. Although cross-entropy-based loss formulations have been extensively used in a variety of supervised deep-learning applications, these methods tend to be less adequate when there is large intra-class variance and low inter-class variance in input data distribution. Deep Metric Learning seeks to develop methods that aim to measure the similarity between data samples by learning a representation function that maps these data samples into a representative embedding space. It leverages carefully designed sampling strategies and loss functions that aid in optimizing the generation of a discriminative embedding space even for distributions having low inter-class and high intra-class variances. In this chapter, we will provide an overview of recent progress in this area and discuss state-of-the-art Deep Metric Learning approaches.
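To make the idea of a loss that shapes a discriminative embedding space concrete, here is a minimal sketch of a triplet margin loss in plain Python. It is one common deep metric learning objective, chosen for illustration, and not necessarily the formulation the chapter uses. The function names, the toy 2-D embeddings, and the margin value are all assumptions; in practice the embeddings would come from a neural network.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin`; otherwise it is positive.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical embeddings (illustrative values only).
anchor = [0.0, 0.0]
same_class = [3.0, 4.0]    # distance 5 from the anchor
other_class = [6.0, 8.0]   # distance 10 from the anchor

# Well-separated triplet: 5 - 10 + 1 < 0, so the hinge clips to zero.
satisfied = triplet_margin_loss(anchor, same_class, other_class)

# Swapping the roles violates the margin: 10 - 5 + 1 = 6.
violated = triplet_margin_loss(anchor, other_class, same_class)
```

Minimizing this quantity over many sampled triplets pulls same-class samples together and pushes different-class samples apart, which is why the choice of sampling strategy matters: triplets that already satisfy the margin contribute no gradient.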
