Deep Metric Learning with Soft Orthogonal Proxies (2306.13055v1)

Published 22 Jun 2023 in cs.CV

Abstract: Deep Metric Learning (DML) models rely on strong representations and similarity-based measures with specific loss functions. Proxy-based losses have shown great performance compared to pair-based losses in terms of convergence speed. However, proxies assigned to different classes may end up closely located in the embedding space, making it hard to distinguish between positive and negative items. Alternatively, they may become highly correlated and hence provide redundant information to the model. To address these issues, we propose a novel approach that introduces a Soft Orthogonality (SO) constraint on proxies. The constraint encourages the proxies to be as orthogonal as possible and hence controls their positions in the embedding space. Our approach leverages the Data-Efficient Image Transformer (DeiT) as an encoder to extract contextual features from images along with a DML objective. The objective combines the Proxy Anchor loss with the SO regularization. We evaluate our method on four public benchmarks for category-level image retrieval and demonstrate its effectiveness with comprehensive experimental results and ablation studies. Our evaluations demonstrate the superiority of our proposed approach over state-of-the-art methods by a significant margin.
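The soft orthogonality idea described in the abstract can be sketched as a regularizer on the matrix of class proxies. The sketch below is an illustration, not the paper's exact formulation: it assumes the common soft-orthogonality penalty ||WW^T - I||_F^2 on L2-normalized proxy vectors, which is zero exactly when the proxies are mutually orthogonal and grows as they become correlated.

```python
import numpy as np

def soft_orthogonality_penalty(proxies: np.ndarray) -> float:
    """Illustrative Soft Orthogonality (SO) penalty on class proxies.

    proxies: array of shape (C, D), one row per class proxy.
    Each proxy is L2-normalized, then the penalty is the squared
    Frobenius norm of (W @ W.T - I). It is zero iff all proxies are
    mutually orthogonal, and positive when any pair is correlated.
    """
    # Normalize rows so the Gram matrix holds cosine similarities.
    W = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    gram = W @ W.T
    residual = gram - np.eye(W.shape[0])
    return float(np.sum(residual ** 2))

# Orthogonal proxies incur no penalty; duplicated (fully correlated)
# proxies incur a positive one. In training, this term would be added
# to the Proxy Anchor loss with a weighting hyperparameter.
```

In a full training loop this penalty would be scaled by a regularization weight and added to the Proxy Anchor loss over the same proxy matrix; the specific weight and normalization choices here are assumptions for the sketch.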

Authors (8)
Citations (1)
