
Potential Field Based Deep Metric Learning (2405.18560v2)

Published 28 May 2024 in cs.CV, cs.AI, cs.IR, cs.LG, and eess.IV

Abstract: Deep metric learning (DML) involves training a network to learn a semantically meaningful representation space. Many current approaches mine n-tuples of examples and model interactions within each tuple. We present a novel, compositional DML model, inspired by electrostatic fields in physics, that represents the influence of each example (embedding) by a continuous potential field rather than through tuples, and superposes these fields to obtain their combined global potential field. We use attractive and repulsive potential fields to represent interactions among embeddings from images of the same and different classes, respectively. Contrary to typical learning methods, where the mutual influence of samples is proportional to their distance, we enforce a reduction in such influence with distance, leading to a decaying field. We show that this decay helps improve performance on real-world datasets with large intra-class variations and label noise. Like other proxy-based methods, we also use proxies to succinctly represent sub-populations of examples. We evaluate our method on three standard DML benchmarks (Cars-196, CUB-200-2011, and SOP), where it outperforms state-of-the-art baselines.
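
The idea of superposing attractive and repulsive fields that weaken with distance can be illustrated with a small NumPy sketch. This is not the authors' formulation: the kernel `1 / (dist + delta) ** alpha`, the parameters `alpha` and `delta`, and the function names are all assumptions chosen only to show a decaying, superposed potential, and proxies are omitted.

```python
import numpy as np


def decaying_potential(dist, alpha=1.0, delta=0.1):
    """Hypothetical decaying kernel: its magnitude shrinks as distance grows.

    `alpha` controls the decay rate and `delta` avoids division by zero;
    both are illustrative assumptions, not values from the paper.
    """
    return 1.0 / (dist + delta) ** alpha


def total_potential(embeddings, labels, alpha=1.0, delta=0.1):
    """Potential at each embedding from the superposed fields of all others.

    Same-class sources contribute a negative (attractive) term and
    different-class sources a positive (repulsive) term.
    """
    # Pairwise Euclidean distances, shape (n, n).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # -1 for same-class pairs (attraction), +1 for different-class pairs.
    same_class = labels[:, None] == labels[None, :]
    signs = np.where(same_class, -1.0, 1.0)

    field = signs * decaying_potential(dists, alpha, delta)
    np.fill_diagonal(field, 0.0)  # an embedding exerts no field on itself

    # Superposition: sum the individual fields acting on each embedding.
    return field.sum(axis=1)


if __name__ == "__main__":
    embeddings = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
    labels = np.array([0, 0, 1])
    print(total_potential(embeddings, labels))
```

In this toy setup, minimizing the summed potential would pull same-class embeddings together and push different-class embeddings apart, while distant points (for example outliers or mislabeled samples) exert only a weak influence because the kernel decays.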

Authors: Shubhang Bhatnagar and Narendra Ahuja
