
Learn and Search: An Elegant Technique for Object Lookup using Contrastive Learning (2403.07231v1)

Published 12 Mar 2024 in cs.CV

Abstract: The rapid proliferation of digital content and the ever-growing need for precise object recognition and segmentation have driven the advancement of cutting-edge techniques in the field of object classification and segmentation. This paper introduces "Learn and Search", a novel approach for object lookup that leverages the power of contrastive learning to enhance the efficiency and effectiveness of retrieval systems. In this study, we present an elegant and innovative methodology that integrates deep learning principles and contrastive learning to tackle the challenges of object search. Our extensive experimentation reveals compelling results, with "Learn and Search" achieving superior Similarity Grid Accuracy, showcasing its efficacy in discerning regions of utmost similarity within an image relative to a cropped image. The seamless fusion of deep learning and contrastive learning to address the intricacies of object identification not only promises transformative applications in image recognition, recommendation systems, and content tagging but also revolutionizes content-based search and retrieval. The amalgamation of these techniques, as exemplified by "Learn and Search," represents a significant stride in the ongoing evolution of methodologies in the dynamic realm of object classification and segmentation.
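The abstract describes locating the region of an image most similar to a cropped query. As a rough illustration only, the sketch below scores each cell of a grid against the crop and picks the best match. It is not the paper's method: the abstract does not specify the encoder, the grid layout, or the similarity measure, so the non-overlapping grid, the cosine similarity, and the names `embed`, `similarity_grid`, and `best_cell` are all assumptions made for this example. In particular, `embed` is a trivial stand-in (flattened, normalised pixels) where a contrastively trained encoder would normally go.

```python
import math

def embed(patch):
    """Hypothetical stand-in encoder: flatten the patch and L2-normalise it.
    A contrastively trained network would replace this function."""
    flat = [float(v) for row in patch for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0  # avoid division by zero
    return [v / norm for v in flat]

def cosine(a, b):
    # Both inputs are unit-length, so the dot product equals the cosine.
    return sum(x * y for x, y in zip(a, b))

def similarity_grid(image, crop, cell):
    """Score every non-overlapping cell x cell region of `image` against `crop`."""
    query = embed(crop)
    rows, cols = len(image) // cell, len(image[0]) // cell
    grid = []
    for r in range(rows):
        scores = []
        for c in range(cols):
            patch = [row[c * cell:(c + 1) * cell]
                     for row in image[r * cell:(r + 1) * cell]]
            scores.append(cosine(query, embed(patch)))
        grid.append(scores)
    return grid

def best_cell(grid):
    """Return the (row, col) index of the highest-scoring grid cell."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])

# Toy 4x4 grayscale image; the crop is copied from the bottom-right 2x2 cell.
image = [
    [1, 2, 9, 1],
    [3, 1, 1, 8],
    [5, 5, 2, 7],
    [5, 5, 6, 1],
]
crop = [row[2:4] for row in image[2:4]]
grid = similarity_grid(image, crop, cell=2)
print(best_cell(grid))
```

With a learned encoder in place of `embed`, the same loop would compare feature vectors rather than raw pixels, which is where a contrastive objective would be expected to help: crops of the same object should land close together even when their pixels differ.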

