Transformer-based Clipped Contrastive Quantization Learning for Unsupervised Image Retrieval (2401.15362v1)

Published 27 Jan 2024 in cs.CV

Abstract: Unsupervised image retrieval aims to learn the important visual characteristics of images without any given labels, in order to retrieve similar images for a given query image. Convolutional Neural Network (CNN)-based approaches have been extensively exploited with self-supervised contrastive learning for image hashing. However, existing approaches suffer from the ineffective utilization of global features by CNNs and from the bias introduced by false negative pairs in contrastive learning. In this paper, we propose a TransClippedCLR model that encodes the global context of an image using a Transformer, capturing local context through patch-based processing, generates hash codes through product quantization, and avoids potential false negative pairs through clipped contrastive learning. The proposed model achieves superior performance for unsupervised image retrieval on benchmark datasets, including CIFAR10, NUS-Wide and Flickr25K, compared with recent state-of-the-art deep models. Results with the proposed clipped contrastive learning are greatly improved on all datasets compared with the same backbone network trained with vanilla contrastive learning.
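The abstract does not spell out the loss, but the clipped-contrastive idea it describes can be sketched as an NT-Xent-style loss in which, for each anchor, the few most similar negatives are excluded as likely false negatives. Below is a minimal PyTorch sketch; the function name `clipped_contrastive_loss` and the top-k clipping rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def clipped_contrastive_loss(z1, z2, temperature=0.5, clip_k=2):
    """NT-Xent-style contrastive loss that drops the clip_k most similar
    negatives per anchor, treating them as potential false negatives.
    Hypothetical sketch -- not the paper's exact formulation."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2N, d) two views stacked
    n = z1.size(0)
    sim = z @ z.t() / temperature                  # (2N, 2N) similarity matrix
    sim.fill_diagonal_(float('-inf'))              # mask self-similarity
    # each anchor i pairs with i+n (and vice versa)
    pos_idx = torch.arange(2 * n).roll(n)
    pos = sim[torch.arange(2 * n), pos_idx]
    # remove positives from the negative pool, then clip top-k negatives
    neg = sim.clone()
    neg[torch.arange(2 * n), pos_idx] = float('-inf')
    if clip_k > 0:
        topk = neg.topk(clip_k, dim=1).indices
        neg.scatter_(1, topk, float('-inf'))       # exclude likely false negatives
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)
    return F.cross_entropy(logits, torch.zeros(2 * n, dtype=torch.long))
```

Setting `clip_k=0` recovers vanilla contrastive learning, which matches the comparison the abstract draws between the two variants.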
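Product quantization, through which the model generates its hash codes, splits each descriptor into sub-vectors and quantizes each sub-vector against its own small codebook. A toy NumPy sketch of the encode/decode step follows; it uses fixed codebooks and nearest-codeword assignment for illustration, whereas the proposed model learns the codebooks end-to-end.

```python
import numpy as np

def pq_encode(x, codebooks):
    """Split x into len(codebooks) sub-vectors and replace each with the
    index of its nearest codeword. Illustrative sketch only."""
    sub = np.split(x, len(codebooks))
    return [int(np.argmin(((cb - s) ** 2).sum(axis=1)))
            for cb, s in zip(codebooks, sub)]

def pq_decode(codes, codebooks):
    """Reconstruct the vector by concatenating the selected codewords."""
    return np.concatenate([cb[c] for cb, c in zip(codebooks, codes)])
```

The list of codeword indices is the compact code stored for retrieval; with M codebooks of K codewords each, a descriptor costs only M * log2(K) bits.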

