Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval (2401.04860v1)

Published 10 Jan 2024 in cs.CV

Abstract: Zero-shot learning offers an efficient way for a machine learning model to handle unseen categories, avoiding exhaustive data collection. Zero-shot Sketch-based Image Retrieval (ZS-SBIR) simulates real-world scenarios where it is hard and costly to collect paired sketch-photo samples. We propose a novel framework that indirectly aligns sketches and photos by contrasting them through texts, removing the need for access to sketch-photo pairs. With an explicit modality encoding learned from data, our approach disentangles modality-agnostic semantics from modality-specific information, bridging the modality gap and enabling effective cross-modal content retrieval within a joint latent space. Through comprehensive experiments, we verify the efficacy of the proposed model on ZS-SBIR and show that it can also be applied to generalized and fine-grained settings.
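
The idea of aligning sketches and photos indirectly through text can be illustrated with a small toy example. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an InfoNCE-style contrastive loss and assumes that the modality encoding is a per-modality vector subtracted from each embedding. The names `remove_modality` and `contrastive_loss`, the two-dimensional embeddings, and the temperature value are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def remove_modality(embedding, modality_encoding):
    # Assumed disentangling step: subtract a per-modality vector so that
    # only modality-agnostic content remains. The paper learns its
    # modality encoding from data; here it is fixed by hand.
    return [e - m for e, m in zip(embedding, modality_encoding)]

def contrastive_loss(visual, texts, temperature=0.1):
    """InfoNCE-style loss in which visual[i] should match texts[i]."""
    visual = [normalize(v) for v in visual]
    texts = [normalize(t) for t in texts]
    total = 0.0
    for i, v in enumerate(visual):
        logits = [sum(a * b for a, b in zip(v, t)) / temperature for t in texts]
        top = max(logits)  # subtract the max for numerical stability
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(visual)

# Toy data: two classes, one text embedding per class (all values made up).
text_embs = [[1.0, 0.0], [0.0, 1.0]]
sketch_modality = [0.5, 0.5]    # hypothetical sketch-specific offset
photo_modality = [-0.5, -0.5]   # hypothetical photo-specific offset
sketches = [[1.5, 0.5], [0.5, 1.5]]
photos = [[0.5, -0.5], [-0.5, 0.5]]

sketch_content = [remove_modality(s, sketch_modality) for s in sketches]
photo_content = [remove_modality(p, photo_modality) for p in photos]

# Sketches and photos are each contrasted with texts, never with each other.
sketch_loss = contrastive_loss(sketch_content, text_embs)
photo_loss = contrastive_loss(photo_content, text_embs)
mismatched_loss = contrastive_loss(sketch_content, list(reversed(text_embs)))
```

In this toy setup, removing the modality offsets leaves sketch and photo embeddings of the same class at the same point, even though no sketch-photo pair was ever compared directly. Both end up close to the shared text anchor, which is the intuition behind contrasting through text.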

Authors (4)
  1. Eunyi Lyou (3 papers)
  2. Doyeon Lee (1 paper)
  3. Jooeun Kim (3 papers)
  4. Joonseok Lee (39 papers)
