Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
110 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval (2007.15103v2)

Published 29 Jul 2020 in cs.CV and cs.IR

Abstract: Sketch as an image search query is an ideal alternative to text in capturing the fine-grained visual details. Prior successes on fine-grained sketch-based image retrieval (FG-SBIR) have demonstrated the importance of tackling the unique traits of sketches as opposed to photos, e.g., temporal vs. static, strokes vs. pixels, and abstract vs. pixel-perfect. In this paper, we study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail -- a person typically sketches up to various extents of detail to depict an object. This hierarchical structure is often visually distinct. In this paper, we design a novel network that is capable of cultivating sketch-specific hierarchies and exploiting them to match sketch with photo at corresponding hierarchical levels. In particular, features from a sketch and a photo are enriched using cross-modal co-attention, coupled with hierarchical node fusion at every level to form a better embedding space to conduct retrieval. Experiments on common benchmarks show our method to outperform state-of-the-arts by a significant margin.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Aneeshan Sain (40 papers)
  2. Ayan Kumar Bhunia (63 papers)
  3. Yongxin Yang (73 papers)
  4. Tao Xiang (324 papers)
  5. Yi-Zhe Song (120 papers)
Citations (47)

Summary

We haven't generated a summary for this paper yet.