Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval (1703.05605v1)

Published 16 Mar 2017 in cs.CV

Abstract: Free-hand sketch-based image retrieval (SBIR) is a specific cross-view retrieval task, in which queries are abstract and ambiguous sketches while the retrieval database is formed with natural images. Work in this area mainly focuses on extracting representative and shared features for sketches and natural images. However, these can neither cope well with the geometric distortion between sketches and images nor be feasible for large-scale SBIR due to the heavy continuous-valued distance computation. In this paper, we speed up SBIR by introducing a novel binary coding method, named Deep Sketch Hashing (DSH), where a semi-heterogeneous deep architecture is proposed and incorporated into an end-to-end binary coding framework. Specifically, three convolutional neural networks are utilized to encode free-hand sketches, natural images and, especially, the auxiliary sketch-tokens which are adopted as bridges to mitigate the sketch-image geometric distortion. The learned DSH codes can effectively capture the cross-view similarities as well as the intrinsic semantic correlations between different categories. To the best of our knowledge, DSH is the first hashing work specifically designed for category-level SBIR with an end-to-end deep architecture. The proposed DSH is comprehensively evaluated on two large-scale datasets of TU-Berlin Extension and Sketchy, and the experiments consistently show DSH's superior SBIR accuracies over several state-of-the-art methods, while achieving significantly reduced retrieval time and memory footprint.


Deep Sketch Hashing: Efficient Free-hand Sketch-Based Image Retrieval

The paper presents Deep Sketch Hashing (DSH), a method for efficient retrieval of natural images from free-hand sketch queries. It targets large-scale sketch-based image retrieval (SBIR) with a binary coding scheme that addresses two obstacles: the geometric distortion between sketches and images, and the cost of continuous-valued distance computation.

Unlike traditional content-based image retrieval (CBIR) and text-based approaches, SBIR demands specialized methods to interpret abstract query sketches and match them effectively with natural images. Previous methodologies in SBIR have grappled with discrepancies between sketches' abstract nature and the details in natural images. Moreover, such methods often involve computationally intensive processes, reducing their feasibility in large-scale scenarios.

The paper introduces DSH, which incorporates a semi-heterogeneous deep learning architecture specifically designed to enhance SBIR. The proposed model includes three convolutional neural networks (CNNs) that process sketches, images, and 'sketch-tokens'—intermediate representations that bridge the gap between sketches and images. This bridging is crucial in compensating for geometric distortions typically observed between free-hand sketches and images. By using these sketch-tokens, the model effectively recognizes cross-view similarities and intrinsic semantic correlations between different categories.

The authors state that, to the best of their knowledge, DSH is the first hashing framework designed specifically for category-level SBIR with an end-to-end deep architecture. It was evaluated on two large-scale datasets, TU-Berlin Extension and Sketchy, where it outperformed several state-of-the-art methods in retrieval accuracy while significantly reducing retrieval time and memory footprint. These properties make it a candidate for practical uses such as real-time image retrieval on resource-constrained devices.
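To give a rough sense of why binary codes shrink the memory footprint, the sketch below compares the storage needed for compact binary codes with that of dense floating-point features. The database size, the 64-bit code length, and the 4096-dimensional float32 baseline are illustrative assumptions, not figures taken from the paper.

```python
def storage_bytes_binary(num_items: int, code_bits: int) -> int:
    """Bytes needed to store num_items binary codes, each padded to whole bytes."""
    return num_items * ((code_bits + 7) // 8)


def storage_bytes_float(num_items: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to store num_items dense feature vectors (float32 by default)."""
    return num_items * dim * bytes_per_value


# All three values below are hypothetical and chosen only for illustration.
num_images = 1_000_000
code_bits = 64
feature_dim = 4096

binary_total = storage_bytes_binary(num_images, code_bits)
dense_total = storage_bytes_float(num_images, feature_dim)

print(binary_total)                 # 8000000 bytes (8 MB)
print(dense_total)                  # 16384000000 bytes (about 16.4 GB)
print(dense_total // binary_total)  # 2048
```

Under these assumed sizes the binary index is about three orders of magnitude smaller; the actual saving depends on the code length and on the baseline feature size.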

A highlight of this work is its use of auxiliary sketch-tokens as a bridge between the two views, which mitigates the geometric mismatch between free-hand sketches and natural images. Because the sketch-token branch is part of the deep architecture, DSH can keep computation cost low while achieving high retrieval performance. The model outputs binary codes, so comparisons at retrieval time avoid the heavy continuous-valued distance computation that limits earlier methods.
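The retrieval step with binary codes can be sketched as follows. This is a minimal illustration of Hamming-distance ranking in general, not the authors' implementation: the 8-bit codes, the item names, and the helper functions `hamming_distance` and `rank_by_hamming` are all hypothetical.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs sorted by distance, ties broken by name."""
    scored = [(name, hamming_distance(query_code, code)) for name, code in database.items()]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical 8-bit codes; real hash codes would be produced by trained networks.
query = 0b10110010  # code of a query sketch
database = {
    "cat_photo": 0b10110011,
    "dog_photo": 0b10010110,
    "car_photo": 0b01001101,
}

ranking = rank_by_hamming(query, database)
print(ranking)  # [('cat_photo', 1), ('dog_photo', 2), ('car_photo', 8)]
```

Each comparison is an XOR followed by a bit count, which is the reason binary codes are cheaper to compare than real-valued feature vectors.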

The implications of this work are multi-faceted. Practically, the proposed method can be incorporated into real-world systems where quick retrieval from massive image databases is required, such as in mobile and wearable technology applications. Theoretically, this work provides a blueprint for future research in cross-modal retrieval tasks as it emphasizes the importance of intermediate representations (such as sketch-tokens) to improve the alignment between disparate data forms.

Future work could extend DSH to other cross-domain retrieval tasks and investigate forms of auxiliary data other than sketch-tokens to further improve accuracy and efficiency. Making such models robust to even more abstract queries is another open direction.

In summary, the paper addresses SBIR by combining deep learning and binary hashing in a single end-to-end framework. Its reported gains in accuracy, retrieval time, and memory footprint make it a useful reference point for later work on efficient image retrieval.


Authors (5)

Li Liu (311 papers)
Fumin Shen (50 papers)
Yuming Shen (18 papers)
Xianglong Liu (128 papers)
Ling Shao (244 papers)

Citations (260)
