
Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks (1507.00101v2)

Published 1 Jul 2015 in cs.CV

Abstract: This paper presents a simple yet effective supervised deep hash approach that constructs binary hash codes from labeled data for large-scale image search. We assume that the semantic labels are governed by several latent attributes with each attribute on or off, and classification relies on these attributes. Based on this assumption, our approach, dubbed supervised semantics-preserving deep hashing (SSDH), constructs hash functions as a latent layer in a deep network and the binary codes are learned by minimizing an objective function defined over classification error and other desirable hash codes properties. With this design, SSDH has a nice characteristic that classification and retrieval are unified in a single learning model. Moreover, SSDH performs joint learning of image representations, hash codes, and classification in a point-wised manner, and thus is scalable to large-scale datasets. SSDH is simple and can be realized by a slight enhancement of an existing deep architecture for classification; yet it is effective and outperforms other hashing approaches on several benchmarks and large datasets. Compared with state-of-the-art approaches, SSDH achieves higher retrieval accuracy, while the classification performance is not sacrificed.

Authors (3)
  1. Huei-Fang Yang (3 papers)
  2. Kevin Lin (98 papers)
  3. Chu-Song Chen (28 papers)
Citations (241)

Summary

Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks

The paper presents a robust methodology for generating efficient binary hash codes through a supervised deep learning approach, termed supervised semantics-preserving deep hashing (SSDH). This approach leverages deep convolutional neural networks (CNNs) to construct binary hashes from labeled images, optimizing a joint objective function that considers both classification accuracy and the quality of binary codes for retrieval.

Key Contributions and Methodology

The primary innovation of SSDH lies in its integration of hash function learning with image classification within a single deep learning model. By assuming that semantic labels are dictated by several latent binary attributes, the authors effectively embed these hash functions as a latent layer in the CNN architecture. This design facilitates the simultaneous learning of image representations, binary codes, and classification tasks, thereby unifying image retrieval and classification in a cohesive model.
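The latent-layer idea above can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: CNN backbone features are stood in for by random vectors, a sigmoid-activated layer of K units plays the role of the hash layer, and the classifier reads directly from those activations. All weights and names (`W_hash`, `W_cls`) are illustrative stand-ins for what the network would learn end to end.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

K = 48   # number of hash bits (the latent layer width)
D = 128  # stand-in dimension of CNN backbone features
C = 10   # number of semantic classes

W_hash = rng.normal(scale=0.1, size=(D, K))  # features -> latent hash layer
W_cls = rng.normal(scale=0.1, size=(K, C))   # hash layer -> class scores

def forward(features):
    # Sigmoid keeps activations in (0, 1); training pushes them toward {0, 1}.
    h = sigmoid(features @ W_hash)
    # Classification is computed from the hash layer, unifying the two tasks.
    logits = h @ W_cls
    # At retrieval time, codes are obtained by thresholding at 0.5.
    codes = (h >= 0.5).astype(np.uint8)
    return h, logits, codes

features = rng.normal(size=(4, D))  # a small batch of stand-in features
h, logits, codes = forward(features)
```

Because the hash layer sits between the features and the classifier, gradients from the classification loss shape both the image representation and the codes in one pass, which is what lets SSDH train point-wise rather than on pairs or triplets.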

The method optimizes an objective function with three terms: the classification error, a penalty that pushes hash-layer activations toward binary values, and a balance term encouraging each bit to be active with roughly 50% probability so the code space is used efficiently. As a result, SSDH balances semantic similarity against code efficiency, achieving higher retrieval accuracy without sacrificing classification performance.
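The three terms can be sketched as follows. This is a hedged illustration of the objective's structure, not the paper's exact formulation: the penalty forms and the weights `alpha` and `beta` are illustrative choices consistent with the description above.

```python
import numpy as np

def classification_loss(logits, labels):
    # Standard softmax cross-entropy over class scores.
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.mean(logp[np.arange(len(labels)), labels])

def binarization_penalty(h):
    # Reward activations far from 0.5: minimizing -mean((h - 0.5)^2)
    # pushes each activation toward 0 or 1.
    return -np.mean((h - 0.5) ** 2)

def balance_penalty(h):
    # Penalize each image's mean bit activation deviating from 0.5,
    # so roughly half the bits fire per code.
    return np.mean((h.mean(axis=1) - 0.5) ** 2)

def ssdh_objective(logits, labels, h, alpha=1.0, beta=1.0):
    # alpha and beta are illustrative trade-off weights.
    return (classification_loss(logits, labels)
            + alpha * binarization_penalty(h)
            + beta * balance_penalty(h))
```

Note that all three terms are computed per image, with no pairwise comparisons, which is why the objective scales linearly with dataset size.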

Numerical Results and Evaluations

Through extensive empirical evaluations on datasets including CIFAR-10, NUS-WIDE, and Yahoo-1M, SSDH consistently outperforms existing state-of-the-art hashing methods in mean average precision (mAP). On CIFAR-10, for example, SSDH improves mAP by approximately 34% over competing methods, demonstrating effective semantic preservation in the learned hash functions.

On large-scale datasets such as Yahoo-1M, SSDH demonstrates its scalability by training on over a million images. This is far less practical for pair- or triplet-based learning approaches, since the number of training pairs or triplets grows combinatorially with dataset size, with prohibitive costs in time and memory.

Practical Implications and Future Directions

The integration of SSDH offers substantial implications for both image retrieval efficiency and semantic consistency in classification tasks. Its ability to produce compact, semantics-preserving binary codes with high retrieval performance provides tangible benefits for real-world applications, such as image-based product searches in e-commerce and efficient handling of large-scale image datasets.
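At search time, such compact binary codes are typically compared by Hamming distance, which is what makes retrieval over large collections fast. The sketch below is a generic illustration of this retrieval step with unpacked {0, 1} arrays and a tiny stand-in database; it is not specific to SSDH.

```python
import numpy as np

def hamming_distance(query_code, db_codes):
    # Count differing bits between the query and every database code.
    return np.count_nonzero(db_codes != query_code, axis=1)

# Tiny stand-in database of 8-bit codes.
db = np.array([[0, 1, 1, 0, 1, 0, 0, 1],
               [1, 1, 1, 0, 1, 0, 0, 1],
               [0, 0, 0, 0, 0, 0, 0, 0]], dtype=np.uint8)
query = np.array([0, 1, 1, 0, 1, 0, 0, 1], dtype=np.uint8)

distances = hamming_distance(query, db)   # [0, 1, 4]
ranking = np.argsort(distances)           # nearest database items first
```

In production systems the codes would be bit-packed and compared with XOR plus popcount, but the ranking produced is the same.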

The paper also hints at potential future research avenues, including the possibility of extending SSDH's capabilities by augmenting semantic hashing with additional ranking or unsupervised criteria. This could further enhance its retrieval effectiveness by embedding notions of visual or feature similarity alongside semantic consistency, potentially leading to even more nuanced retrieval systems. Additionally, exploring semi-supervised variants of SSDH could offer a path to incorporate unlabeled data, increasing its applicability across diverse datasets.

In summary, the development of SSDH marks a significant step towards refining semantic hashing techniques, aligning image retrieval performance closely with deep learning advancements in classification, and setting a foundation for further exploration in the dynamically evolving domain of artificial intelligence.