Deep Supervised Discrete Hashing: An In-Depth Review
The paper "Deep Supervised Discrete Hashing" presents a novel approach to image retrieval using a deep learning framework that optimizes the encoding of high-dimensional data into binary hash codes. This methodology addresses limitations in prior deep hashing models by incorporating both pairwise label information and classification information in a single-stream framework, ensuring the output directly corresponds to binary codes. Here's a detailed exploration of the paper's contributions, methodology, and results.
Background and Motivation
Hashing techniques have become indispensable for efficient image retrieval due to their low storage requirements and fast query processing. Traditional hashing approaches, while valuable, often fail to fully exploit the semantic information available in labeled datasets. This paper argues that more effective hash codes can be learned with a deep network that directly outputs binary representations optimized jointly for similarity preservation and classification.
Methodology
Problem Formulation
The objective is to learn hash functions that map images to compact binary codes while preserving semantic similarity: images that share labels should receive codes that are close in Hamming distance. The key challenge is doing this within a deep learning framework whose last layer is forced to output discrete binary codes, a constraint that standard gradient-based training cannot handle directly.
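In symbols, the setup can be sketched as follows; this is a paraphrase of the standard supervised-hashing formulation rather than the paper's exact notation:

```latex
% Sketch of the supervised-hashing setup (paraphrased notation, not the paper's exact symbols).
% F(.;\Theta) is the CNN, x_i an input image, K the code length in bits,
% s_{ij} = 1 if images i and j share a label, 0 otherwise, d_H the Hamming distance.
b_i = \operatorname{sgn}\bigl(F(x_i;\Theta)\bigr) \in \{-1,+1\}^{K},
\qquad
s_{ij}=1 \;\Rightarrow\; d_H(b_i,b_j)\ \text{small},
\quad
s_{ij}=0 \;\Rightarrow\; d_H(b_i,b_j)\ \text{large}.
```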
Learning Framework
This research introduces an innovative approach that uses:
- Pairwise Label Information: This ensures similar items are encoded with binary codes close in Hamming space, while dissimilar items have codes that are far apart.
- Classification Information: Incorporated directly into the hash learning process, unlike other multi-task frameworks where classification aids only in feature learning.
- Discrete Optimization: An alternating minimization scheme that respects the inherently discrete nature of hash codes, keeping the codes binary throughout training and minimizing quantization error (a sketch of the resulting combined objective follows this list).
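Putting these pieces together, the training objective is roughly of the following form. This is a hedged reconstruction from the description above: the trade-off weights mu, nu, and eta are placeholder symbols rather than the paper's exact notation.

```latex
% Hedged reconstruction of the combined objective described above (paraphrased notation).
% u_i: real-valued output of the last CNN layer for image x_i
% b_i: its K-bit binary code      y_i: its label vector
% W:   linear classifier applied to the codes
% s_{ij}: 1 if images i and j share a label, 0 otherwise
\min_{B,\,W,\,\Theta}\;
  \underbrace{\sum_{s_{ij}\in S}\Bigl(\log\bigl(1+e^{\psi_{ij}}\bigr)-s_{ij}\,\psi_{ij}\Bigr)}_{\text{pairwise similarity term}}
  \;+\;
  \underbrace{\mu \sum_{i}\lVert y_i - W^{\top} b_i \rVert_2^2 \;+\; \nu \lVert W \rVert_F^2}_{\text{classification term}}
  \;+\;
  \underbrace{\eta \sum_{i}\lVert b_i - u_i \rVert_2^2}_{\text{binary (auxiliary-variable) term}},
\qquad
\psi_{ij}=\tfrac{1}{2}\,u_i^{\top}u_j,\quad b_i\in\{-1,+1\}^{K}.
```

Training then alternates between the variable groups: the CNN parameters are updated by back-propagation with the codes fixed, the classifier W admits a closed-form regularized least-squares update, and the binary codes B are updated discretely.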
Algorithm Overview
The method uses a CNN to extract features, with the final layer intended to produce binary outputs through a sign function. Because the sign function is non-differentiable, the binary codes are treated as auxiliary variables that the network output is encouraged to match; this allows effective back-propagation while the codes themselves are kept discrete and updated in an alternating fashion, as sketched below.
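The following is a minimal, hypothetical PyTorch-style sketch of one such alternating step. It is not the authors' released code: the backbone, the hyperparameters MU and ETA, and the simple sign-based code snapshot are illustrative stand-ins for the procedure described in the paper (which uses the CNN-F architecture and a dedicated discrete update for the codes).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K, NUM_CLASSES = 48, 10        # code length in bits and number of classes (illustrative)
MU, ETA = 1.0, 0.1             # hypothetical trade-off weights, not the paper's values

# Stand-in backbone; the paper uses CNN-F, replaced here by a tiny MLP for brevity.
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU(), nn.Linear(512, K))
W = torch.zeros(K, NUM_CLASSES)                     # linear classifier acting on the codes
optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)


def pairwise_loss(u, s):
    """Negative log-likelihood of the pairwise labels s (1 = similar, 0 = dissimilar)."""
    psi = 0.5 * (u @ u.t())                         # inner products of the relaxed codes
    return (F.softplus(psi) - s * psi).mean()       # softplus(x) = log(1 + exp(x))


def train_step(x, y_onehot, s):
    """One simplified alternating-minimization step on a mini-batch."""
    global W
    u = net(x)                                      # real-valued outputs of the hash layer
    with torch.no_grad():
        b = torch.where(u >= 0, 1.0, -1.0)          # discrete codes held fixed for this step
        # Closed-form ridge-regression update of the classifier W with the codes fixed.
        W = torch.linalg.solve(b.t() @ b + 1e-3 * torch.eye(K), b.t() @ y_onehot)
    # Network update by back-propagation: pairwise term plus penalties tying the
    # relaxed outputs to the labels and to the discrete codes.
    loss = (pairwise_loss(u, s)
            + MU * ((y_onehot - u @ W) ** 2).mean()
            + ETA * ((b - u) ** 2).mean())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At test time a query image is encoded simply by taking the sign of the network output; the full algorithm iterates such steps over mini-batches of labeled image pairs.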
Experimental Results
The paper claims significant improvements over state-of-the-art hashing techniques on datasets like CIFAR-10 and NUS-WIDE, demonstrating effectiveness through several metrics:
- Mean Average Precision (MAP) values notably higher than those of existing techniques, indicating superior retrieval accuracy (a sketch of how MAP is computed in this setting follows the list).
- Gains attributed to the combined loss function, which integrates pairwise semantic similarity with classification information rather than relying on either alone.
- Robust performance across experimental settings of different sizes and difficulty, suggesting that the model scales and adapts well.
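For reference, MAP in this retrieval setting is typically computed by ranking the database by Hamming distance to each query and averaging precision at the positions of the true neighbors. The NumPy sketch below is illustrative, not the paper's evaluation code; it assumes single-label ground truth (for multi-label data such as NUS-WIDE, two images usually count as neighbors if they share at least one label, and MAP is often restricted to the top-ranked results).

```python
import numpy as np

def mean_average_precision(query_codes, db_codes, query_labels, db_labels):
    """MAP for +/-1 binary codes; an item is relevant if it shares the query's label."""
    aps = []
    for q, ql in zip(query_codes, query_labels):
        # Hamming distance from the inner product of +/-1 codes: d = (K - <q, b>) / 2.
        dist = (db_codes.shape[1] - db_codes @ q) / 2
        order = np.argsort(dist)                      # closest database items first
        relevant = (db_labels[order] == ql)           # boolean relevance in ranked order
        if not relevant.any():
            continue                                  # no true neighbors for this query
        ranks = np.arange(1, len(relevant) + 1)
        precision_at_k = np.cumsum(relevant) / ranks  # precision at each rank
        aps.append((precision_at_k * relevant).sum() / relevant.sum())
    return float(np.mean(aps))
```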
Implications and Future Directions
The implications of this work are substantial for large-scale image retrieval systems. By learning binary codes that serve classification directly while preserving semantic similarity, the method has the potential to improve retrieval quality in real-world applications such as content-based image retrieval.
Possible future directions include exploring more advanced architectures than the CNN-F backbone used in the paper, applying the approach to additional multi-label datasets, and examining its scalability on even larger collections. A deeper investigation into reducing the computational cost of training could further broaden the practical reach of these hashing methods.
In conclusion, the "Deep Supervised Discrete Hashing" paper sets a new standard for retrieval systems by integrating classification information directly into the hash learning process while keeping the codes discrete, paving the way for further work on efficient image and video retrieval.