
Learning to Hash with Binary Deep Neural Network (1607.05140v1)

Published 18 Jul 2016 in cs.CV

Abstract: This work proposes deep network models and learning algorithms for unsupervised and supervised binary hashing. Our novel network design constrains one hidden layer to directly output the binary codes. This addresses a challenging issue in some previous works: optimizing non-smooth objective functions due to binarization. Moreover, we incorporate independence and balance properties in the direct and strict forms in the learning. Furthermore, we include similarity preserving property in our objective function. Our resulting optimization with these binary, independence, and balance constraints is difficult to solve. We propose to attack it with alternating optimization and careful relaxation. Experimental results on three benchmark datasets show that our proposed methods compare favorably with the state of the art.

Authors (3)
  1. Thanh-Toan Do (92 papers)
  2. Anh-Dzung Doan (18 papers)
  3. Ngai-Man Cheung (80 papers)
Citations (175)

Summary

Learning to Hash with Binary Deep Neural Network

The paper "Learning to Hash with Binary Deep Neural Network" presents innovative approaches for binary hashing in both unsupervised and supervised contexts. The authors focus on addressing practical challenges associated with large-scale visual search tasks, specifically concerning efficient storage and fast querying mechanisms. Hashing techniques aim to transform high-dimensional data vectors into compact binary representations, facilitating rapid similarity-based retrieval.

Key Contributions

  1. Network Architecture:
    • The paper introduces a deep network architecture in which one hidden layer is constrained to directly output the binary codes. By avoiding a separate binarization step, such as applying the sign function (sgn) to real-valued outputs, the design sidesteps the non-smooth objective functions that binarization introduces in prior approaches.
  2. Objective Function Optimization:
    • The authors design objective functions that impose the independence and balance properties of binary codes as constraints in their direct, strict forms, alongside a term that preserves similarity throughout learning (see the sketch after this list).
    • The unsupervised method, UH-BDNN, attacks the resulting binary-constrained optimization, which is difficult to solve directly, with alternating optimization and careful relaxation.
  3. Supervised Hashing Extension:
    • The supervised method, SH-BDNN, extends the architecture to preserve semantic similarity, guided by a pairwise label matrix: Hamming distances between same-class samples are minimized, while distances between samples from different classes are maximized.
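The constraint terms can be written down compactly. The sketch below shows the shape of each term for a batch of relaxed codes; the function name, the exact normalizations, and any trade-off weighting are assumptions made for illustration, not the paper's precise formulation.

```python
import numpy as np

def hashing_penalties(H, S, L):
    """Penalty terms of the kind the objective is built from (illustrative).

    H : (m, L) real-valued outputs of the binary layer (relaxed codes
        near {-1, +1}); m samples, L bits.
    S : (m, m) pairwise label matrix, S[i, j] = 1 for same-class pairs
        and -1 otherwise (supervised setting).
    """
    m = H.shape[0]
    # Similarity preservation: scaled code inner products should match S.
    sim = np.sum((H @ H.T / L - S) ** 2) / (2 * m)
    # Independence: bits should be pairwise decorrelated, (1/m) H^T H ~ I.
    indep = np.sum((H.T @ H / m - np.eye(L)) ** 2)
    # Balance: each bit should split the samples evenly, column means ~ 0.
    balance = np.sum(H.mean(axis=0) ** 2)
    return sim, indep, balance
```

In the supervised setting the pairwise matrix S drives the similarity term; broadly, the unsupervised variant drops S and instead ties the binary layer to reconstructing the input.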

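The alternating scheme referenced above can likewise be outlined in a few lines. The loop below is schematic: a single least-squares solve stands in for the paper's network updates, and the code update keeps the binary constraint by projecting onto {-1, +1}.

```python
import numpy as np

def alternating_binary_hashing(X, L, n_iters=10, seed=0):
    """Schematic alternating optimization for binary codes.

    X : (m, d) data matrix; L : number of bits. The linear map W is a
    stand-in for the paper's deep network, and these update rules are
    illustrative, not the paper's derived updates.
    """
    rng = np.random.default_rng(seed)
    B = np.sign(rng.standard_normal((X.shape[0], L)))  # codes in {-1, +1}
    for _ in range(n_iters):
        # Step 1: fix the codes B, fit the "network" to reproduce them
        # (one least-squares solve replaces gradient training).
        W = np.linalg.lstsq(X, B, rcond=None)[0]
        # Step 2: fix the network, update B as the binary matrix closest
        # to the real-valued outputs (projection onto {-1, +1}).
        B = np.sign(X @ W)
        B[B == 0] = 1.0
    return B, W
```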
Experimental Results

The researchers conduct experiments on three benchmark datasets: CIFAR10, MNIST, and SIFT1M. The proposed models perform competitively against leading hashing methods such as ITQ, BA, and DH, especially at longer code lengths (24 and 32 bits). Metrics such as mean Average Precision (mAP) and precision at Hamming radius 2 (precision@2) substantiate the improvement.
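For reference, precision at Hamming radius 2 can be computed as below. The function name and the convention of scoring queries that retrieve nothing within the radius as zero are assumptions for this sketch; evaluation protocols vary across papers.

```python
import numpy as np

def precision_at_hamming_radius(query_codes, db_codes, relevant, r=2):
    """precision@r: among database items within Hamming radius r of a
    query, the fraction that are true neighbors, averaged over queries.

    query_codes : (q, L) codes in {-1, +1} (or {0, 1})
    db_codes    : (n, L)
    relevant    : (q, n) boolean ground-truth relevance
    """
    precisions = []
    for i, code in enumerate(query_codes):
        dists = np.sum(code != db_codes, axis=1)  # Hamming distances
        retrieved = dists <= r
        if retrieved.any():
            precisions.append(relevant[i, retrieved].mean())
        else:
            precisions.append(0.0)  # convention: empty retrieval scores 0
    return float(np.mean(precisions))
```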

Implications and Future Directions

The implications of this research are multifaceted, impacting large-scale data retrieval systems where speed and storage efficiency are crucial. By enforcing independence and balanced bit distribution, the proposed methods make fuller use of the code space, reducing unwanted code collisions and potentially improving retrieval precision. Moreover, the integration of semantic preservation in supervised hashing could benefit fields requiring nuanced categorization and classification, such as image recognition and object detection.

For future work, extending these principles to more complex data structures, such as non-linear embeddings, could further refine hashing techniques. Leveraging more recent advances in neural network architectures may also enable more robust and flexible binary hashing frameworks. These methodologies provide a foundation for further innovation, particularly in the deterministic generation of compact, reliable binary codes.