
Asymmetric Deep Supervised Hashing (1707.08325v1)

Published 26 Jul 2017 in cs.LG and stat.ML

Abstract: Hashing has been widely used for large-scale approximate nearest neighbor search because of its storage and search efficiency. Recent work has found that deep supervised hashing can significantly outperform non-deep supervised hashing in many applications. However, most existing deep supervised hashing methods adopt a symmetric strategy to learn one deep hash function for both query points and database (retrieval) points. The training of these symmetric deep supervised hashing methods is typically time-consuming, which makes them hard to effectively utilize the supervised information for cases with large-scale database. In this paper, we propose a novel deep supervised hashing method, called asymmetric deep supervised hashing (ADSH), for large-scale nearest neighbor search. ADSH treats the query points and database points in an asymmetric way. More specifically, ADSH learns a deep hash function only for query points, while the hash codes for database points are directly learned. The training of ADSH is much more efficient than that of traditional symmetric deep supervised hashing methods. Experiments show that ADSH can achieve state-of-the-art performance in real applications.

Authors (2)
  1. Qing-Yuan Jiang (12 papers)
  2. Wu-Jun Li (57 papers)
Citations (227)

Summary

Asymmetric Deep Supervised Hashing for Efficient Large-Scale Nearest Neighbor Search

The paper "Asymmetric Deep Supervised Hashing" by Qing-Yuan Jiang and Wu-Jun Li introduces a novel method for deep supervised hashing aimed at improving large-scale approximate nearest neighbor (ANN) searches. The presented approach, termed Asymmetric Deep Supervised Hashing (ADSH), departs from conventional symmetric hashing paradigms by treating query points and database points differentially in the process of generating hash codes. This methodological divergence underlies several key claims made in the paper regarding computational efficiency and search performance.

Methodology Overview

ADSH learns a deep hash function only for query points and does not train a hash function for database points; instead, the binary codes for database points are learned directly as optimization variables. This departs sharply from symmetric strategies, which apply a single learned hash function to both queries and database points, and it yields a much more efficient training procedure for large-scale datasets. A convolutional neural network (CNN) performs feature learning for queries and is integrated with the hash-function learning in an end-to-end manner.
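To make the asymmetry concrete, here is a minimal PyTorch sketch of the setup: a learned network that hashes query points, alongside a free binary matrix for database codes. The network architecture, dimensions, and variable names are illustrative assumptions, not the paper's exact CNN.

```python
import torch
import torch.nn as nn

code_length = 12            # number of hash bits (illustrative)
num_database_points = 10000

class QueryHashNet(nn.Module):
    """Deep hash function applied to query points only."""
    def __init__(self, feature_dim=512, code_length=12):
        super().__init__()
        self.backbone = nn.Sequential(   # stand-in for the paper's CNN
            nn.Linear(feature_dim, 256), nn.ReLU(),
            nn.Linear(256, code_length),
        )

    def forward(self, x):
        # tanh relaxes the discrete sign(.) so gradients can flow
        return torch.tanh(self.backbone(x))

net = QueryHashNet()

# Database codes are NOT produced by the network: they are learned
# directly as a binary matrix B in {-1, +1}^(n x c).
B = torch.sign(torch.randn(num_database_points, code_length))

def query_codes(x):
    """Binarize the network output at query time."""
    return torch.sign(net(x))
```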

The core optimization problem minimizes the discrepancy between the given pairwise supervisory information and the inner products of the binary codes. To support gradient-based learning, the discrete query codes are relaxed to a continuous representation. An alternating optimization then updates the neural network parameters via backpropagation with the database codes held fixed, and updates the database binary codes directly with the network held fixed, iterating until the hashing objective converges.
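Continuing the sketch above, the following illustrates one round of the alternating optimization under the summary's description: a squared-error loss between relaxed query codes and pairwise supervision, a gradient step on the network, then an update of B. The paper derives a closed-form, bit-by-bit update for B; the sign-based update below is a loose simplification of ours, not the paper's exact rule.

```python
import torch

def adsh_loss(U, B, S, c):
    # U: m x c relaxed query outputs tanh(F(x_i)); B: n x c database codes
    # S: m x n pairwise similarity matrix with entries in {-1, +1}
    return ((U @ B.t() - c * S) ** 2).sum()

optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)

def alternating_round(x_query, S, B, c, net_steps=3):
    # Step 1: fix B, update the network parameters by backpropagation.
    for _ in range(net_steps):
        U = net(x_query)                 # relaxed codes, entries in (-1, 1)
        loss = adsh_loss(U, B, S, c)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Step 2: fix the network, update B directly. Simplified here to the
    # sign of the objective's linear term; NOT the paper's closed-form,
    # bit-by-bit solution.
    with torch.no_grad():
        U = net(x_query)
        B = torch.sign(S.t() @ U)        # n x c, entries in {-1, +1}
    return B
```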

Experimental Setup and Results

The empirical evaluation uses two benchmark datasets, CIFAR-10 and NUS-WIDE, to compare ADSH against notable baseline methods. The authors report state-of-the-art performance: ADSH attains higher mean average precision (MAP) across varying binary code lengths, consistently outperforming existing deep and non-deep supervised hashing methods. Notably, it does so with markedly less training time than symmetric alternatives, a point supported by both a time-complexity analysis and measured training times.
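For reference, the sketch below shows how MAP over Hamming ranking is typically computed for binary codes; variable names are ours, and the paper's evaluation protocol may differ in details (e.g., top-k truncation on NUS-WIDE).

```python
import torch

def mean_average_precision(query_B, db_B, relevance):
    # query_B: m x c, db_B: n x c, codes in {-1, +1}
    # relevance: m x n boolean matrix (shared label => relevant)
    c = query_B.shape[1]
    # Hamming distance via inner product: d = (c - <u, b>) / 2
    dist = (c - query_B @ db_B.t()) / 2
    aps = []
    for q in range(query_B.shape[0]):
        order = torch.argsort(dist[q])            # ascending distance
        rel = relevance[q][order].float()
        if rel.sum() == 0:
            continue                              # no relevant items
        cum_rel = torch.cumsum(rel, dim=0)
        ranks = torch.arange(1, rel.numel() + 1)
        precision_at_hit = cum_rel / ranks
        aps.append((precision_at_hit * rel).sum() / rel.sum())
    return torch.stack(aps).mean()
```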

Implications and Future Directions

ADSH offers several practical advantages for real-world ANN applications. Chief among them, its ability to use the entire database during training without prohibitive computational cost makes it well suited to the growing datasets typical of modern information retrieval and computer vision tasks. The demonstrated reduction in training time, combined with improved accuracy, makes it a practical option for organizations and researchers working with very large data collections.

From a theoretical standpoint, ADSH challenges the conventional symmetric assumption in supervised deep hashing. As dataset sizes grow while computational resources remain limited, asymmetric strategies of this kind may become central to balancing retrieval accuracy against training cost.

Conclusion

This research advances deep supervised hashing by formulating an asymmetric approach and empirically validating its advantages on standard evaluation metrics. The paper carries substantial implications for both methodology and the practical deployment of deep-learning-based ANN search, and it lays a foundation for further refinement of asymmetric hashing strategies in large-scale machine learning. Follow-up work could extend this approach to adaptive asymmetric frameworks, potentially incorporating dynamic hashing mechanisms that respond to changing dataset characteristics.