LSDA: Large Scale Detection Through Adaptation (1407.5035v3)

Published 18 Jul 2014 in cs.CV

Abstract: A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) have emerged as clear winners on object classification benchmarks, in part due to training with 1.2M+ labeled classification images. Unfortunately, only a small fraction of those labels are available for the detection task. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect detection data and label it with precise bounding boxes. In this paper, we propose Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors. Our method has the potential to enable detection for the tens of thousands of categories that lack bounding box annotations, yet have plenty of classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge demonstrates the efficacy of our approach. This algorithm enables us to produce a >7.6K detector by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2fps for the 7.6K detector). Models and software are available at


An Expert Review of "LSDA: Large Scale Detection through Adaptation"

The paper "LSDA: Large Scale Detection through Adaptation" addresses a central obstacle to scaling object detection: labeled data. Classification models are trained on more than 1.2 million labeled images, but only a small fraction of those labels are available for detection, and precise bounding boxes are far more expensive to collect than image-level labels. This gap in data availability limits how many categories a detector can cover.

The authors propose Large Scale Detection through Adaptation (LSDA), an algorithm designed to bridge the gap between the success of deep convolutional neural networks (CNNs) in classification and the scarcity of annotated detection data. LSDA learns the difference between the classification and detection tasks and transfers that knowledge to classifiers for categories that lack bounding box annotations, turning them into detectors. This makes it possible to build detectors for the many categories that have ample classification data but no bounding box annotations.
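The review does not spell out how the learned difference is transferred, so the following is only a minimal sketch of one way to picture it, not the authors' implementation. It assumes each category has a classifier weight vector, that some categories also have detector weights learned from bounding box data, and that a category without boxes borrows the average classifier-to-detector offset of its most similar categories. The function names, the toy weights, the cosine similarity measure, and the parameter `k` are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def adapt_classifier(target, cls_weights, det_weights, k=2):
    """Shift a classifier's weights by the offset seen in similar categories.

    cls_weights: {category: classifier weights} for every category.
    det_weights: {category: detector weights} only for categories that
                 have bounding box data (an assumption of this sketch).
    """
    w = cls_weights[target]
    # Rank the categories that have detectors by similarity to the target.
    neighbors = sorted(
        det_weights,
        key=lambda c: cosine(w, cls_weights[c]),
        reverse=True,
    )[:k]
    # Offset = what changed when each neighbor's classifier became a detector.
    offsets = [
        [d - c for d, c in zip(det_weights[n], cls_weights[n])]
        for n in neighbors
    ]
    mean_offset = [sum(col) / len(offsets) for col in zip(*offsets)]
    # Apply the averaged offset to the target category's classifier.
    return [a + b for a, b in zip(w, mean_offset)]


# Toy weights, invented purely for illustration.
cls_weights = {
    "cat": [1.0, 0.0, 0.2],
    "dog": [0.9, 0.1, 0.3],
    "car": [0.0, 1.0, 0.1],
    "fox": [0.95, 0.05, 0.25],  # classification data only, no boxes
}
det_weights = {
    "cat": [1.2, -0.1, 0.2],
    "dog": [1.1, 0.0, 0.3],
    "car": [0.0, 1.3, 0.0],
}

fox_detector = adapt_classifier("fox", cls_weights, det_weights, k=2)
print(fox_detector)
```

In this toy setup, "fox" is closest to "cat" and "dog", so it inherits their shared offset rather than the one learned for "car". Averaging over a few neighbors instead of all categories is a design choice of the sketch: it keeps the borrowed adjustment specific to visually related classes.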

Empirically, the authors evaluate LSDA on the ImageNet LSVRC-2013 detection challenge and report that the results demonstrate its efficacy. They also use the algorithm to build a detector covering more than 7,600 categories from classification data at the leaf nodes of the ImageNet hierarchy, without relying on labor-intensive bounding box annotation for those categories. Finally, they show how to modify the architecture for speed, with the 7.6K-category detector running at 2 frames per second.

The implications of LSDA are both practical and theoretical. Practically, it reduces the need to acquire and annotate large volumes of detection data, which lowers the barrier to deploying object detection across diverse real-world applications. Theoretically, it invites further study of transfer learning and adaptation in neural networks and offers a template for reusing classification knowledge in detection.

Methods like LSDA may shape future object detection research on efficiency, accuracy, and scalability. The results suggest several directions for follow-up work, including better automated labeling techniques and detectors that adapt more readily to large and varied datasets. The paper also gives researchers a foundation for extending detector coverage and for exploring cross-domain adaptation strategies in other AI contexts.


Authors (8)

Judy Hoffman (75 papers)
Sergio Guadarrama (19 papers)
Eric Tzeng (17 papers)
Ronghang Hu (26 papers)
Jeff Donahue (26 papers)
Ross Girshick (75 papers)
Trevor Darrell (324 papers)
Kate Saenko (178 papers)

Citations (332)
