Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark (1909.00133v2)

Published 31 Aug 2019 in cs.CV

Abstract: Substantial efforts have been devoted more recently to presenting various methods for object detection in optical remote sensing images. However, the current survey of datasets and deep learning based methods for object detection in optical remote sensing images is not adequate. Moreover, most of the existing datasets have some shortcomings, for example, the numbers of images and object categories are small scale, and the image diversity and variations are insufficient. These limitations greatly affect the development of deep learning based object detection methods. In the paper, we provide a comprehensive review of the recent deep learning based object detection progress in both the computer vision and earth observation communities. Then, we propose a large-scale, publicly available benchmark for object DetectIon in Optical Remote sensing images, which we name as DIOR. The dataset contains 23463 images and 192472 instances, covering 20 object classes. The proposed DIOR dataset 1) is large-scale on the object categories, on the object instance number, and on the total image number; 2) has a large range of object size variations, not only in terms of spatial resolutions, but also in the aspect of inter- and intra-class size variability across objects; 3) holds big variations as the images are obtained with different imaging conditions, weathers, seasons, and image quality; and 4) has high inter-class similarity and intra-class diversity. The proposed benchmark can help the researchers to develop and validate their data-driven methods. Finally, we evaluate several state-of-the-art approaches on our DIOR dataset to establish a baseline for future research.

Authors (5)

Ke Li
Gang Wan
Gong Cheng
Liqiu Meng
Junwei Han

Citations (1,248)


Summary

Overview of "Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark"

The paper "Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark" by Ke Li, Gang Wan, Gong Cheng, Liqiu Meng, and Junwei Han reviews deep learning-based methods for object detection in optical remote sensing imagery. It also proposes a new large-scale, publicly available dataset named DIOR (object DetectIon in Optical Remote sensing images).

Survey of Object Detection Methods

The paper reviews recent advances in object detection in both the computer vision and earth observation communities. It highlights the success of deep learning techniques, particularly Convolutional Neural Networks (CNNs), in object detection tasks. The review is divided into two main parts:

Object Detection in Natural Scene Images:
- Datasets: The paper reviews notable datasets such as PASCAL VOC, MSCOCO, and the ImageNet Detection Challenge, all of which were instrumental in advancing object detection algorithms.
- Methods: The paper groups methods into region-proposal-based methods (e.g., R-CNN, Fast R-CNN, Faster R-CNN) and regression-based methods (e.g., YOLO, SSD, RetinaNet). It traces how each family evolved, describing each method's mechanism and its improvements over its predecessors.

Object Detection in Optical Remote Sensing Images:
- Datasets: Several datasets are discussed, including NWPU VHR-10, UCAS-AOD, and DOTA, along with their limitations in image diversity, number of object categories, and object size variation.
- Methods: As with natural scene images, the reviewed remote sensing methods are often adaptations of region-proposal techniques such as Faster R-CNN. The paper also covers approaches tailored to properties specific to remote sensing imagery, such as rotation invariance and multi-scale object detection.
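
Detectors from both families usually emit many overlapping candidate boxes, which are commonly pruned with non-maximum suppression (NMS) based on intersection over union (IoU). The following is a minimal sketch of that generic post-processing step, not code from the paper; the axis-aligned box format, the function names `iou` and `nms`, and the 0.5 threshold are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box above the threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(nms(candidates, confidences))  # indices of the boxes that survive
```

Axis-aligned boxes are used here for simplicity. Rotation-aware detectors of the kind the survey mentions would need an IoU defined on oriented boxes instead.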

Introduction of the DIOR Dataset

A critical contribution of this paper is the introduction of the DIOR dataset. The dataset is comprehensive, covering 23,463 images and 192,472 instances spanning 20 object categories. It presents significant advancements in several aspects:

Scale: DIOR is large-scale in the number of object categories, object instances, and images, which addresses the small scale of many earlier remote sensing datasets.
Variability: DIOR covers a wide range of object sizes, both across spatial resolutions and within and between classes, and its images were captured under different imaging conditions, weather, seasons, and image quality.
Inter-class and Intra-class Diversity: High similarity between different classes and substantial diversity within the same class make the dataset useful for developing robust object detection models.
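
To put the headline numbers in perspective, the short calculation below derives two averages from the reported totals. Only the three totals come from the paper; the variable names are illustrative, and the per-class figure is a plain mean that says nothing about how instances are actually distributed across the 20 classes.

```python
# Totals reported for DIOR in the paper's abstract.
NUM_IMAGES = 23_463
NUM_INSTANCES = 192_472
NUM_CLASSES = 20

# Simple averages; the real per-image and per-class counts may be far from uniform.
instances_per_image = NUM_INSTANCES / NUM_IMAGES
instances_per_class = NUM_INSTANCES / NUM_CLASSES

print(f"average instances per image: {instances_per_image:.2f}")  # 8.20
print(f"average instances per class: {instances_per_class:.1f}")  # 9623.6
```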

Benchmarking State-of-the-Art Methods

The paper benchmarks several contemporary deep learning-based object detection methods on the DIOR dataset to provide baseline performances for future research. Methods evaluated include R-CNN, Faster R-CNN, YOLOv3, SSD, RetinaNet, and others. Important observations from the experimental results are:

Backbone Networks: Depth and architecture of backbone networks significantly influence performance. Deep architectures like ResNet-101 and Hourglass-104 show superior results.
Feature Pyramids: Using feature pyramids, as in FPN and PANet, improves detection accuracy, particularly for objects of varying scales.
Small Object Detection: YOLOv3 exhibits strengths in detecting small objects due to its multi-scale prediction approach.
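
Detection baselines of this kind are typically compared by mean average precision (mAP). This summary does not state the exact evaluation protocol, so the following is only a sketch of one common variant (area under the precision-recall curve with all-point interpolation). It assumes each detection has already been marked as a true or false positive, for example by matching against ground truth at an IoU threshold; the function names and input format are hypothetical.

```python
from typing import List, Tuple


def average_precision(detections: List[Tuple[float, bool]], num_ground_truth: int) -> float:
    """AP for one class.

    detections: (confidence, is_true_positive) pairs; the matching that
    produces the boolean flag is assumed to have been done beforehand.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas under the enveloped precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """mAP: unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0


if __name__ == "__main__":
    toy = [(0.9, True), (0.8, False), (0.7, True)]  # made-up detections
    print(average_precision(toy, num_ground_truth=2))
```

For a 20-class dataset such as DIOR, `mean_average_precision` would be called on a list of 20 per-class AP values.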

Implications and Future Directions

The introduction of the DIOR dataset is poised to facilitate research and development of more effective object detection algorithms in the domain of optical remote sensing imagery. The diverse and extensive nature of the dataset addresses several limitations present in previous datasets.

The benchmark results serve as a baseline for future work. Future research could draw on scale-aware training schemes such as SNIP and SNIPER to improve detection under extreme scale variation. The survey also helps researchers by summarizing the current state of the field, identifying key challenges, and suggesting directions for future work.

The DIOR dataset and associated benchmarks will likely stimulate advancements in various applications requiring precise object detection from remote sensing data, such as urban planning, precision agriculture, and intelligent monitoring systems. Further exploration into semi-supervised and unsupervised learning methods might also expand the horizons for training robust models with less annotated data, enhancing the applicability and utility of deep learning in remote sensing.

In conclusion, this paper significantly adds to the literature by providing an extensive survey, introducing a challenging and valuable dataset, and setting benchmarks to guide future research endeavors in the field of object detection in optical remote sensing images.
