A Survey on Object Detection in Optical Remote Sensing Images (1603.06201v2)

Published 20 Mar 2016 in cs.CV

Abstract: Object detection in optical remote sensing images, being a fundamental but challenging problem in the field of aerial and satellite image analysis, plays an important role for a wide range of applications and is receiving significant attention in recent years. While enormous methods exist, a deep review of the literature concerning generic object detection is still lacking. This paper aims to provide a review of the recent progress in this field. Different from several previously published surveys that focus on a specific object class such as building and road, we concentrate on more generic object categories including, but are not limited to, road, building, tree, vehicle, ship, airport, urban-area. Covering about 270 publications we survey 1) template matching-based object detection methods, 2) knowledge-based object detection methods, 3) object-based image analysis (OBIA)-based object detection methods, 4) machine learning-based object detection methods, and 5) five publicly available datasets and three standard evaluation metrics. We also discuss the challenges of current studies and propose two promising research directions, namely deep learning-based feature representation and weakly supervised learning-based geospatial object detection. It is our hope that this survey will be beneficial for the researchers to have better understanding of this research field.

Authors (2)

Gong Cheng (78 papers)
Junwei Han (87 papers)

Citations (1,123)


Summary

Review of "A Survey on Object Detection in Optical Remote Sensing Images"

Introduction

Object detection in optical remote sensing images (RSIs) is a complex and essential task in aerial and satellite image analysis, enabling applications such as environmental monitoring, urban planning, and geographic information system (GIS) updates. The survey by Cheng and Han examines the progress made in this field by reviewing roughly 270 publications. Unlike previous surveys that focus on a specific object class, this paper covers a broader range of object categories, including roads, buildings, trees, vehicles, ships, airports, and urban areas. It groups object detection methods into four primary types (template matching-based, knowledge-based, object-based image analysis (OBIA)-based, and machine learning-based) and discusses five publicly available datasets and three standard evaluation metrics.

Taxonomy and Categorization

The survey segments object detection methods into four types, providing readers with a structured overview:

Template Matching-Based Methods: These methods generate a template for the target object and match it against the source image at every candidate position, allowing for translations and rotations. The survey differentiates between rigid and deformable template matching and highlights their respective strengths and limitations. For instance, rigid template matching is computationally cheaper but sensitive to shape variations, while deformable template matching accommodates shape deformations at a higher computational cost.
Knowledge-Based Methods: Based on predefined rules and domain knowledge, these methods recast object detection as a hypothesis-testing problem. The survey emphasizes two kinds of knowledge: geometric knowledge, which encodes prior information about object shapes, and context knowledge, which captures the relationships between objects and their surroundings.
OBIA-Based Methods: Object-based image analysis methods consider groups of homogeneous pixels to identify meaningful objects within images. The paper outlines the two-step process of image segmentation and object classification, noting the advantages of incorporating GIS functionalities and the challenges in defining appropriate segmentation parameters.
Machine Learning-Based Methods: Recent advances in machine learning have significantly influenced object detection. The paper pays particular attention to feature extraction methods used for detection, such as Histogram of Oriented Gradients (HOG), Bag-of-Words (BoW), texture features, sparse representation, and Haar-like features. The survey also reviews several classifiers, including SVM, AdaBoost, k-NN, CRF, SRC, and ANN.
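
To make the first category concrete, the following is a minimal sketch of rigid template matching, assuming normalized cross-correlation (NCC) as the similarity measure and an exhaustive scan over translations only. The function names `ncc` and `match_template`, the toy image, and the choice of NCC are illustrative assumptions, not the survey's own implementation; handling rotation or deformation would require additional templates or a deformable model.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) *
                    sum((b - mean_t) ** 2 for b in t))
    # A flat patch has zero variance; treat it as "no match".
    return num / den if den > 0 else 0.0

def match_template(image, template):
    """Slide the template over every position; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_score, best_pos = -2.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos

# Hypothetical toy data: a cross-shaped template and an image that
# contains the same shape at a different brightness and contrast.
template = [[0, 9, 0],
            [9, 9, 9],
            [0, 9, 0]]
image = [[1, 1, 1, 1, 1, 1],
         [1, 1, 1, 1, 1, 1],
         [1, 1, 5, 1, 1, 1],
         [1, 5, 5, 5, 1, 1],
         [1, 1, 5, 1, 1, 1],
         [1, 1, 1, 1, 1, 1]]

score, position = match_template(image, template)
print(score, position)
```

Because NCC subtracts the mean and divides by the standard deviation, the match is insensitive to linear brightness changes, but any change in the object's shape lowers the score, which illustrates the sensitivity of rigid templates noted above.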

Datasets and Evaluation Metrics

The paper identifies five publicly available datasets that facilitate the evaluation and comparison of object detection methods:

NWPU VHR-10: Contains diverse objects such as airplanes, ships, storage tanks, and vehicles.
SZTAKI-INRIA Building Detection Dataset: Focuses on building identification.
TAS Aerial Car Detection Dataset: Targets vehicle detection.
OIRDS: Designed for vehicle detection algorithms.
IITM Road Extraction Dataset: Aims at road detection.

Standard evaluation metrics discussed include precision-recall curves (PRC), F-measure, and average precision (AP). These metrics provide a rigorous framework for assessing the efficacy of object detection methods.
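
As a rough illustration of how these three metrics relate, the sketch below computes precision, recall, and F-measure from detection counts, and AP as the non-interpolated area under the precision-recall curve. It assumes each detection has already been labeled as a true or false positive (for example by an overlap test against ground truth); the function names and sample detections are hypothetical, and benchmark protocols may use an interpolated AP variant instead.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F-measure (harmonic mean) from counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def average_precision(detections, num_ground_truth):
    """Non-interpolated AP.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of annotated objects in the test set.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = 0
    ap = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            tp += 1
            # Precision at the rank where recall increases by 1/num_ground_truth.
            ap += tp / rank
    return ap / num_ground_truth if num_ground_truth else 0.0

# Hypothetical detector output: 5 detections, 4 ground-truth objects.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.6, True), (0.5, False)]
p, r, f1 = precision_recall_f1(tp=3, fp=2, fn=1)
ap = average_precision(detections, num_ground_truth=4)
print(p, r, f1, ap)
```

Precision and recall describe a single operating point, the F-measure summarizes that point in one number, and AP summarizes the whole ranking across confidence thresholds.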

Challenges and Future Directions

The paper outlines several challenges in current studies, such as handling the variability in object appearance, dealing with occlusions and complex backgrounds, and meeting the requirements of various application areas. To address these challenges, the authors suggest two promising research directions:

Deep Learning-Based Feature Representation: The paper notes that most existing methods rely on handcrafted or shallow learning-based features. Deep learning offers stronger feature representations, which can potentially improve the accuracy and robustness of object detection methods, although it depends on large training datasets and is computationally expensive.
Weakly Supervised Learning-Based Geospatial Object Detection: Given the difficulty of obtaining adequately labeled training data, weakly supervised learning (WSL) presents a viable alternative. This approach reduces the annotation burden by training with binary labels that only indicate whether an image contains the target object, and the survey highlights recent efforts demonstrating the feasibility of WSL for geospatial object detection. Future work should focus on developing WSL frameworks that can handle multiple classes and on improving detection performance.

Conclusion

Cheng and Han offer an extensive survey of object detection methods in optical RSIs, categorizing existing techniques and discussing their strengths and limitations. By providing a detailed review of datasets and evaluation metrics, and suggesting future research directions, this survey serves as a valuable resource for researchers in the field. The emphasis on deep learning and weakly supervised learning points to significant areas of potential advancement, setting the stage for future innovations in object detection in optical remote sensing images.
