Remote Sensing Image Scene Classification Meets Deep Learning: Challenges, Methods, Benchmarks, and Opportunities (2005.01094v2)

Published 3 May 2020 in cs.CV

Abstract: Remote sensing image scene classification, which aims at labeling remote sensing images with a set of semantic categories based on their contents, has broad applications in a range of fields. Propelled by the powerful feature learning capabilities of deep neural networks, remote sensing image scene classification driven by deep learning has drawn remarkable attention and achieved significant breakthroughs. However, to the best of our knowledge, a comprehensive review of recent achievements regarding deep learning for scene classification of remote sensing images is still lacking. Considering the rapid evolution of this field, this paper provides a systematic survey of deep learning methods for remote sensing image scene classification by covering more than 160 papers. To be specific, we discuss the main challenges of remote sensing image scene classification and survey (1) Autoencoder-based remote sensing image scene classification methods, (2) Convolutional Neural Network-based remote sensing image scene classification methods, and (3) Generative Adversarial Network-based remote sensing image scene classification methods. In addition, we introduce the benchmarks used for remote sensing image scene classification and summarize the performance of more than two dozen of representative algorithms on three commonly-used benchmark data sets. Finally, we discuss the promising opportunities for further research.

PDF Abstract

Remote Sensing Image Scene Classification Meets Deep Learning

The paper presents an extensive survey of remote sensing image scene classification with a focus on the impact of deep learning methodologies. Remote sensing imagery is an invaluable data source for observing and measuring various structures on Earth. Traditionally, the interpretation of these images has been a challenging task due to their complexity and the need for semantic categorization of different scenes.

Key Challenges

The paper begins by identifying the main challenges in remote sensing image scene classification:

Intraclass Diversity: Variations in appearance within the same category, such as differences in style, shape, and scale, create substantial challenges.
Interclass Similarity: Shared features between different categories can result in low class separability.
Scale Variance: Objects can appear in various sizes due to differences in image capture altitudes.
Multiple Objects Coexistence: A single image may contain multiple distinct objects, complicating semantic labeling.

Deep Learning Approaches

The authors categorize deep learning methods into three main approaches:

Autoencoder-Based Methods: These unsupervised models aim to learn mid-level feature representations automatically from data. Stacked autoencoders have been utilized to enhance feature discrimination, though they often fall short due to limited use of labeled data.
CNN-Based Methods: Convolutional Neural Networks, especially when fine-tuning pre-trained models or using them as feature extractors, have substantially advanced the classification metrics. Methods that design specialized CNN architectures or optimize loss functions have demonstrated remarkable performance improvements.
GAN-Based Methods: Generative Adversarial Networks have been leveraged to generate training samples and learn features in an adversarial manner, addressing the scarcity of labeled data. However, they still lag behind CNNs in direct classification tasks.

Benchmarks and Performance

Significant benchmarks like UC-Merced, AID, and NWPU-RESISC45 enable standardized evaluation of classification methods. The paper provides detailed comparisons of various algorithms across these datasets. While CNN-based approaches currently lead in performance, especially with large training datasets, there remains substantial room for improvement in unsupervised methods like GANs.

Implications and Future Directions

The paper highlights several avenues for future research:

Feature Representation: Developing methods for more discriminative and multi-scale feature learning remains critical.
Multi-Label Classification: Addressing the complexity of scenes with multiple labels could enhance real-world applicability.
Expanding Data Sets: Creating larger and more diverse datasets would support the training of robust models.
Unsupervised Learning: Greater emphasis on algorithms that require less labeled data could alleviate annotation costs.
Efficient Models: Research into compact, efficient models is necessary to deploy AI in resource-constrained environments.
Cross-Domain Generalization: Developing models that generalize across different imaging conditions or platforms presents significant potential for research.

In conclusion, this comprehensive survey underscores the profound influence of deep learning on remote sensing image scene classification while identifying ongoing challenges and prospective research directions. The field stands to benefit significantly from innovations in model efficiency and data utilization, aligning AI's capabilities more closely with human-level scene interpretation.

PDF Markdown Bookmark Chat (Pro)

Authors (5)

Gong Cheng (78 papers)
Xingxing Xie (5 papers)
Junwei Han (87 papers)
Lei Guo (110 papers)
Gui-Song Xia (139 papers)

Citations (552)

View on Semantic Scholar