Unsupervised Deep Feature Extraction for Remote Sensing Image Classification (1511.08131v1)

Published 25 Nov 2015 in cs.CV

Abstract: This paper introduces the use of single layer and deep convolutional networks for remote sensing data analysis. Direct application to multi- and hyper-spectral imagery of supervised (shallow or deep) convolutional networks is very challenging given the high input data dimensionality and the relatively small amount of available labeled data. Therefore, we propose the use of greedy layer-wise unsupervised pre-training coupled with a highly efficient algorithm for unsupervised learning of sparse features. The algorithm is rooted on sparse representations and enforces both population and lifetime sparsity of the extracted features, simultaneously. We successfully illustrate the expressive power of the extracted representations in several scenarios: classification of aerial scenes, as well as land-use classification in very high resolution (VHR), or land-cover classification from multi- and hyper-spectral images. The proposed algorithm clearly outperforms standard Principal Component Analysis (PCA) and its kernel counterpart (kPCA), as well as current state-of-the-art algorithms of aerial classification, while being extremely computationally efficient at learning representations of data. Results show that single layer convolutional networks can extract powerful discriminative features only when the receptive field accounts for neighboring pixels, and are preferred when the classification requires high resolution and detailed results. However, deep architectures significantly outperform single layers variants, capturing increasing levels of abstraction and complexity throughout the feature hierarchy.

Summary

The paper presents an unsupervised deep learning approach that extracts sparse, hierarchical features from remote sensing images.
It employs greedy layer-wise pre-training and the EPLS algorithm to efficiently capture spatial-spectral structures, outperforming standard PCA and its kernel counterpart (kPCA).
Experimental results demonstrate substantial classification accuracy improvements across aerial, VHR, multispectral, and hyperspectral imagery, highlighting enhanced scalability.

Unsupervised Deep Feature Extraction for Remote Sensing Image Classification

This essay provides a detailed overview of the paper "Unsupervised Deep Feature Extraction for Remote Sensing Image Classification" authored by Adriana Romero, Carlo Gatta, and Gustau Camps-Valls. The paper applies unsupervised single-layer and deep convolutional networks to remote sensing, covering aerial scene classification, land-use classification in very high resolution (VHR) imagery, and land-cover classification from multi- and hyper-spectral images.

Context and Challenges

The classification of remote sensing images is a critical task in earth observation, and it often requires processing high-dimensional data with only a limited number of labeled examples. Traditional feature extraction methods such as PCA, while effective in certain contexts, struggle with the increasing complexity and dimensionality of modern remote sensing data, and supervised convolutional networks are hard to train directly when labels are this scarce. The paper addresses these challenges with deep architectures trained without supervision, which extract sparse, hierarchical features without the need for extensive labeled datasets.

Methodology

The proposed method employs deep convolutional networks trained without labels through greedy layer-wise pre-training. This is complemented by an algorithm called Enforcing Population and Lifetime Sparsity (EPLS), which learns efficient and discriminative feature representations by promoting two kinds of sparsity at once: population sparsity (each sample activates only a few features) and lifetime sparsity (each feature is active for only a few samples). Together these help the network capture spatial-spectral structures efficiently.
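To make the two sparsity notions concrete, the following minimal sketch measures them on an activation matrix. It is not part of the paper: the function name `sparsity_profile`, the threshold `tol`, and the toy data are assumptions chosen for illustration, and the code only measures sparsity; it does not implement EPLS.

```python
import numpy as np

def sparsity_profile(H, tol=1e-8):
    """Measure sparsity of an activation matrix H of shape (n_samples, n_features).

    Returns two arrays:
      population: for each sample, the fraction of features that are active
      lifetime:   for each feature, the fraction of samples on which it is active
    Low values in both arrays mean the code is sparse in both senses.
    """
    active = np.abs(H) > tol          # boolean mask of "active" entries
    population = active.mean(axis=1)  # one value per sample (row)
    lifetime = active.mean(axis=0)    # one value per feature (column)
    return population, lifetime

# Toy code (hypothetical data): 8 samples, 4 features, one active feature per
# sample, with the active feature cycling so every feature is used equally.
H = np.eye(4)[np.arange(8) % 4]
population, lifetime = sparsity_profile(H)
```

In this toy code every sample activates one feature out of four and every feature fires on two samples out of eight, so both profiles are uniform. A representation in which a single feature fires on every sample would score well on population sparsity but poorly on lifetime sparsity, which is why the two are tracked separately.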

Greedy Layer-wise Pre-training: This method exploits unsupervised learning to initialize network layers progressively, iterating on a simplified local criterion and allowing subsequent supervised fine-tuning or unsupervised feature extraction.
Sparse Feature Learning with EPLS: The EPLS algorithm distinguishes itself by minimizing meta-parameter dependency, offering adaptive and efficient learning of sparse features, crucial for high-dimensional remote sensing data.
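The layer-by-layer procedure can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: layers are fully connected rather than convolutional, pooling is omitted, and `learn_filters` uses PCA directions purely as a placeholder for the unsupervised learner (the paper uses EPLS at that step). All function names and sizes are hypothetical.

```python
import numpy as np

def learn_filters(X, n_filters):
    """Placeholder unsupervised learner: top PCA directions of X.

    Stands in for EPLS only to keep the sketch short; no labels are used.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_filters].T  # shape (n_inputs, n_filters)

def encode(X, W):
    """Layer output: linear projection followed by a ReLU non-linearity."""
    return np.maximum(X @ W, 0.0)

def greedy_pretrain(X, layer_sizes):
    """Train one layer at a time on the frozen output of the previous layer."""
    weights, H = [], X
    for n_filters in layer_sizes:
        W = learn_filters(H, n_filters)  # local, label-free criterion
        weights.append(W)
        H = encode(H, W)                 # becomes the input of the next layer
    return weights, H

# Hypothetical unlabeled data: 200 samples with 16 input dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
weights, features = greedy_pretrain(X, layer_sizes=[8, 4])
```

The point of the sketch is the control flow: each layer is fitted once, in isolation, and earlier layers are never revisited while later ones are trained. The resulting `features` can be fed to any classifier, or the stacked weights can serve as the initialization for supervised fine-tuning.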

Numerical Results and Comparative Analysis

Numerous experiments are conducted across different data types, including aerial, VHR, multispectral, and hyperspectral images. The outcomes demonstrate superior performance over conventional methods:

Aerial Scene Classification: The method achieves remarkable accuracy improvements over existing techniques, highlighting its capability of leveraging deep architectures to capture intricate scene details. For example, a 3-layer deep network configuration surpasses the single-layer baselines significantly.
VHR Image Classification: The experiments underline the algorithm's scalability in managing higher spatial resolutions, with substantial classification accuracy increases observed as the model depth grows.
Multispectral and Hyperspectral Image Classification: The deep convolutional networks consistently outperform traditional PCA and kPCA methods. Specifically, the paper's exploration of scalability with varying numbers of features and layers demonstrates robust performance even with limited labeled data.
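For reference, the PCA baseline mentioned above can be sketched as a per-pixel spectral projection. This is a generic illustration, not the paper's experimental code: the function name `pca_features`, the cube dimensions, and the random data are assumptions.

```python
import numpy as np

def pca_features(cube, n_components):
    """Project each pixel spectrum of a (rows, cols, bands) cube onto the
    top principal components, returning (rows, cols, n_components)."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)  # one row per pixel
    Xc = X - X.mean(axis=0)                    # center each band
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T          # PCA scores per pixel
    return scores.reshape(rows, cols, n_components)

# Hypothetical hyperspectral cube: 10 x 12 pixels, 30 spectral bands.
rng = np.random.default_rng(0)
cube = rng.random((10, 12, 30))
feats = pca_features(cube, n_components=5)
```

Note that this baseline treats every pixel independently and uses only its spectrum, so it has no access to neighboring pixels. The convolutional features discussed in the paper differ precisely in that their receptive field covers a spatial neighborhood.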

Implications and Future Directions

The introduction of unsupervised deep learning for remote sensing image classification opens several avenues for future research and application. The ability to efficiently learn feature hierarchies without labeled samples is particularly beneficial, given the scarcity of annotations in remote sensing datasets. The paper suggests potential extensions such as adapting the framework for multi-temporal and multi-angular data, optimizing feature sparsity across layers, and exploring unsupervised fine-tuning methods.

The insights provided by this research mark a significant step towards more autonomous and scalable remote sensing classification systems. As the volume and complexity of data continue to grow, such methods could prove essential in extracting actionable insights with minimal human intervention.
