Learning Low Dimensional Convolutional Neural Networks for High-Resolution Remote Sensing Image Retrieval: An Analysis
Remote sensing image retrieval requires the efficient extraction and representation of image features, a challenge exacerbated by the complexity and sheer volume of high-resolution remote sensing (HRRS) imagery. Traditional approaches often rely on low-level hand-crafted features, such as spectral, shape, and texture descriptors, which fall short because they are labor-intensive to design and limited in their ability to capture the multifaceted content of remote sensing data. This paper by Zhou et al. advances the field by leveraging convolutional neural networks (CNNs) to learn deep feature representations for HRRS image retrieval.
Key Contributions
The paper introduces two pivotal schemes:
- The application of pre-trained CNN models for feature extraction.
- An innovative CNN architecture designed to generate low-dimensional features, combining conventional convolution layers with a three-layer perceptron.
Through the use of CNNs, the authors circumvent the limitations of handcrafted features, extracting both global features from the fully-connected layers and local features from the convolutional layers. The paper evaluates the efficacy of these schemes across several challenging remote sensing datasets: UC Merced (UCMD), WHU-RS (RSD), RSSCN7, and AID.
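Once deep features have been extracted, retrieval itself reduces to nearest-neighbor ranking in feature space. The following minimal sketch (not the authors' code; the feature vectors are assumed to have already been extracted, e.g. from a pre-trained CNN's fully-connected layer) ranks database images by cosine similarity to a query:

```python
import numpy as np

def retrieve(query_feat, db_feats, top_k=5):
    """Rank database images by cosine similarity to the query feature."""
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = db @ q                      # cosine similarity to each database image
    return np.argsort(-sims)[:top_k]   # indices of the most similar images

# Toy example with random stand-in "features": the query is identical
# to database item 2, so item 2 ranks first.
rng = np.random.default_rng(0)
db_feats = rng.normal(size=(4, 8))
query = db_feats[2].copy()
print(retrieve(query, db_feats, top_k=2))
```

On L2-normalized features, ranking by Euclidean distance would yield the same order as cosine similarity.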
Numerical Results and Findings
The paper presents a comprehensive performance evaluation of several CNN models pre-trained on ImageNet, including AlexNet, CaffeRef, the VGGF/VGGM/VGGS variants, and the very deep VD16/VD19 networks. Performance is measured with the average normalized modified retrieval rank (ANMRR, where lower values are better) and mean average precision (mAP, where higher values are better).
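Both metrics can be computed directly from the ranked positions of the relevant images. The sketch below follows the standard MPEG-7 definition of ANMRR together with average precision; it is an illustrative reconstruction, not the paper's evaluation code:

```python
def nmrr(ranks, ng, gtm):
    """Normalized modified retrieval rank for one query (MPEG-7 definition).
    ranks: 1-indexed positions of the ng relevant items in the ranked list;
    gtm: the largest ground-truth set size over all queries."""
    k = min(4 * ng, 2 * gtm)                          # cutoff rank K(q)
    penalized = [r if r <= k else 1.25 * k for r in ranks]
    avr = sum(penalized) / ng                         # average rank
    mrr = avr - 0.5 - ng / 2                          # modified retrieval rank
    return mrr / (1.25 * k - 0.5 - ng / 2)            # normalize to [0, 1]

def anmrr(all_ranks):
    """Average NMRR over queries; all_ranks is a list of rank lists."""
    gtm = max(len(r) for r in all_ranks)
    return sum(nmrr(r, len(r), gtm) for r in all_ranks) / len(all_ranks)

def average_precision(ranks):
    """AP from the sorted 1-indexed ranks of the relevant items."""
    ranks = sorted(ranks)
    return sum((i + 1) / r for i, r in enumerate(ranks)) / len(ranks)

# Perfect retrieval: relevant items occupy the top positions.
perfect = [[1, 2, 3], [1, 2]]
print(anmrr(perfect))                    # 0.0  (lower is better)
print(average_precision([1, 2, 3]))      # 1.0  (higher is better)
```

mAP is then simply the mean of `average_precision` over all queries.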
- ANMRR and mAP Performance: On the UC Merced dataset, VGGM's Fc2 feature achieved an ANMRR of 0.378 and an mAP of 0.5444, outperforming the other models. Meanwhile, CaffeRef's Fc2 features excelled on the RSD dataset with an ANMRR of 0.283 and an mAP of 0.6460.
- Impact of Feature Aggregation Methods: Feature aggregation techniques such as BOVW, VLAD, and IFK were applied to the convolutional-layer outputs. Notably, VD16_IFK achieved a robust ANMRR of 0.407 on the UCMD dataset, showing the effectiveness of encoding local descriptors into compact representations.
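To illustrate how such aggregation works, here is a minimal numpy sketch of VLAD encoding. The codebook centers, which would normally come from k-means on training descriptors, are assumed given; this is not the authors' implementation:

```python
import numpy as np

def vlad(descriptors, centers):
    """VLAD: sum the residuals between each local descriptor and its nearest
    codebook center, then flatten and L2-normalize the result."""
    # Assign each descriptor to its nearest center.
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)
    k, dim = centers.shape
    v = np.zeros((k, dim))
    for i, c in enumerate(assign):
        v[c] += descriptors[i] - centers[c]   # accumulate residuals per center
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)    # global L2 normalization

rng = np.random.default_rng(0)
local = rng.normal(size=(100, 16))            # stand-in conv-layer descriptors
centers = rng.normal(size=(8, 16))            # hypothetical k-means codebook
enc = vlad(local, centers)
print(enc.shape)                              # (128,) = k * dim
```

The encoded vector has a fixed length k x dim regardless of how many local descriptors the image produced, which is what makes it usable as a compact retrieval representation.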
The innovative CNN architecture proposed in the second scheme, named low-dimensional CNN (LDCNN), showed superior performance on the RSD and RSSCN7 datasets, indicating that low-dimensional features can be compact yet powerful for image retrieval.
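The core LDCNN idea, replacing high-dimensional fully-connected features with a compact descriptor, can be sketched as a per-pixel multilayer perceptron (implemented as stacked 1x1 convolutions, in the style of Network-in-Network mlpconv layers) followed by global average pooling. The numpy sketch below is a simplified stand-in for the paper's architecture, with hypothetical layer widths (512 -> 128 -> 64 -> 32) and random weights in place of trained ones:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mlpconv_gap(feat_map, w1, w2, w3):
    """Reduce a conv feature map (H, W, C) to a low-dimensional vector with a
    three-layer per-pixel MLP (equivalent to 1x1 convolutions) followed by
    global average pooling. Weight shapes: (C, H1), (H1, H2), (H2, D)."""
    h = relu(feat_map @ w1)        # 1x1 conv = channel-wise matmul per pixel
    h = relu(h @ w2)
    h = relu(h @ w3)
    return h.mean(axis=(0, 1))     # global average pooling -> D-dim descriptor

rng = np.random.default_rng(0)
fmap = rng.normal(size=(7, 7, 512))            # e.g. a last-conv-layer output
w1 = rng.normal(size=(512, 128)) * 0.05        # hypothetical, untrained weights
w2 = rng.normal(size=(128, 64)) * 0.05
w3 = rng.normal(size=(64, 32)) * 0.05
feature = mlpconv_gap(fmap, w1, w2, w3)
print(feature.shape)                           # (32,) low-dimensional feature
```

Compared with a 4096-dimensional fully-connected feature, a 32-dimensional descriptor like this one greatly reduces storage and distance-computation cost at retrieval time.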
Theoretical and Practical Implications
The paper bridges the gap between conventional remote sensing feature extraction methods and modern deep learning techniques. By demonstrating the applicability of CNNs to high-resolution remote sensing image retrieval (HRRSIR), it affirms the models' capability to generalize across different datasets and highlights transfer learning as a practical strategy in resource-limited scenarios. The proposed LDCNN not only reduces model complexity but also improves retrieval efficiency, making it particularly relevant for large-scale remote sensing applications.
Future Directions
The exploration of deep learning in HRRS image retrieval is poised for growth. Future work may further optimize CNN architectures for remote sensing tasks, explore generative models for data augmentation, and improve transferability across more diverse remote sensing datasets. Constructing larger benchmark datasets, in the spirit of the Terrapattern project, could also address scalability concerns and enrich model training and evaluation.
The comprehensive examination and promising results presented in this paper pave the way for future research in applying advanced deep learning frameworks to remote sensing, offering valuable insights for both practical implementations and theoretical advancements in image retrieval systems.