- The paper introduces the PatternNet dataset, filling a gap in RSIR benchmarks with 38 classes and 800 images per class for deep learning research.
- It evaluates both handcrafted and CNN-based methods, with deep features, particularly those from ResNet, achieving the best retrieval accuracy and extraction efficiency.
- The study argues that large, diverse datasets are essential for learning better feature representations and refining algorithms for scalable remote sensing image retrieval.
An Evaluation of Remote Sensing Image Retrieval with the PatternNet Benchmark
The paper introduces PatternNet, a large-scale remote sensing image retrieval (RSIR) dataset curated expressly for evaluating the performance of diverse retrieval methods. Remote sensing imagery has grown rapidly in spatial resolution and acquisition rate, creating significant data-management and retrieval challenges. The core objective of RSIR is to efficiently find relevant images in large remote-sensing archives, and retrieval performance depends heavily on the discriminative power of the feature representations.
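As a rough illustration, the sketch below shows the generic feature-based retrieval step shared by all of the evaluated methods: every archive image is reduced offline to a feature vector, and a query is answered by ranking archive images by feature distance. The function name and the Euclidean metric are illustrative choices, not details taken from the paper.

```python
import numpy as np

def retrieve(query_feature, archive_features, top_k=20):
    """Rank archive images by Euclidean distance to the query feature.

    query_feature:    (d,) feature vector of the query image
    archive_features: (n, d) matrix of features precomputed offline
    Returns the indices of the top_k most similar archive images.
    """
    distances = np.linalg.norm(archive_features - query_feature, axis=1)
    return np.argsort(distances)[:top_k]
```

Because the archive features are computed once and reused across queries, the choice of feature extractor dominates both retrieval accuracy and the per-image extraction cost that the paper measures.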
Limitations of Current Datasets and the Need for PatternNet
Existing benchmark datasets are often inadequate for RSIR because they were curated primarily for land-use/land-cover classification and contain only a small number of classes and images per class. This shortfall hampers the development and evaluation of new feature representations, particularly deep-learning-based ones that require large amounts of training data.
PatternNet fills this gap with 38 classes and 800 images per class, providing a diverse, large-scale dataset suitable for deep-learning approaches while avoiding the limitations of existing datasets such as UC Merced, AID, and NWPU-RESISC45. Those datasets either offer too few classes and images or are misaligned with the goals of RSIR because of varying resolutions and large amounts of background clutter.
Evaluation of Methods Using PatternNet
An extensive review of RSIR methodologies is conducted in conjunction with an evaluation on the new dataset. The methods are broadly categorized into:
- Handcrafted Feature-Based Methods:
- Low-Level Features: These include global descriptors such as color histograms and Gabor textures, and local descriptors such as local binary patterns (LBP) and the pyramid histogram of oriented gradients (PHOG). They are efficient to compute, but their heuristic design limits how well they generalize across scenarios.
- Mid-Level Features: Techniques such as bag of visual words (BOVW), the vector of locally aggregated descriptors (VLAD), and the improved Fisher kernel (IFK) aggregate local descriptors such as SIFT into mid-level representations (a minimal BOVW encoding sketch follows this list). VLAD and IFK are more discriminative than BOVW but incur higher storage overhead.
- Deep Learning Feature-Based Methods:
- Unsupervised and CNN-Based Features: Deep learning techniques, notably CNNs, provide markedly stronger feature representations. The evaluation highlights networks such as AlexNet and ResNet, with ResNet emerging as the strongest in both extraction efficiency and retrieval accuracy (a CNN feature-extraction sketch also follows this list). In addition, low-dimensional CNN (LDCNN) features offer compact representations well suited to large-scale RSIR.
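To make the mid-level encoding concrete, here is a minimal BOVW sketch, assuming OpenCV's SIFT implementation and scikit-learn's KMeans; the vocabulary size and normalization are illustrative choices rather than the paper's settings.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(train_images, vocab_size=256):
    """Cluster SIFT descriptors from a training set into a visual vocabulary."""
    sift = cv2.SIFT_create()
    descriptors = []
    for img in train_images:  # grayscale uint8 arrays
        _, desc = sift.detectAndCompute(img, None)
        if desc is not None:
            descriptors.append(desc)
    return KMeans(n_clusters=vocab_size, n_init=10).fit(np.vstack(descriptors))

def bovw_feature(img, vocabulary):
    """Encode an image as an L1-normalized histogram of visual-word counts."""
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(img, None)
    hist = np.zeros(vocabulary.n_clusters, dtype=np.float32)
    if desc is not None:
        for word in vocabulary.predict(desc):
            hist[word] += 1
    return hist / max(hist.sum(), 1)
```

VLAD and IFK follow the same vocabulary-based pattern but aggregate descriptor residuals or gradient statistics instead of counts, which is why they are more discriminative yet produce much longer feature vectors.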
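For the CNN-based features, a common off-the-shelf setup is to use the penultimate activations of an ImageNet-pretrained network as the image descriptor. The sketch below uses torchvision's ResNet-50 purely for illustration; the paper's exact architectures, layers, and any fine-tuning on remote sensing data may differ.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained ResNet-50 with the classification head removed, so the forward
# pass returns the 2048-d global-average-pooled feature used for retrieval.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def cnn_feature(path):
    """Extract a 2048-d feature vector from an image file."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return backbone(img).squeeze(0).numpy()
```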
Significance and Future Directions
PatternNet establishes a significant benchmark for RSIR, fostering the development of novel algorithms, especially those based on deep learning. It addresses the gaps presented by prior datasets, providing ample data volume and class diversity to support the iterative refinement of RSIR methods.
This paper underscores the field's transition toward more sophisticated RSIR methods, arguing for the deeper, more expressive feature extraction that deep learning models provide. While progress is considerable, challenges remain around large-scale labeled datasets and computational efficiency. Future research should focus on stronger supervised learning paradigms, more effective use of transfer learning, and a better balance between feature quality and extraction efficiency.
By consolidating evaluation standards through PatternNet, future work may more reliably investigate the potential of deep learning applications in remote sensing, thereby advancing the domain towards more accurate and scalable image retrieval solutions.