- The paper introduces SEN1-2, a large-scale dataset of 282,384 co-registered SAR-optical image patch pairs intended to advance deep learning for multi-sensor fusion in remote sensing.
- It employs a rigorous curation pipeline of random region sampling, cloud-coverage analysis, and manual inspection to ensure high-quality, precisely aligned imagery.
- Preliminary results, including 93% accuracy in SAR-optical patch matching with pseudo-siamese CNNs, demonstrate its potential for advancing remote sensing techniques.
Overview of the SEN1-2 Dataset for Deep Learning in SAR-Optical Data Fusion
The paper "The SEN1-2 Dataset for Deep Learning in SAR-Optical Data Fusion" introduces a substantial dataset designed to advance deep learning applications in the remote sensing field, particularly focusing on the fusion of Synthetic Aperture Radar (SAR) and optical data. The SEN1-2 dataset comprises 282,384 pairs of SAR and optical image patches, offering significant support for researchers aiming to leverage deep learning techniques in multi-sensor data integration tasks.
Dataset Composition and Characteristics
The SEN1-2 dataset consists of image patch pairs derived from Sentinel-1 SAR data and Sentinel-2 optical imagery. The patches were gathered from locations across the globe and span all four meteorological seasons, providing broad coverage of Earth's diverse landscapes. The dataset is distinguished by its large volume and precise spatial alignment, both crucial for machine learning applications that demand large amounts of high-quality training data from heterogeneous sensors.
Sentinel-1 provides SAR imagery that is largely independent of weather and daylight conditions, since its active microwave signals penetrate clouds; Sentinel-2 contributes optical imagery, of which the RGB bands are used for intuitive visual interpretation. Curation involved random sampling of regions of interest, cloud-coverage analysis, mosaicking in Google Earth Engine, and manual inspection to discard patches affected by artifacts or residual clouds, ensuring the dataset's quality and usability.
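To make the paired structure concrete, the sketch below shows how such patch pairs might be loaded for training. It assumes the public release layout of seasonal ROI folders containing paired `s1_*`/`s2_*` subfolders with identically indexed PNG patches; if your copy is organized differently, adjust the glob and the path substitution accordingly.

```python
# Minimal sketch of a paired-patch loader for SEN1-2 (PyTorch).
# Assumed layout (an assumption, not specified by the paper itself):
#   <root>/ROIs1158_spring/s1_1/ROIs1158_spring_s1_1_p30.png
#   <root>/ROIs1158_spring/s2_1/ROIs1158_spring_s2_1_p30.png
from pathlib import Path

import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class Sen12Pairs(Dataset):
    """Yields (SAR, optical) patch pairs as float32 arrays in [0, 1]."""

    def __init__(self, root: str):
        # Index every Sentinel-1 patch; the Sentinel-2 partner is derived per item.
        self.sar_paths = sorted(Path(root).glob("ROIs*/s1_*/*.png"))

    def __len__(self) -> int:
        return len(self.sar_paths)

    def __getitem__(self, idx: int):
        sar_path = self.sar_paths[idx]
        # The paired optical patch lives in the sibling "s2_*" folder with an
        # identically indexed filename; swap "s1" -> "s2" in folder and filename.
        opt_path = Path(
            sar_path.as_posix().replace("/s1_", "/s2_").replace("_s1_", "_s2_")
        )
        sar = np.asarray(Image.open(sar_path), dtype=np.float32) / 255.0
        opt = np.asarray(Image.open(opt_path), dtype=np.float32) / 255.0
        return sar, opt
```

Wrapping this in a `torch.utils.data.DataLoader` then yields shuffled mini-batches of aligned SAR-optical pairs for any of the fusion tasks discussed below.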
Implications and Future Applications
The SEN1-2 dataset has significant implications for advancing SAR-optical data fusion through machine learning. It supports a range of exploratory applications in remote sensing, such as SAR image colorization, multi-modal image matching, and the generation of artificial optical images from SAR inputs. For instance, the paper reports promising results in SAR-optical image matching, achieving 93% accuracy with pseudo-siamese convolutional neural networks. Furthermore, generative models, including generative adversarial networks (GANs), trained on SEN1-2 demonstrate the ability to produce synthetic optical imagery from SAR data.
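As a concrete illustration of the matching setup, here is a minimal sketch of the pseudo-siamese idea: two convolutional streams that deliberately do not share weights, so each branch can adapt to its own modality, followed by a binary classifier over the concatenated features that decides whether a SAR patch and an optical patch correspond. The layer sizes here are illustrative assumptions, not the architecture used in the paper.

```python
# Minimal sketch of a pseudo-siamese matching network (PyTorch).
# Unlike a true siamese network, the two streams do NOT share weights,
# which lets each branch specialize to its modality (SAR vs. optical).
import torch
import torch.nn as nn


def make_stream(in_channels: int) -> nn.Sequential:
    """One convolutional feature extractor per modality."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(4),
        nn.Flatten(),  # -> 64 * 4 * 4 = 1024 features per patch
    )


class PseudoSiameseMatcher(nn.Module):
    def __init__(self):
        super().__init__()
        self.sar_stream = make_stream(in_channels=1)  # single-channel SAR
        self.opt_stream = make_stream(in_channels=3)  # RGB optical
        self.head = nn.Sequential(                    # fused binary classifier
            nn.Linear(2 * 1024, 256), nn.ReLU(),
            nn.Linear(256, 1),                        # logit: corresponding or not
        )

    def forward(self, sar: torch.Tensor, opt: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.sar_stream(sar), self.opt_stream(opt)], dim=1)
        return self.head(fused)
```

A network of this shape would typically be trained with `BCEWithLogitsLoss`, using corresponding patch pairs as positives and randomly shuffled pairs as negatives.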
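The generation experiments can likewise be sketched as a conditional GAN in the spirit of pix2pix, translating a SAR patch into an RGB estimate. The toy generator and discriminator below are placeholders chosen for brevity, not the models used in the paper, and the L1 weight of 100 is a common pix2pix default assumed here.

```python
# Minimal sketch of SAR-to-optical translation with a conditional GAN
# (pix2pix-style training step). All architectures are illustrative toys.
import torch
import torch.nn as nn

# Toy encoder-decoder generator: 1-channel SAR in, 3-channel RGB out.
generator = nn.Sequential(
    nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
)
# Toy conditional discriminator: judges (SAR, optical) pairs, real vs. fake.
discriminator = nn.Sequential(
    nn.Conv2d(1 + 3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),  # patch-wise real/fake logits
)

bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)


def train_step(sar: torch.Tensor, optical: torch.Tensor) -> None:
    """One adversarial update on a batch of co-registered patch pairs."""
    fake = generator(sar)

    # Discriminator update: real pairs -> 1, generated pairs -> 0.
    d_real = discriminator(torch.cat([sar, optical], dim=1))
    d_fake = discriminator(torch.cat([sar, fake.detach()], dim=1))
    d_loss = (bce(d_real, torch.ones_like(d_real))
              + bce(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: fool the discriminator, plus an L1 fidelity term.
    d_fake = discriminator(torch.cat([sar, fake], dim=1))
    g_loss = (bce(d_fake, torch.ones_like(d_fake))
              + 100 * nn.functional.l1_loss(fake, optical))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

The L1 term keeps generated patches close to the true optical image pixel-wise, while the adversarial term pushes them toward realistic texture.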
From a theoretical perspective, SEN1-2 supports the development of algorithms that bridge the radiometric and geometric gaps between the SAR and optical modalities. Practically, it fosters progress in areas such as environmental monitoring, urban planning, and disaster management, where multi-sensor data fusion enables richer information retrieval.
Future Developments
The paper suggests potential expansions of the dataset, such as including full multi-spectral information and atmospherically corrected Sentinel-2 images. These enhancements would make it a more comprehensive resource for land use/land cover (LULC) classification and other detailed analyses requiring extended spectral data.
Overall, the SEN1-2 dataset represents a critical resource for remote sensing and machine learning researchers. By providing a large quantity of high-quality, co-registered SAR-optical data, it lays the groundwork for future innovations in multi-sensor fusion and demonstrates the synergistic potential of SAR and optical imagery in addressing complex geospatial challenges.