DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation (2305.19787v2)
Abstract: Image segmentation aims to partition an image according to the objects in the scene and is a fundamental step in analysing very high spatial-resolution (VHR) remote sensing imagery. Current methods struggle to handle land objects of diverse shapes and sizes. In addition, segmentation scale parameters are typically set by static, empirical rules, which limits the segmentation of large-scale remote sensing images and yields algorithms with limited interpretability. To address these challenges, we propose a deep-learning-based region-merging method, dubbed DeepMerge, that segments complete objects in large VHR images by integrating deep learning with a region adjacency graph (RAG). This is the first method to use deep learning to learn similarity and merge similar adjacent super-pixels in a RAG. We propose a modified binary tree sampling method to generate shift-scale data that serve as inputs to transformer-based deep learning networks, a shift-scale attention with 3-dimensional relative position embedding to learn features across scales, and an embedding to fuse learned features with hand-crafted features. DeepMerge achieves high segmentation accuracy in a supervised manner on large-scale remotely sensed images and provides an interpretable optimal scale parameter. It is validated on a remote sensing image of 0.55 m resolution covering an area of 5,660 km². The experimental results show that DeepMerge achieves the highest F value (0.9550) and the lowest total error TE (0.0895), correctly segmenting objects of different sizes and outperforming all competing segmentation methods.
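The region-merging idea in the abstract can be illustrated with a minimal sketch of greedy merging on a region adjacency graph. This is not the DeepMerge implementation: the `dissimilarity` function below is a hand-crafted stand-in (absolute difference of region means) for the similarity that DeepMerge learns with a transformer network, and the names `merge_regions`, `threshold`, and the toy data are assumptions made for illustration. The `threshold` plays the role of a scale parameter: a larger value allows more merging and produces coarser segments.

```python
from typing import Callable, Dict, Set, Tuple

Region = Tuple[float, int]  # (mean intensity, pixel count); a hypothetical region descriptor


def dissimilarity(a: Region, b: Region) -> float:
    """Stand-in for a learned model: absolute difference of region means."""
    return abs(a[0] - b[0])


def merge_regions(
    regions: Dict[int, Region],
    adjacency: Dict[int, Set[int]],
    threshold: float,
    score: Callable[[Region, Region], float] = dissimilarity,
) -> Dict[int, Set[int]]:
    """Greedily merge the most similar adjacent pair until none is below threshold.

    Returns a mapping from each surviving region id to the original ids it absorbed.
    """
    regions = dict(regions)                             # work on copies
    adj = {k: set(v) for k, v in adjacency.items()}
    members = {k: {k} for k in regions}

    while True:
        # Find the adjacent pair with the lowest dissimilarity.
        best = None
        for u in adj:
            for v in adj[u]:
                if u < v:                               # visit each edge once
                    d = score(regions[u], regions[v])
                    if best is None or d < best[0]:
                        best = (d, u, v)
        if best is None or best[0] > threshold:
            break                                       # no edge left, or all too dissimilar

        _, u, v = best
        (mean_u, n_u), (mean_v, n_v) = regions[u], regions[v]
        # The merged region keeps id u; its mean is the size-weighted average.
        regions[u] = ((mean_u * n_u + mean_v * n_v) / (n_u + n_v), n_u + n_v)
        members[u] |= members.pop(v)
        # Rewire v's neighbours to u, then drop v from the graph.
        for w in adj.pop(v):
            if w != u:
                adj[w].discard(v)
                adj[w].add(u)
                adj[u].add(w)
        adj[u].discard(v)
        del regions[v]

    return members


# Toy example: four super-pixels in a row, two dark and two bright.
regions = {0: (10.0, 4), 1: (11.0, 4), 2: (50.0, 4), 3: (52.0, 4)}
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
segments = merge_regions(regions, adjacency, threshold=5.0)
print(segments)  # {0: {0, 1}, 2: {2, 3}}
```

In this toy case the two dark super-pixels and the two bright super-pixels each merge into one segment, while the large contrast between the groups keeps them apart. Replacing `score` with a trained model is where a learned similarity would plug in.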