
LOANet: A Lightweight Network Using Object Attention for Extracting Buildings and Roads from UAV Aerial Remote Sensing Images (2212.08490v6)

Published 16 Dec 2022 in cs.CV

Abstract: Semantic segmentation by deep learning has become a more efficient and convenient way to extract buildings and roads from uncrewed aerial vehicle (UAV) remote sensing images than traditional manual segmentation in surveying and mapping. To keep the model lightweight while improving accuracy, a Lightweight Network Using Object Attention (LOANet) for extracting buildings and roads from UAV aerial remote sensing images is proposed. The proposed network adopts an encoder-decoder architecture in which a Lightweight Densely Connected Network (LDCNet) serves as the encoder. In the decoder, dual multi-scale context modules, consisting of the Atrous Spatial Pyramid Pooling (ASPP) module and the Object Attention Module (OAM), are designed to capture more context information from the feature maps of UAV remote sensing images. Between ASPP and OAM, a Feature Pyramid Network (FPN) module fuses the multi-scale features extracted by ASPP. A private dataset of UAV remote sensing images is constructed, containing 2431 training sets, 945 validation sets, and 475 test sets. The proposed basic model performs well on this dataset, with only 1.4M parameters and 5.48G floating point operations (FLOPs), achieving excellent mean Intersection-over-Union (mIoU). Further experiments on the publicly available LoveDA and CITY-OSM datasets validate the effectiveness of the proposed basic and large models, which also achieve outstanding mIoU results. All code is available at https://github.com/GtLinyer/LOANet.
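
The abstract reports accuracy as mean Intersection-over-Union (mIoU). As a rough illustration of what that metric measures, the sketch below computes mIoU from flattened per-pixel class labels. It is not taken from the LOANet repository: the function name `mean_iou`, the three-class labelling (0 = background, 1 = building, 2 = road), and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU over classes that appear in pred or target.

    pred and target are flattened per-pixel class labels of equal length.
    Hypothetical helper for illustration only, not the paper's evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        # Intersection: pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Union: pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: classes absent from both are skipped
            ious.append(inter / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with assumed labels: 0 = background, 1 = building, 2 = road.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In this toy example the per-class IoUs are 1/3, 2/3, and 1/2. Real evaluation code usually accumulates a confusion matrix over the whole test set before averaging, rather than averaging per image.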
