
Efficiently Expanding Receptive Fields: Local Split Attention and Parallel Aggregation for Enhanced Large-scale Point Cloud Semantic Segmentation

Published 3 Sep 2024 in cs.CV | (2409.01662v1)

Abstract: Expanding the receptive field of a deep learning model for large-scale 3D point cloud segmentation is an effective way to capture rich contextual information, which in turn improves the network's ability to learn meaningful features. However, a larger receptive field often increases computational complexity and the risk of overfitting, reducing both the efficiency and the effectiveness of learning. To address these limitations, we propose the Local Split Attention Pooling (LSAP) mechanism, which expands the receptive field through a series of local split operations and thereby captures broader context while reducing the computational workload of the attention-pooling layers. Building on LSAP, we introduce a Parallel Aggregation Enhancement (PAE) module that processes 2D and 3D neighboring information in parallel to further enhance contextual representations within the network. Combining these designs, we present LSNet, a novel framework for large-scale point cloud semantic segmentation. Extensive evaluations show that the proposed PAE module integrates seamlessly into existing frameworks and yields significant improvements in mean intersection over union (mIoU), with an increase of up to 11%. Furthermore, LSNet outperforms state-of-the-art semantic segmentation networks on three benchmark datasets: S3DIS, Toronto3D, and SensatUrban. Our method also achieves a speedup of approximately 38.8% over methods with similar-sized receptive fields, which highlights its computational efficiency and practical utility in real-world large-scale scenes.
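
The abstract reports its results as mean intersection over union (mIoU). As a reading aid, the following is a minimal sketch of how mIoU is commonly computed from per-point labels. It is not the authors' evaluation code: the function name `mean_iou` and the choice to skip classes that appear in neither the ground truth nor the predictions are assumptions made for illustration, and individual benchmarks may handle absent classes differently.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU over classes that occur in labels or predictions.

    Hypothetical helper for illustration; skipping absent classes is an assumption.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    intersection = [0] * num_classes  # points where label and prediction agree on class c
    true_count = [0] * num_classes    # points labeled as class c
    pred_count = [0] * num_classes    # points predicted as class c

    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2]
    preds = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: class 0 = 1/3, class 1 = 2/3, class 2 = 1/2, so the mean is 0.5
    print(mean_iou(labels, preds, num_classes=3))
```

In this toy example the three per-class IoU values average to 0.5. On benchmarks such as those named in the abstract, the same quantity would be accumulated over every point in the test scenes.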

