
Spherical Frustum Sparse Convolution Network for LiDAR Point Cloud Semantic Segmentation (2311.17491v2)

Published 29 Nov 2023 in cs.CV

Abstract: LiDAR point cloud semantic segmentation enables robots to obtain fine-grained semantic information about the surrounding environment. Recently, many works project the point cloud onto a 2D image and adopt 2D Convolutional Neural Networks (CNNs) or vision transformers for LiDAR point cloud semantic segmentation. However, since more than one point can be projected onto the same 2D position but only one point can be preserved, previous 2D image-based segmentation methods suffer from inevitable quantized information loss. To avoid quantized information loss, in this paper, we propose a novel spherical frustum structure. The points projected onto the same 2D position are preserved in the spherical frustums. Moreover, we propose a memory-efficient hash-based representation of spherical frustums. Through the hash-based representation, we propose the Spherical Frustum sparse Convolution (SFC) and Frustum Fast Point Sampling (F2PS) to convolve and sample the points stored in spherical frustums, respectively. Finally, we present the Spherical Frustum sparse Convolution Network (SFCNet) to adopt 2D CNNs for LiDAR point cloud semantic segmentation without quantized information loss. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate that our SFCNet outperforms the 2D image-based semantic segmentation methods based on conventional spherical projection. Code will be available at https://github.com/IRMVLab/SFCNet.
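
The information loss described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 64 x 2048 image size, and the +3/-25 degree vertical field of view are assumptions chosen for illustration, and a plain Python dictionary stands in for the paper's hash-based representation. The sketch contrasts a conventional spherical projection, which keeps one point per pixel, with a frustum-style grouping that keeps every point projected onto the same 2D position.

```python
import math
from collections import defaultdict


def spherical_pixel(x, y, z, width=2048, height=64,
                    fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map a 3D point to an integer (row, col) on a spherical range image.

    Image size and field of view are illustrative assumptions, not values
    taken from the paper.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("a point at the sensor origin has no direction")
    yaw = math.atan2(y, x)
    pitch = math.asin(z / r)
    fov_down = math.radians(fov_down_deg)
    fov = math.radians(fov_up_deg) - fov_down
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    col = min(width - 1, max(0, int(math.floor(u))))
    row = min(height - 1, max(0, int(math.floor(v))))
    return row, col


def build_frustums(points):
    """Group ALL point indices by pixel; the dict plays the role of a hash table."""
    frustums = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        frustums[spherical_pixel(x, y, z)].append(i)
    return dict(frustums)


def nearest_point_per_pixel(points):
    """Conventional projection: keep only the closest point for each pixel."""
    kept = {}
    for i, (x, y, z) in enumerate(points):
        key = spherical_pixel(x, y, z)
        rng = math.sqrt(x * x + y * y + z * z)
        if key not in kept or rng < kept[key][1]:
            kept[key] = (i, rng)
    return {key: idx for key, (idx, _) in kept.items()}


if __name__ == "__main__":
    # Points 0 and 1 lie on the same ray, so they share a pixel.
    pts = [(10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
    print(build_frustums(pts))
    print(nearest_point_per_pixel(pts))
```

In this toy input, the conventional projection discards the farther of the two collinear points, while the frustum grouping retains all three indices. Only occupied pixels appear as dictionary keys, which is the sense in which a hash-based layout can avoid allocating a dense per-pixel buffer.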

