LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention (2301.04275v3)
Abstract: LiDAR-based semantic segmentation is critical in the fields of robotics and autonomous driving as it provides a comprehensive understanding of the scene. This paper proposes a lightweight and efficient projection-based semantic segmentation network called LENet with an encoder-decoder structure for LiDAR-based semantic segmentation. The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features. The decoder employs an Interpolation And Convolution (IAC) mechanism utilizing bilinear interpolation for upsampling multi-resolution feature maps and integrating previous and current dimensional features through a single convolution layer. This approach significantly reduces the network's complexity while also improving its accuracy. Additionally, we introduce multiple auxiliary segmentation heads to further refine the network's accuracy. Extensive evaluations on publicly available datasets, including SemanticKITTI, SemanticPOSS, and nuScenes, show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods. Full implementation is available at https://github.com/fengluodb/LENet.
- J. Behley, M. Garbade, A. Milioto, J. Quenzel, S. Behnke, C. Stachniss, and J. Gall, “Semantickitti: A dataset for semantic scene understanding of lidar sequences,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9297–9307.
- C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 652–660.
- C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “Pointnet++: Deep hierarchical feature learning on point sets in a metric space,” arXiv preprint arXiv:1706.02413, 2017.
- Y. Li, R. Bu, M. Sun, W. Wu, X. Di, and B. Chen, “Pointcnn: Convolution on χ𝜒\chiitalic_χ-transformed points,” in Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018, pp. 828–838.
- Q. Hu, B. Yang, L. Xie, S. Rosa, Y. Guo, Z. Wang, N. Trigoni, and A. Markham, “Randla-net: Efficient semantic segmentation of large-scale point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11 108–11 117.
- A. Milioto, I. Vizzo, J. Behley, and C. Stachniss, “Rangenet++: Fast and accurate lidar semantic segmentation,” in 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019, pp. 4213–4220.
- H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, “nuscenes: A multimodal dataset for autonomous driving,” arXiv preprint arXiv:1903.11027, 2019.
- Y. Pan, B. Gao, J. Mei, S. Geng, C. Li, and H. Zhao, “Semanticposs: A point cloud dataset with large quantity of dynamic instances,” in 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2020, pp. 687–693.
- H. Thomas, C. R. Qi, J.-E. Deschaud, B. Marcotegui, F. Goulette, and L. J. Guibas, “Kpconv: Flexible and deformable convolution for point clouds,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 6411–6420.
- C. Choy, J. Gwak, and S. Savarese, “4d spatio-temporal convnets: Minkowski convolutional neural networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 3075–3084.
- X. Zhu, H. Zhou, T. Wang, F. Hong, Y. Ma, W. Li, H. Li, and D. Lin, “Cylindrical and asymmetrical 3d convolution networks for lidar segmentation,” arXiv preprint arXiv:2011.10033, 2020.
- R. Cheng, R. Razani, E. Taghavi, E. Li, and B. Liu, “2-s3net: Attentive feature fusion with adaptive feature selection for sparse semantic segmentation network,” arXiv preprint arXiv:2102.04530, 2021.
- B. Wu, A. Wan, X. Yue, and K. Keutzer, “Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud,” in 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018, pp. 1887–1893.
- B. Wu, X. Zhou, S. Zhao, X. Yue, and K. Keutzer, “Squeezesegv2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 4376–4382.
- C. Xu, B. Wu, Z. Wang, W. Zhan, P. Vajda, K. Keutzer, and M. Tomizuka, “Squeezesegv3: Spatially-adaptive convolution for efficient point-cloud segmentation,” in European Conference on Computer Vision. Springer, 2020, pp. 1–19.
- T. Cortinhal, G. Tzelepis, and E. E. Aksoy, “Salsanext: Fast semantic segmentation of lidar point clouds for autonomous driving,” arXiv preprint arXiv:2003.03653, 2020.
- E. E. Aksoy, S. Baci, and S. Cavdar, “Salsanet: Fast road and vehicle segmentation in lidar point clouds for autonomous driving,” in 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2019, pp. 926–932.
- R. Razani, R. Cheng, E. Taghavi, and L. Bingbing, “Lite-hdseg: Lidar semantic segmentation using lite harmonic dense convolutions,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 9550–9556.
- Y. Zhao, L. Bai, and X. Huang, “Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 4453–4458.
- H. Tang, Z. Liu, S. Zhao, Y. Lin, J. Lin, H. Wang, and S. Han, “Searching efficient 3d architectures with sparse point-voxel convolution,” in ECCV. Springer, 2020, pp. 685–702.
- J. Xu, R. Zhang, J. Dou, Y. Zhu, J. Sun, and S. Pu, “Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 16 024–16 033.
- M.-H. Guo, C.-Z. Lu, Q. Hou, Z. Liu, M.-M. Cheng, and S.-M. Hu, “Segnext: Rethinking convolutional attention design for semantic segmentation,” arXiv preprint arXiv:2209.08575, 2022.
- H. Su, V. Jampani, D. Sun, S. Maji, E. Kalogerakis, M.-H. Yang, and J. Kautz, “Splatnet: Sparse lattice networks for point cloud processing,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 2530–2539.
- M. Tatarchenko, J. Park, V. Koltun, and Q.-Y. Zhou, “Tangent convolutions for dense prediction in 3d,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 3887–3896.
- R. A. Rosu, P. Schütt, J. Quenzel, and S. Behnke, “Latticenet: Fast point cloud segmentation using permutohedral lattices,” arXiv preprint arXiv:1912.05905, 2019.
- S. Qiu, S. Anwar, and N. Barnes, “Semantic segmentation for real point cloud scenes via bilateral augmentation and adaptive fusion,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1757–1767.
- S. Li, X. Chen, Y. Liu, D. Dai, C. Stachniss, and J. Gall, “Multi-scale interaction for real-time lidar data segmentation on an embedded platform,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 738–745, 2021.
- I. Alonso, L. Riazuelo, L. Montesano, and A. C. Murillo, “3d-mininet: Learning a 2d representation from point clouds for fast and efficient 3d lidar semantic segmentation,” IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 5432–5439, 2020.
- S. Wang, J. Zhu, and R. Zhang, “Meta-rangeseg: Lidar sequence semantic segmentation using multiple feature aggregation,” arXiv preprint arXiv:2202.13377, 2022.
- H.-X. Cheng, X.-F. Han, and G.-Q. Xiao, “Cenet: Toward concise and efficient lidar semantic segmentation for autonomous driving,” in 2022 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2022, pp. 01–06.
- H. Shi, G. Lin, H. Wang, T.-Y. Hung, and Z. Wang, “Spsequencenet: Semantic segmentation network on 4d point clouds,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 4574–4583.
- F. Duerr, M. Pfaller, H. Weigel, and J. Beyerer, “Lidar-based recurrent 3d semantic segmentation with temporal memory alignment,” in 2020 International Conference on 3D Vision (3DV). IEEE, 2020, pp. 781–790.
- P. Schutt, R. A. Rosu, and S. Behnke, “Abstract flow for temporal semantic segmentation on the permutohedral lattice,” in 2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 5139–5145.
- S. Li, X. Chen, Y. Liu, D. Dai, C. Stachniss, and J. Gall, “Multi-scale interaction for real-time lidar data segmentation on an embedded platform,” IEEE Robotics Autom. Lett., 2021.
- A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in pytorch,” 2017.
- I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” arXiv preprint arXiv:1711.05101, 2017.