Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation (2405.10175v2)

Published 16 May 2024 in cs.CV and cs.RO

Abstract: Point cloud segmentation (PCS) plays an essential role in robot perception and navigation tasks. To efficiently understand large-scale outdoor point clouds, their range image representation is commonly adopted. This image-like representation is compact and structured, making range image-based PCS models practical. However, undesirable missing values in the range images damage the shapes and patterns of objects, making it difficult for the models to learn coherent and complete geometric information from the objects. Consequently, the PCS models achieve only inferior performance. Delving deeply into this issue, we find that the use of unreasonable projection approaches and the deskewing of scans are the main causes of the unwanted missing values in the range images. Moreover, almost all previous works fail to consider filling in these missing values in the PCS task. To alleviate this problem, we first propose a new projection method, namely scan unfolding++ (SU++), to avoid massive missing values in the generated range images. Then, we introduce a simple yet effective approach, namely range-dependent $K$-nearest neighbor interpolation ($K$NNI), to further fill in missing values. Finally, we introduce the Filling Missing Values Network (FMVNet) and Fast FMVNet. Extensive experimental results on the SemanticKITTI, SemanticPOSS, and nuScenes datasets demonstrate that, by employing the proposed SU++ and $K$NNI, existing range image-based PCS models consistently outperform the baseline models. In addition, both FMVNet and Fast FMVNet achieve state-of-the-art performance in terms of the speed-accuracy trade-off. The proposed methods can be applied to other range image-based tasks and practical applications.
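The abstract relies on two ideas: projecting a LiDAR scan into a range image and filling the empty pixels that the projection leaves behind. The sketch below is only a minimal illustration of these ideas, not the paper's SU++ or its exact range-dependent $K$NNI; the standard spherical projection, the field-of-view parameters, the -1 sentinel for empty pixels, and the row-wise nearest-neighbor fill with a `max_gap` cutoff are all assumptions chosen for the example.

```python
# Illustrative sketch: spherical projection of a point cloud to a range image,
# followed by a simple row-wise nearest-neighbor fill of missing pixels.
import numpy as np

def spherical_projection(points, H=64, W=2048, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an N x 3 array of (x, y, z) points into an H x W range image.

    Pixels that receive no point keep the sentinel value -1. The vertical
    field of view is an assumed HDL-64E-like setting.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1)              # range per point
    yaw = np.arctan2(y, x)                                  # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    u = 0.5 * (1.0 - yaw / np.pi) * W                       # column index
    v = (1.0 - (pitch - fov_down) / fov) * H                # row index
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    range_image = np.full((H, W), -1.0, dtype=np.float32)
    range_image[v, u] = r                                    # later points overwrite earlier ones
    return range_image

def fill_missing_rowwise(range_image, max_gap=4):
    """Fill missing pixels (-1) with the nearest valid value in the same row,
    but only when that neighbor lies within `max_gap` columns."""
    filled = range_image.copy()
    H, W = filled.shape
    cols = np.arange(W)
    for row in range(H):
        valid = filled[row] > 0
        if not valid.any():
            continue
        valid_cols = cols[valid]
        # nearest valid column for every column in this row
        nearest = valid_cols[np.argmin(np.abs(cols[:, None] - valid_cols[None, :]), axis=1)]
        gap = np.abs(cols - nearest)
        mask = (~valid) & (gap <= max_gap)
        filled[row, mask] = filled[row, nearest[mask]]
    return filled

if __name__ == "__main__":
    pts = np.random.randn(100000, 3) * np.array([20.0, 20.0, 2.0])   # synthetic scan
    ri = spherical_projection(pts)
    print("missing before:", int((ri < 0).sum()),
          "after:", int((fill_missing_rowwise(ri) < 0).sum()))
```

The `max_gap` cutoff is a crude stand-in for the range-dependent behavior described in the abstract: nearby structure is interpolated while large empty regions (e.g., sky or occlusions) are left untouched.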
