
PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation (2310.07743v1)

Published 11 Oct 2023 in cs.CV

Abstract: Significant progress has been made recently in point cloud segmentation using an encoder-decoder framework, which first encodes point clouds into low-resolution representations and then decodes high-resolution predictions. Inspired by the success of high-resolution architectures in image dense prediction, which maintain a high-resolution representation throughout the entire learning process, we consider the same property highly important for dense 3D point cloud analysis. Therefore, in this paper, we explore high-resolution architectures for 3D point cloud segmentation. Specifically, we generalize high-resolution architectures using a unified pipeline named PointHR, which includes a knn-based sequence operator for feature extraction and a differential resampling operator to efficiently communicate between different resolutions. Additionally, we propose to avoid the numerous on-the-fly computations of high-resolution architectures by pre-computing the indices for both the sequence and resampling operators. By doing so, we deliver highly competitive high-resolution architectures while capitalizing on the benefits of well-designed point cloud blocks without additional effort. To evaluate these architectures for dense point cloud analysis, we conduct thorough experiments on the S3DIS and ScanNetV2 datasets, where the proposed PointHR outperforms recent state-of-the-art methods without any bells and whistles. The source code is available at https://github.com/haibo-qiu/PointHR.
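
The idea of pre-computing indices for the sequence and resampling operators can be illustrated with a small sketch. The code below is not the PointHR implementation: the function names, the strided subsampling, the brute-force neighbour search, and the mean aggregation are all assumptions chosen to keep the example self-contained. It only shows the general pattern of building every neighbour table once and then reusing the tables whenever features are gathered within or across resolutions.

```python
import math

def knn_indices(queries, points, k):
    """Brute-force k nearest neighbours: for each query, indices into `points`."""
    result = []
    for q in queries:
        order = sorted(range(len(points)), key=lambda i: math.dist(q, points[i]))
        result.append(order[:k])
    return result

def precompute_indices(points, stride, k):
    """Build every neighbour table once, before any feature computation.

    Hypothetical helper: the strided subsampling below is a stand-in for
    whatever downsampling scheme a real model would use.
    """
    coarse_ids = list(range(0, len(points), stride))
    coarse = [points[i] for i in coarse_ids]
    return {
        "coarse_ids": coarse_ids,
        # Neighbourhoods inside one resolution (used by the sequence operator).
        "fine_knn": knn_indices(points, points, k),
        "coarse_knn": knn_indices(coarse, coarse, min(k, len(coarse))),
        # Neighbourhoods across resolutions (used by the resampling operator).
        "down": knn_indices(coarse, points, k),  # fine -> coarse
        "up": knn_indices(points, coarse, 1),    # coarse -> fine
    }

def gather_mean(features, index_table):
    """Average scalar features over each precomputed neighbourhood."""
    return [sum(features[j] for j in row) / len(row) for row in index_table]

# Four points on a line, one scalar feature per point.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
tables = precompute_indices(points, stride=2, k=2)

fine_feats = [0.0, 1.0, 2.0, 3.0]
coarse_feats = gather_mean(fine_feats, tables["down"])  # high -> low resolution
restored = gather_mean(coarse_feats, tables["up"])      # low -> high resolution
print(coarse_feats, restored)
```

Because the tables depend only on point coordinates, repeated blocks at the same resolution can share them, so the neighbour search is paid once per input rather than once per layer.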

