Rethinking Few-shot 3D Point Cloud Semantic Segmentation (2403.00592v1)
Abstract: This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS), with a focus on two significant issues in the state of the art: foreground leakage and sparse point distribution. The former arises from non-uniform point sampling, which lets models exploit the density disparity between foreground and background to make segmentation easier. The latter results from sampling only 2,048 points, which limits semantic information and deviates from real-world practice. To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built. Moreover, we propose a novel FS-PCS model. While previous methods are based on feature optimization, mainly refining support features to enhance prototypes, our method is based on correlation optimization and is referred to as Correlation Optimization Segmentation (COSeg). Specifically, we compute Class-specific Multi-prototypical Correlation (CMC) for each query point, representing its correlations to category prototypes. Then, we propose the Hyper Correlation Augmentation (HCA) module to enhance CMC. Furthermore, because few-shot training inherently makes models susceptible to the base classes, we propose to learn non-parametric prototypes for the base classes during training. The learned base prototypes are used to calibrate correlations for the background class through a Base Prototypes Calibration (BPC) module. Experiments on popular datasets demonstrate the superiority of COSeg over existing methods. The code is available at: https://github.com/ZhaochongAn/COSeg
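As a rough illustration of what "correlations to category prototypes" can look like, the sketch below computes, for each query point, a similarity score to every prototype of every class. It is not the COSeg implementation: the use of cosine similarity, the function names `cosine` and `class_prototype_correlations`, and the toy features are all assumptions made for the example, and the abstract does not specify how CMC is actually computed.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def class_prototype_correlations(query_feats, prototypes):
    """Hypothetical helper: for each query point, map each class name to the
    list of similarities between that point and the class's prototypes."""
    return [
        {cls: [cosine(q, p) for p in protos] for cls, protos in prototypes.items()}
        for q in query_feats
    ]

# Toy data (assumed): 2 query points with 2-D features, 2 prototypes per class.
query_feats = [[1.0, 0.0], [0.0, 2.0]]
prototypes = {
    "background": [[1.0, 0.0], [1.0, 1.0]],
    "foreground": [[0.0, 1.0], [-1.0, 0.0]],
}
corr = class_prototype_correlations(query_feats, prototypes)
```

Here `corr[i][cls]` holds one score per prototype of class `cls` for query point `i`. In a correlation-optimization approach, later modules would operate on scores of this kind rather than on the raw features.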