No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation (2404.04050v1)
Abstract: To reduce the reliance on large-scale datasets, recent works in 3D segmentation resort to few-shot learning. Current 3D few-shot segmentation methods first pre-train models on 'seen' classes, and then evaluate their generalization performance on 'unseen' classes. However, the prior pre-training stage not only introduces excessive time overhead but also incurs a significant domain gap on 'unseen' classes. To tackle these issues, we propose a Non-parametric Network for few-shot 3D Segmentation, Seg-NN, and its Parametric variant, Seg-PN. Without training, Seg-NN extracts dense representations with hand-crafted filters and achieves performance comparable to existing parametric models. Because it eliminates pre-training, Seg-NN can alleviate the domain gap issue and save a substantial amount of time. Based on Seg-NN, Seg-PN only requires training a lightweight QUEry-Support Transferring (QUEST) module, which enhances the interaction between the support set and the query set. Experiments suggest that Seg-PN outperforms the previous state-of-the-art method by +4.19% and +7.71% mIoU on the S3DIS and ScanNet datasets, respectively, while reducing training time by 90%, indicating its effectiveness and efficiency.
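As a rough illustration of the training-free idea described above, the following minimal sketch encodes points with a fixed, hand-crafted filter, averages support features into class prototypes, and labels query points by cosine similarity. It is not the Seg-NN implementation: the sinusoidal encoding, the prototype-matching rule, and every name here (`trig_encode`, `class_prototypes`, `segment`, `mean_iou`, `num_freqs`) are assumptions chosen for illustration, and the abstract does not specify which filters the method uses.

```python
import numpy as np


def trig_encode(xyz, num_freqs=4):
    """Map (N, 3) coordinates to (N, 6 * num_freqs) features.

    Assumption: a sinusoidal encoding stands in for the "hand-crafted
    filters" of the abstract. It has no learnable weights.
    """
    freqs = 2.0 ** np.arange(num_freqs)                 # (F,)
    angles = xyz[:, :, None] * freqs[None, None, :]     # (N, 3, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(xyz.shape[0], -1)


def class_prototypes(support_feats, support_labels, num_classes):
    """Average the support features of each class into one prototype."""
    return np.stack([
        support_feats[support_labels == c].mean(axis=0)
        for c in range(num_classes)
    ])


def segment(query_feats, prototypes):
    """Label each query point with its most cosine-similar prototype."""
    q = query_feats / (np.linalg.norm(query_feats, axis=1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    return (q @ p.T).argmax(axis=1)


def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes that occur in pred or gt."""
    ious = []
    for c in range(num_classes):
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious.append(np.sum((pred == c) & (gt == c)) / union)
    return float(np.mean(ious))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical toy "scene": two tight clusters acting as two classes.
    centers = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
    labels = np.repeat([0, 1], 20)
    support = centers[labels] + 0.01 * rng.standard_normal((40, 3))
    query = centers[labels] + 0.01 * rng.standard_normal((40, 3))

    protos = class_prototypes(trig_encode(support), labels, num_classes=2)
    pred = segment(trig_encode(query), protos)
    print("mIoU on toy data:", mean_iou(pred, labels, num_classes=2))
```

Nothing in this sketch is trained: the prototypes are recomputed from the support set of each episode, which is why no pre-training on 'seen' classes is needed. A parametric variant in the spirit of Seg-PN would add a small learned module on top of these fixed features; that part is not shown here.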