No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation (2404.04050v1)

Published 5 Apr 2024 in cs.CV

Abstract: To reduce the reliance on large-scale datasets, recent works in 3D segmentation resort to few-shot learning. Current 3D few-shot segmentation methods first pre-train models on 'seen' classes, and then evaluate their generalization performance on 'unseen' classes. However, the prior pre-training stage not only introduces excessive time overhead but also incurs a significant domain gap on 'unseen' classes. To tackle these issues, we propose a Non-parametric Network for few-shot 3D Segmentation, Seg-NN, and its Parametric variant, Seg-PN. Without training, Seg-NN extracts dense representations using hand-crafted filters and achieves performance comparable to existing parametric models. Because it eliminates pre-training, Seg-NN can alleviate the domain gap issue and save a substantial amount of time. Based on Seg-NN, Seg-PN only requires training a lightweight QUEry-Support Transferring (QUEST) module, which enhances the interaction between the support set and the query set. Experiments suggest that Seg-PN outperforms the previous state-of-the-art method by +4.19% and +7.71% mIoU on the S3DIS and ScanNet datasets, respectively, while reducing training time by 90%, indicating its effectiveness and efficiency.
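
The training-free idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which hand-crafted filters Seg-NN uses, so the sketch assumes a fixed sinusoidal encoding of point coordinates, and it assumes that query points are labeled by cosine similarity to per-class mean features (prototypes) built from the support set. All function names and the toy data are hypothetical, and the trainable QUEST module of Seg-PN is not modeled.

```python
import math

def encode(point, freqs=(1.0, 2.0, 4.0)):
    """Parameter-free embedding: sin/cos of each coordinate at fixed
    frequencies. The filter choice is an assumption, not from the abstract."""
    feats = []
    for c in point:
        for f in freqs:
            feats.append(math.sin(f * c))
            feats.append(math.cos(f * c))
    return feats

def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_prototypes(support_points, support_labels):
    """One prototype per class: the mean embedding of its support points."""
    by_class = {}
    for p, y in zip(support_points, support_labels):
        by_class.setdefault(y, []).append(encode(p))
    return {y: mean_vector(v) for y, v in by_class.items()}

def segment(query_points, prototypes):
    """Label each query point with the class of its most similar prototype."""
    preds = []
    for p in query_points:
        f = encode(p)
        preds.append(max(prototypes, key=lambda y: cosine(f, prototypes[y])))
    return preds

# Toy support set (hypothetical): two labeled points per class.
support_points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.1),
                  (1.0, 1.0, 1.0), (0.9, 1.0, 1.1)]
support_labels = ["floor", "floor", "chair", "chair"]

prototypes = build_prototypes(support_points, support_labels)
predictions = segment([(0.05, 0.0, 0.05), (1.0, 0.9, 1.0)], prototypes)
print(predictions)
```

No weights are learned at any point in this sketch: the encoder is fixed and the prototypes are plain averages, which is the sense in which such a pipeline needs no pre-training on 'seen' classes.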
