
Beyond the Label Itself: Latent Labels Enhance Semi-supervised Point Cloud Panoptic Segmentation (2312.08234v2)

Published 13 Dec 2023 in cs.CV

Abstract: Given the exorbitant expense of labeling autonomous-driving datasets and the growing trend toward using unlabeled data, semi-supervised segmentation on point clouds is becoming increasingly important. Intuitively, uncovering more "unspoken words" (i.e., latent instance information) beyond the label itself should help improve performance. In this paper, we identify two types of latent labels behind the displayed label, embedded in LiDAR and image data. First, in the LiDAR Branch, we propose a novel augmentation, Cylinder-Mix, which generates additional yet reliable samples for training. Second, in the Image Branch, we propose the Instance Position-scale Learning (IPSL) Module to learn and fuse instance position and scale information, which comes from a 2D pre-trained detector and from a type of latent label obtained by 3D-to-2D projection. Finally, the two latent labels are embedded into the multi-modal panoptic segmentation network. An ablation of the IPSL module demonstrates its robust adaptability, and experiments on SemanticKITTI and nuScenes show that our model outperforms the state-of-the-art method, LaserMix.
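
To make the idea of a latent label obtained from 3D-to-2D projection more concrete, the following is a minimal sketch, not the paper's implementation. It assumes the points of one instance are already expressed in the camera frame and that a simple pinhole model applies; the function name `project_instance` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration. It returns the center (position) and the width and height (scale) of the 2D box enclosing the projected points.

```python
from typing import Iterable, Optional, Tuple

Point3D = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (center_u, center_v, width, height)


def project_instance(
    points: Iterable[Point3D],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Box2D]:
    """Project camera-frame 3D points with a pinhole model (an assumption)
    and summarize them as a 2D position and scale.

    Returns None when no point lies in front of the camera.
    """
    us, vs = [], []
    for x, y, z in points:
        if z <= 0:
            # Points behind the camera cannot be projected onto the image.
            continue
        us.append(fx * x / z + cx)
        vs.append(fy * y / z + cy)

    if not us:
        return None

    u_min, u_max = min(us), max(us)
    v_min, v_max = min(vs), max(vs)
    # Position: box center. Scale: box width and height, in pixels.
    return (
        (u_min + u_max) / 2,
        (v_min + v_max) / 2,
        u_max - u_min,
        v_max - v_min,
    )


if __name__ == "__main__":
    # Two hypothetical instance points, 2 m in front of the camera.
    instance_points = [(-1.0, -1.0, 2.0), (1.0, 1.0, 2.0)]
    print(project_instance(instance_points, fx=100.0, fy=100.0, cx=50.0, cy=40.0))
```

In a real pipeline the points would first be transformed from the LiDAR frame to the camera frame using the dataset's calibration, and boxes falling outside the image would need clipping; both steps are omitted here.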
