Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance (2312.07530v3)
Abstract: Weakly supervised 3D object detection aims to learn a 3D detector at lower annotation cost, e.g., from 2D labels. Unlike prior work, which still relies on a few accurate 3D annotations, we propose a framework that studies how to leverage constraints between the 2D and 3D domains without requiring any 3D labels. Specifically, we employ visual data from three perspectives to establish connections between the two domains. First, we design a feature-level constraint that aligns LiDAR and image features based on object-aware regions. Second, we develop an output-level constraint that enforces overlap between the 2D box estimates and the projected 3D box estimates. Finally, we apply a training-level constraint by producing accurate and consistent 3D pseudo-labels that align with the visual data. We conduct extensive experiments on the KITTI dataset to validate the effectiveness of the three proposed constraints. Without using any 3D labels, our method achieves favorable performance against state-of-the-art approaches and is competitive with a method that uses 500 frames of 3D annotations. Code will be made publicly available at https://github.com/kuanchihhuang/VG-W3D.
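The output-level constraint described in the abstract can be illustrated with a minimal sketch: project a 3D box into the image, take the tight 2D box around the projected corners, and penalize poor overlap with the 2D label. This is not the authors' implementation. The function names, the (length, height, width) size ordering, the use of the geometric box center, the camera-frame convention (x right, y down, z forward), and the choice of generalized IoU as the overlap measure are all assumptions made for illustration, and the sketch assumes every corner lies in front of the camera.

```python
import numpy as np

def box3d_corners(center, size, yaw):
    """Eight corners of a 3D box in camera coordinates.

    Assumed conventions (hypothetical): size = (length, height, width),
    center is the geometric center, yaw is a rotation about the y axis.
    """
    l, h, w = size
    x = np.array([l, l, -l, -l, l, l, -l, -l]) / 2.0
    y = np.array([h, h, h, h, -h, -h, -h, -h]) / 2.0
    z = np.array([w, -w, -w, w, w, -w, -w, w]) / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    rot_y = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (rot_y @ np.vstack([x, y, z])).T + np.asarray(center, dtype=float)

def project_to_2d_box(corners, K):
    """Tight axis-aligned image box (x1, y1, x2, y2) around projected corners.

    K is a 3x3 pinhole intrinsic matrix; all corners must have z > 0.
    """
    uvw = (K @ corners.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])

def giou(a, b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes; value lies in (-1, 1]."""
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = ((max(a[2], b[2]) - min(a[0], b[0]))
               * (max(a[3], b[3]) - min(a[1], b[1])))
    return inter / union - (enclose - union) / enclose

def output_level_loss(box2d, center, size, yaw, K):
    """Overlap penalty between a 2D label and the projected 3D estimate."""
    projected = project_to_2d_box(box3d_corners(center, size, yaw), K)
    return 1.0 - giou(projected, box2d)
```

The loss is zero when the projected 3D estimate exactly fills the 2D box and grows as the two drift apart. In a real training loop the same computation would be written in a differentiable framework so that gradients reach the 3D box parameters.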
Authors: Kuan-Chih Huang, Yi-Hsuan Tsai, Ming-Hsuan Yang