VirtualPainting: Addressing Sparsity with Virtual Points and Distance-Aware Data Augmentation for 3D Object Detection (2312.16141v1)
Abstract: Recent years have seen a surge in multimodal approaches that decorate raw LiDAR point clouds with camera-derived features to improve object detection performance. However, we found that these methods still struggle with the inherent sparsity of LiDAR point cloud data, primarily because fewer points are enriched with camera-derived features for sparsely distributed objects. To tackle this issue, we present an approach that generates virtual LiDAR points from camera images and enhances these virtual points with semantic labels obtained from image-based segmentation networks, facilitating the detection of sparsely distributed objects, particularly those that are occluded or distant. Furthermore, we integrate a distance-aware data augmentation (DADA) technique that enhances the model's capability to recognize these sparsely distributed objects by generating specialized training samples. Our approach offers a versatile solution that can be integrated into various 3D frameworks and 2D semantic segmentation methods, resulting in significantly improved overall detection accuracy. Evaluation on the KITTI and nuScenes datasets demonstrates substantial improvements in both 3D and bird's-eye view (BEV) detection benchmarks.
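The idea of enriching points with semantic labels from an image segmentation network can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `paint_points`, the pinhole projection model, the camera-frame axis convention, and the array layouts are all assumptions made for illustration. The same routine would apply equally to raw LiDAR points or to virtual points, provided they are expressed in the camera frame.

```python
import numpy as np

def paint_points(points_cam, K, seg_scores):
    """Append per-pixel class scores to 3D points given in the camera frame.

    points_cam: (N, 3) array; z is assumed to point forward from the camera.
    K:          (3, 3) pinhole intrinsic matrix (assumed camera model).
    seg_scores: (H, W, C) per-pixel class scores from any 2D segmentation net.
    Returns an (M, 3 + C) array holding only the points that project
    inside the image, each followed by the scores of the pixel it hits.
    """
    h, w, _ = seg_scores.shape

    # Discard points behind the camera; they have no valid projection.
    pts = points_cam[points_cam[:, 2] > 0]

    # Pinhole projection: homogeneous pixel coordinates, then divide by depth.
    uvw = pts @ K.T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep only projections that fall inside the image bounds.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Gather the class scores at each hit pixel (row = v, column = u).
    return np.hstack([pts[inside], seg_scores[v[inside], u[inside]]])
```

Objects that receive few points contribute few painted rows to the output, which is the sparsity problem the abstract describes.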
- Sudip Dhakal
- Dominic Carrillo
- Deyuan Qu
- Michael Nutt
- Qing Yang
- Song Fu