SpotNet: An Image Centric, Lidar Anchored Approach To Long Range Perception (2405.15843v1)
Abstract: In this paper, we propose SpotNet: a fast, single-stage, image-centric but LiDAR-anchored approach for long-range 3D object detection. We demonstrate that our approach to LiDAR/image sensor fusion, combined with the joint learning of 2D and 3D detection tasks, can lead to accurate 3D object detection with very sparse LiDAR support. Unlike more recent bird's-eye-view (BEV) sensor-fusion methods, which scale with range $r$ as $O(r^2)$, SpotNet scales as $O(1)$ with range. We argue that such an architecture is ideally suited to leverage each sensor's strength, i.e., semantic understanding from images and accurate range finding from LiDAR data. Finally, we show that anchoring detections on LiDAR points removes the need to regress distances, and so the architecture is able to transfer from 2MP to 8MP resolution images without re-training.
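The scaling claim in the abstract can be illustrated with a rough back-of-the-envelope sketch. It is not taken from the paper: it assumes a square BEV grid with a fixed cell size centered on the vehicle, and uses a hypothetical 0.2 m cell and a 1920x1080 (roughly 2MP) image purely for illustration. Under those assumptions, the number of BEV cells grows quadratically with the covered range, while the number of image pixels does not depend on range at all.

```python
def bev_cell_count(max_range_m: float, cell_size_m: float) -> int:
    """Cells in a square BEV grid spanning [-max_range, +max_range] on both axes.

    Assumption (not from the paper): fixed cell size, square grid.
    """
    cells_per_side = round(2 * max_range_m / cell_size_m)
    return cells_per_side * cells_per_side


def image_pixel_count(width_px: int, height_px: int) -> int:
    """Pixels in a camera image; independent of how far away objects are."""
    return width_px * height_px


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    cell_size_m = 0.2
    width_px, height_px = 1920, 1080  # roughly a 2MP image
    for max_range_m in (100, 200, 400):
        cells = bev_cell_count(max_range_m, cell_size_m)
        pixels = image_pixel_count(width_px, height_px)
        print(f"range={max_range_m} m  bev_cells={cells}  image_pixels={pixels}")
```

Doubling the range quadruples the BEV cell count in this sketch, whereas the image-side input stays the same size. Actual compute cost depends on the specific architecture, so this only shows the input-size trend behind the $O(r^2)$ versus $O(1)$ comparison.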