SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection (2403.05817v3)

Published 9 Mar 2024 in cs.CV

Abstract: LiDAR-based 3D object detection plays an essential role in autonomous driving. Existing high-performing 3D object detectors usually build dense feature maps in the backbone network and prediction head. However, the computational costs introduced by the dense feature maps grow quadratically as the perception range increases, making these models hard to scale up to long-range detection. Some recent works have attempted to construct fully sparse detectors to solve this issue; nevertheless, the resulting models either rely on a complex multi-stage pipeline or exhibit inferior performance. In this work, we propose SAFDNet, a straightforward yet highly effective architecture, tailored for fully sparse 3D object detection. In SAFDNet, an adaptive feature diffusion strategy is designed to address the center feature missing problem. We conducted extensive experiments on Waymo Open, nuScenes, and Argoverse2 datasets. SAFDNet performed slightly better than the previous SOTA on the first two datasets but much better on the last dataset, which features long-range detection, verifying the efficacy of SAFDNet in scenarios where long-range detection is required. Notably, on Argoverse2, SAFDNet surpassed the previous best hybrid detector HEDNet by 2.6% mAP while being 2.1x faster, and yielded 2.1% mAP gains over the previous best sparse detector FSDv2 while being 1.3x faster. The code will be available at https://github.com/zhanggang001/HEDNet.
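The quadratic-growth claim in the abstract can be checked with simple arithmetic. The sketch below is not taken from the paper: it assumes a square bird's-eye-view grid covering [-R, R] x [-R, R], and the function name `dense_bev_cells`, the 0.5 m cell size, and the three range values are hypothetical choices for illustration rather than any dataset's actual configuration.

```python
def dense_bev_cells(perception_range_m: float, cell_size_m: float) -> int:
    """Count the cells of a dense square BEV grid covering [-R, R] x [-R, R].

    Illustrative assumption only: real detectors may use non-square ranges
    and downsampled feature maps.
    """
    cells_per_side = round(2 * perception_range_m / cell_size_m)
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    # Doubling the range quadruples the number of dense cells to process,
    # regardless of how many of those cells actually contain LiDAR points.
    for range_m in (50.0, 100.0, 200.0):
        print(f"range={range_m:6.1f} m  dense cells={dense_bev_cells(range_m, 0.5)}")
```

Under these assumptions each doubling of the range multiplies the dense cell count by four, whereas a fully sparse detector only does work proportional to the occupied voxels.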
