
Voxel or Pillar: Exploring Efficient Point Cloud Representation for 3D Object Detection (2304.02867v2)

Published 6 Apr 2023 in cs.CV

Abstract: Efficient representation of point clouds is fundamental for LiDAR-based 3D object detection. While recent grid-based detectors often encode point clouds into either voxels or pillars, the distinctions between these approaches remain underexplored. In this paper, we quantify the differences between the current encoding paradigms and highlight the limited vertical learning within. To tackle these limitations, we introduce a hybrid Voxel-Pillar Fusion network (VPF), which synergistically combines the unique strengths of both voxels and pillars. Specifically, we first develop a sparse voxel-pillar encoder that encodes point clouds into voxel and pillar features through 3D and 2D sparse convolutions respectively, and then introduce the Sparse Fusion Layer (SFL), facilitating bidirectional interaction between sparse voxel and pillar features. Our efficient, fully sparse method can be seamlessly integrated into both dense and sparse detectors. Leveraging this powerful yet straightforward framework, VPF delivers competitive performance, achieving real-time inference speeds on the nuScenes and Waymo Open Dataset. The code will be available.
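To make the voxel/pillar distinction concrete, here is a minimal NumPy sketch, not the authors' implementation. It assigns points to a sparse 3D voxel grid and a sparse 2D pillar grid, then performs a toy bidirectional exchange between the two. The function names, the mean pooling, the max-over-height aggregation, and the additive fusion are all assumptions chosen for illustration; the paper's encoder uses learned 3D and 2D sparse convolutions, and the actual Sparse Fusion Layer may differ.

```python
import numpy as np

def voxel_and_pillar_indices(points, grid_min, cell_size):
    """Map each point to a 3D voxel index (ix, iy, iz) and a 2D pillar index (ix, iy)."""
    voxel_idx = np.floor((points[:, :3] - grid_min) / cell_size).astype(np.int64)
    pillar_idx = voxel_idx[:, :2]  # a pillar spans the full height, so the z bin is dropped
    return voxel_idx, pillar_idx

def mean_pool(indices, feats):
    """Average the features of points that share a grid cell; only occupied cells are kept."""
    cells, inverse = np.unique(indices, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    sums = np.zeros((len(cells), feats.shape[1]))
    np.add.at(sums, inverse, feats)
    counts = np.bincount(inverse, minlength=len(cells))[:, None]
    return cells, sums / counts

def toy_bidirectional_fusion(voxel_cells, voxel_feats, pillar_feats):
    """Hypothetical stand-in for a fusion step (assumption, not the paper's SFL)."""
    # Parent pillar of each voxel; the unique (ix, iy) rows come out in the same
    # sorted order as the pillar cells produced by mean_pool.
    _, parent = np.unique(voxel_cells[:, :2], axis=0, return_inverse=True)
    parent = parent.reshape(-1)
    # Voxel -> pillar: max over the vertical axis within each pillar.
    pooled = np.full_like(pillar_feats, -np.inf)
    np.maximum.at(pooled, parent, voxel_feats)
    new_pillar = pillar_feats + pooled
    # Pillar -> voxel: broadcast each pillar feature to the voxels it contains.
    new_voxel = voxel_feats + pillar_feats[parent]
    return new_voxel, new_pillar

# Three toy points: two stacked vertically in one pillar, one in another pillar.
points = np.array([[0.1, 0.1, 0.1],
                   [0.2, 0.3, 1.5],
                   [2.5, 0.1, 0.2]])
grid_min = np.zeros(3)
cell_size = np.ones(3)

voxel_idx, pillar_idx = voxel_and_pillar_indices(points, grid_min, cell_size)
voxel_cells, voxel_feats = mean_pool(voxel_idx, points)
pillar_cells, pillar_feats = mean_pool(pillar_idx, points)
new_voxel, new_pillar = toy_bidirectional_fusion(voxel_cells, voxel_feats, pillar_feats)

print(len(voxel_cells), "occupied voxels,", len(pillar_cells), "occupied pillars")
```

In this toy input the two vertically stacked points fall into separate voxels but share one pillar, which is the vertical information a pillar-only encoding collapses.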

