What Matters in Range View 3D Object Detection (2407.16789v2)

Published 23 Jul 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Lidar-based perception pipelines rely on 3D object detection models to interpret complex scenes. While multiple representations for lidar exist, the range-view is enticing since it losslessly encodes the entire lidar sensor output. In this work, we achieve state-of-the-art amongst range-view 3D object detection models without using multiple techniques proposed in past range-view literature. We explore range-view 3D object detection across two modern datasets with substantially different properties: Argoverse 2 and Waymo Open. Our investigation reveals key insights: (1) input feature dimensionality significantly influences the overall performance, (2) surprisingly, employing a classification loss grounded in 3D spatial proximity works as well or better compared to more elaborate IoU-based losses, and (3) addressing non-uniform lidar density via a straightforward range subsampling technique outperforms existing multi-resolution, range-conditioned networks. Our experiments reveal that techniques proposed in recent range-view literature are not needed to achieve state-of-the-art performance. Combining the above findings, we establish a new state-of-the-art model for range-view 3D object detection -- improving AP by 2.2% on the Waymo Open dataset while maintaining a runtime of 10 Hz. We establish the first range-view model on the Argoverse 2 dataset and outperform strong voxel-based baselines. All models are multi-class and open-source. Code is available at https://github.com/benjaminrwilson/range-view-3d-detection.
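As background for the range-view representation the abstract refers to, the following is a minimal sketch of a generic spherical projection from a Cartesian point cloud to a range image. It is not taken from the paper or its repository: the function name `points_to_range_image`, the image size, and the vertical field-of-view values are all illustrative assumptions, and real sensors typically provide their own beam geometry.

```python
import numpy as np


def points_to_range_image(points, height=64, width=2048,
                          fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto a (height, width) range image.

    Hypothetical helper: image size and field of view are assumed values,
    not parameters of any particular sensor or dataset.
    """
    points = np.asarray(points, dtype=np.float64)
    rng = np.linalg.norm(points, axis=1)

    # Drop degenerate points at the sensor origin.
    keep = rng > 0
    points, rng = points[keep], rng[keep]

    azimuth = np.arctan2(points[:, 1], points[:, 0])   # in [-pi, pi]
    inclination = np.arcsin(points[:, 2] / rng)        # in [-pi/2, pi/2]

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)

    # Drop points outside the assumed vertical field of view.
    keep = (inclination <= fov_up) & (inclination >= fov_down)
    azimuth, inclination, rng = azimuth[keep], inclination[keep], rng[keep]

    # Map angles to pixel coordinates: azimuth to columns, inclination to rows.
    col = 0.5 * (1.0 - azimuth / np.pi) * width
    row = (fov_up - inclination) / (fov_up - fov_down) * height
    col = np.clip(np.floor(col), 0, width - 1).astype(np.int64)
    row = np.clip(np.floor(row), 0, height - 1).astype(np.int64)

    # Keep the nearest return when several points land on the same pixel.
    image = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(image, (row, col), rng.astype(np.float32))

    # Mark empty pixels with 0.
    image[np.isinf(image)] = 0.0
    return image


if __name__ == "__main__":
    cloud = np.array([[10.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    image = points_to_range_image(cloud)
    print(image.shape, int(np.count_nonzero(image)))
```

Note that re-projecting Cartesian points this way can discard returns when several points fall on the same pixel; a range image read directly from the sensor does not have that problem.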
