
To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels (2106.13381v1)

Published 25 Jun 2021 in cs.CV

Abstract: 3D object detection is vital for many robotics applications. For tasks where a 2D perspective range image exists, we propose to learn a 3D representation directly from this range image view. To this end, we designed a 2D convolutional network architecture that carries the 3D spherical coordinates of each pixel throughout the network. Its layers can consume any arbitrary convolution kernel in place of the default inner product kernel and exploit the underlying local geometry around each pixel. We outline four such kernels: a dense kernel according to the bag-of-words paradigm, and three graph kernels inspired by recent graph neural network advances: the Transformer, the PointNet, and the Edge Convolution. We also explore cross-modality fusion with the camera image, facilitated by operating in the perspective range image view. Our method performs competitively on the Waymo Open Dataset and improves the state-of-the-art AP for pedestrian detection from 69.7% to 75.5%. It is also efficient in that our smallest model, which still outperforms the popular PointPillars in quality, requires 180 times fewer FLOPS and model parameters.
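
As a rough illustration of what "carrying the 3D spherical coordinates of each pixel" and "exploiting the local geometry around each pixel" could look like, here is a minimal NumPy sketch. It is not the authors' implementation. The function names `spherical_to_cartesian` and `neighbor_offsets`, the 3x3 window, the edge padding, and the convention that inclination is measured from the horizontal plane are all assumptions made for this example. The per-pixel 3D offsets it produces are the kind of local geometric input a graph-style kernel might consume.

```python
import numpy as np


def spherical_to_cartesian(rng, azimuth, inclination):
    """Convert per-pixel spherical coordinates, each (H, W), to Cartesian (H, W, 3).

    Assumption: inclination is the elevation angle above the horizontal plane.
    """
    x = rng * np.cos(inclination) * np.cos(azimuth)
    y = rng * np.cos(inclination) * np.sin(azimuth)
    z = rng * np.sin(inclination)
    return np.stack([x, y, z], axis=-1)


def neighbor_offsets(points, k=3):
    """Return 3D offsets from each pixel to its k x k image neighbors.

    points: (H, W, 3) array of Cartesian coordinates.
    Returns an (H, W, k*k, 3) array. Border pixels are handled by
    replicating the edge (a choice made for this sketch only).
    """
    pad = k // 2
    padded = np.pad(points, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w, _ = points.shape
    # One shifted copy of the image per position in the k x k window.
    shifted = [padded[i:i + h, j:j + w] for i in range(k) for j in range(k)]
    neighbors = np.stack(shifted, axis=2)  # (H, W, k*k, 3)
    return neighbors - points[:, :, None, :]
```

A hypothetical use would be to call `spherical_to_cartesian` on the range, azimuth, and inclination channels of a range image, then feed `neighbor_offsets(points)` alongside the per-pixel features into whatever kernel is being tested.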

Authors (8)
  1. Yuning Chai (25 papers)
  2. Pei Sun (49 papers)
  3. Jiquan Ngiam (17 papers)
  4. Weiyue Wang (23 papers)
  5. Benjamin Caine (10 papers)
  6. Vijay Vasudevan (24 papers)
  7. Xiao Zhang (435 papers)
  8. Dragomir Anguelov (73 papers)
Citations (65)
