SPGroup3D: Superpoint Grouping Network for Indoor 3D Object Detection (2312.13641v1)
Abstract: Current 3D object detection methods for indoor scenes mainly follow the voting-and-grouping strategy to generate proposals. However, most methods use instance-agnostic groupings, such as ball query, which leads to inconsistent semantic information and inaccurate regression of the proposals. To this end, we propose a novel superpoint grouping network for indoor anchor-free one-stage 3D object detection. Specifically, we first partition raw point clouds into superpoints, i.e., regions with semantic consistency and spatial similarity, in an unsupervised manner. Then, we design a geometry-aware voting module that adapts to the centerness in anchor-free detection by constraining the spatial relationship between superpoints and object centers. Next, we present a superpoint-based grouping module to explore the consistent representation within proposals. This module includes a superpoint attention layer that learns feature interactions between neighboring superpoints, and a superpoint-voxel fusion layer that propagates superpoint-level information to the voxel level. Finally, we employ effective multiple matching to capitalize on the dynamic receptive fields of superpoint-based proposals during training. Experimental results demonstrate that our method achieves state-of-the-art performance on the ScanNet V2, SUN RGB-D, and S3DIS datasets for indoor one-stage 3D object detection. Source code is available at https://github.com/zyrant/SPGroup3D.
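The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's grouping module uses learned attention and fusion layers, whereas the sketch below swaps in plain mean pooling and a residual sum. The function names `pool_superpoints` and `fuse_to_voxels`, and the toy feature values, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pool_superpoints(voxel_feats, superpoint_ids):
    """Average the features of all voxels that share a superpoint id.

    Assumption: mean pooling stands in for the learned grouping layers.
    """
    groups = defaultdict(list)
    for feat, sp in zip(voxel_feats, superpoint_ids):
        groups[sp].append(feat)
    # Column-wise mean over the voxels in each group.
    return {
        sp: [sum(col) / len(feats) for col in zip(*feats)]
        for sp, feats in groups.items()
    }


def fuse_to_voxels(voxel_feats, superpoint_ids, sp_feats):
    """Propagate superpoint-level features back to each voxel.

    Assumption: a residual sum stands in for the learned fusion layer.
    """
    return [
        [v + s for v, s in zip(feat, sp_feats[sp])]
        for feat, sp in zip(voxel_feats, superpoint_ids)
    ]


# Toy input: three voxels, two of which belong to superpoint 5.
voxel_feats = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
superpoint_ids = [5, 5, 9]

pooled = pool_superpoints(voxel_feats, superpoint_ids)
fused = fuse_to_voxels(voxel_feats, superpoint_ids, pooled)
```

Unlike a ball query, which gathers every voxel within a fixed radius regardless of which object it belongs to, the groups here follow the superpoint partition, so voxels in the same group share one pooled representation.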