Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud? (2403.02818v1)

Published 5 Mar 2024 in cs.CV

Abstract: Current state-of-the-art (SOTA) 3D object detection methods often require a large number of 3D bounding box annotations for training. However, collecting such large-scale, densely supervised datasets is notoriously costly. To reduce the cumbersome data annotation burden, we propose a novel sparsely annotated framework in which we annotate only one 3D object per scene. Such a sparse annotation strategy significantly reduces the heavy annotation burden, but the resulting inexact and incomplete supervision may severely degrade detection performance. To address this issue, we develop the SS3D++ method, which alternately improves 3D detector training and confident fully annotated scene generation in a unified learning scheme. Using the sparse annotations as seeds, we progressively generate confident fully annotated scenes via a missing-annotated instance mining module and a reliable background mining module. Our method produces competitive results compared with SOTA weakly supervised methods that use the same or even greater annotation cost. Moreover, compared with SOTA fully supervised methods, we achieve on-par or even better performance on the KITTI dataset with about 5x less annotation cost, and 90% of their performance on the Waymo dataset with about 15x less annotation cost. Additional unlabeled training scenes can further boost performance. The code will be available at https://github.com/gaocq/SS3D2.
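
The abstract describes an alternating scheme: train a detector from one-box-per-scene seed annotations, mine confident missing instances (and reliable background) from its own predictions, then retrain, repeating until scenes are confidently fully annotated. The Python sketch below illustrates that loop only, under stated assumptions; the `ToyDetector` class, the `mine_instances` helper, and the 0.9 confidence threshold are hypothetical placeholders, not the released SS3D++ implementation (see the linked repository for the actual code).

```python
from dataclasses import dataclass
from typing import Dict, List

import numpy as np


@dataclass
class Box3D:
    """A 3D bounding box with a detection confidence score."""
    center: np.ndarray   # (x, y, z)
    size: np.ndarray     # (length, width, height)
    heading: float       # yaw angle in radians
    score: float = 1.0


class ToyDetector:
    """Stand-in for any off-the-shelf 3D detector (e.g. a PV-RCNN-style model)."""

    def fit(self, scenes: List[np.ndarray], labels: Dict[int, List[Box3D]]) -> None:
        # Placeholder: a real implementation would train on the labeled boxes
        # while ignoring regions that are not yet confidently labeled.
        pass

    def predict(self, points: np.ndarray) -> List[Box3D]:
        # Placeholder: a real implementation would return scored box proposals.
        return []


def mine_instances(preds: List[Box3D], threshold: float = 0.9) -> List[Box3D]:
    """Keep only high-confidence detections as newly mined pseudo labels."""
    return [box for box in preds if box.score >= threshold]


def train_alternating(scenes: List[np.ndarray],
                      seed_labels: Dict[int, List[Box3D]],
                      rounds: int = 5) -> ToyDetector:
    """Alternate detector training and confident scene generation."""
    detector = ToyDetector()
    pseudo = {sid: list(boxes) for sid, boxes in seed_labels.items()}  # start from seeds

    for _ in range(rounds):
        # (1) Train the detector with the current confident annotations.
        detector.fit(scenes, pseudo)

        # (2) Missing-annotated instance mining: add confident new detections
        #     (duplicate suppression against existing boxes, e.g. NMS, omitted).
        for sid, points in enumerate(scenes):
            pseudo[sid] = pseudo.get(sid, []) + mine_instances(detector.predict(points))

        # (3) Reliable background mining would additionally mark regions that are
        #     safe to treat as negatives during the next round; omitted here.

    return detector
```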

Authors (8)
  1. Chenqiang Gao (21 papers)
  2. Chuandong Liu (4 papers)
  3. Jun Shu (19 papers)
  4. Fangcen Liu (10 papers)
  5. Jiang Liu (143 papers)
  6. Luyu Yang (8 papers)
  7. Xinbo Gao (194 papers)
  8. Deyu Meng (182 papers)
