MEDL-U: Uncertainty-aware 3D Automatic Annotation based on Evidential Deep Learning (2309.09599v3)

Published 18 Sep 2023 in cs.CV, cs.LG, and cs.RO

Abstract: Advancements in deep learning-based 3D object detection necessitate the availability of large-scale datasets. However, this requirement introduces the challenge of manual annotation, which is often both burdensome and time-consuming. To tackle this issue, the literature has seen the emergence of several weakly supervised frameworks for 3D object detection which can automatically generate pseudo labels for unlabeled data. Nevertheless, these generated pseudo labels contain noise and are not as accurate as those labeled by humans. In this paper, we present the first approach that addresses the inherent ambiguities present in pseudo labels by introducing an Evidential Deep Learning (EDL) based uncertainty estimation framework. Specifically, we propose MEDL-U, an EDL framework based on MTrans, which not only generates pseudo labels but also quantifies the associated uncertainties. However, applying EDL to 3D object detection presents three primary challenges: (1) relatively lower pseudo label quality in comparison to other auto-labelers; (2) excessively high evidential uncertainty estimates; and (3) lack of clear interpretability and effective utilization of uncertainties for downstream tasks. We tackle these issues through the introduction of an uncertainty-aware IoU-based loss, an evidence-aware multi-task loss function, and the implementation of a post-processing stage for uncertainty refinement. Our experimental results demonstrate that probabilistic detectors trained using the outputs of MEDL-U surpass deterministic detectors trained using outputs from previous 3D annotators on the KITTI val set for all difficulty levels. Moreover, MEDL-U achieves state-of-the-art results on the KITTI official test set compared to existing 3D automatic annotators.
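As background for the abstract's EDL-based uncertainty estimates, the sketch below shows how the Normal-Inverse-Gamma parameters of standard deep evidential regression (Amini et al.) map to a point prediction and to aleatoric and epistemic uncertainties. This is a minimal illustration of the general EDL regression formulation the paper builds on, not the authors' MEDL-U implementation; the function name and example values are hypothetical.

```python
def evidential_uncertainty(gamma: float, nu: float, alpha: float, beta: float):
    """Map Normal-Inverse-Gamma evidential parameters (gamma, nu, alpha, beta),
    as predicted per regression target in deep evidential regression,
    to a point estimate and its two uncertainty components.

    Requires alpha > 1 and nu > 0 for the moments below to exist.
    """
    prediction = gamma                        # E[mu], the predicted value
    aleatoric = beta / (alpha - 1.0)          # E[sigma^2], data noise
    epistemic = beta / (nu * (alpha - 1.0))   # Var[mu], model uncertainty
    return prediction, aleatoric, epistemic

# Larger nu and alpha (more "evidence") shrink the epistemic term,
# which is the behavior MEDL-U must keep calibrated for pseudo labels.
pred, alea, epi = evidential_uncertainty(gamma=2.0, nu=10.0, alpha=3.0, beta=1.0)
```

Note how epistemic uncertainty is the aleatoric estimate scaled down by the evidence parameter `nu`; the abstract's challenge (2), excessively high evidential uncertainty, corresponds to these quantities coming out poorly calibrated in practice.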

