Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection (2404.19384v1)
Abstract: Recent self-training techniques have shown notable improvements in unsupervised domain adaptation for 3D object detection (3D UDA). These techniques typically select pseudo labels, i.e., 3D boxes, to supervise models on the target domain. However, this selection process inevitably introduces unreliable 3D boxes, in which 3D points cannot be definitively assigned to foreground or background. Previous techniques mitigate this by reweighting these boxes as pseudo labels, but the boxes can still poison the training process. To resolve this problem, we propose a novel pseudo label refinery framework. Specifically, to improve the reliability of pseudo boxes during selection, we propose a complementary augmentation strategy that either removes all points within an unreliable box or replaces the box with a high-confidence one. Moreover, instances in high-beam datasets contain considerably more points than those in low-beam datasets, which further degrades the quality of pseudo labels during training. We alleviate this issue by generating additional proposals and aligning RoI features across domains. Experimental results demonstrate that our method effectively enhances the quality of pseudo labels and consistently surpasses state-of-the-art methods on six autonomous driving benchmarks. Code will be available at https://github.com/Zhanwei-Z/PERE.
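The two operations the abstract names for unreliable boxes (removing all points inside the box, or replacing it with a high-confidence box) can be illustrated with a minimal sketch. This is not the authors' implementation: the box layout `(cx, cy, cz, dx, dy, dz, yaw)`, the function names, and the choice to "replace" by translating a donor instance onto the unreliable box's center are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Assumed box layout: center (cx, cy, cz), size (dx, dy, dz), yaw about z.
Box = Tuple[float, float, float, float, float, float, float]


def point_in_box(p: Point, box: Box) -> bool:
    """Return True if point p lies inside the yaw-rotated 3D box."""
    cx, cy, cz, dx, dy, dz, yaw = box
    x, y, z = p[0] - cx, p[1] - cy, p[2] - cz
    # Rotate the offset by -yaw to express it in the box's local frame.
    c, s = math.cos(yaw), math.sin(yaw)
    lx = c * x + s * y
    ly = -s * x + c * y
    return abs(lx) <= dx / 2 and abs(ly) <= dy / 2 and abs(z) <= dz / 2


def remove_points_in_boxes(points: Sequence[Point],
                           boxes: Sequence[Box]) -> List[Point]:
    """Option 1: drop every point that falls inside any unreliable box."""
    return [p for p in points
            if not any(point_in_box(p, b) for b in boxes)]


def replace_with_reliable(points: Sequence[Point], unreliable_box: Box,
                          donor_points: Sequence[Point],
                          donor_box: Box) -> Tuple[List[Point], Box]:
    """Option 2 (assumed form): clear the unreliable box, then paste a
    high-confidence instance there by translating its points."""
    kept = remove_points_in_boxes(points, [unreliable_box])
    sx = unreliable_box[0] - donor_box[0]
    sy = unreliable_box[1] - donor_box[1]
    sz = unreliable_box[2] - donor_box[2]
    moved = [(p[0] + sx, p[1] + sy, p[2] + sz) for p in donor_points]
    # The pasted pseudo label keeps the donor's size and heading.
    new_box = unreliable_box[:3] + donor_box[3:]
    return kept + moved, new_box


if __name__ == "__main__":
    scene = [(0.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
    unreliable = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 0.0)
    print(remove_points_in_boxes(scene, [unreliable]))
```

Either way, no point in the training scene is left with an ambiguous foreground or background assignment, which is the property the abstract attributes to the strategy. How unreliable boxes are identified and how a donor box is chosen are left out here, since the abstract does not specify them.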
- Zhanwei Zhang
- Minghao Chen
- Shuai Xiao
- Liang Peng
- Hengjia Li
- Binbin Lin
- Ping Li
- Wenxiao Wang
- Boxi Wu
- Deng Cai