LiDAR-UDA: Self-ensembling Through Time for Unsupervised LiDAR Domain Adaptation (2309.13523v1)
Abstract: We introduce LiDAR-UDA, a novel two-stage, self-training-based Unsupervised Domain Adaptation (UDA) method for LiDAR segmentation. Existing self-training methods use a model trained on labeled source data to generate pseudo labels for target data, then refine the predictions by fine-tuning the network on those pseudo labels. These methods suffer from domain shifts caused by different LiDAR sensor configurations in the source and target domains. We propose two techniques to reduce sensor discrepancy and improve pseudo label quality: 1) LiDAR beam subsampling, which simulates different LiDAR scanning patterns by randomly dropping beams; 2) cross-frame ensembling, which exploits the temporal consistency of consecutive frames to generate more reliable pseudo labels. Our method is simple, generalizable, and does not incur any extra inference cost. We evaluate our method on several public LiDAR datasets and show that it outperforms state-of-the-art methods by more than $3.9\%$ mIoU on average across all scenarios. Code will be available at https://github.com/JHLee0513/LiDARUDA.
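The beam subsampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `assign_beam_ids` and `drop_random_beams`, the `keep_ratio` parameter, and the choice to recover beam indices by binning the elevation angle uniformly are all assumptions made for illustration. Real sensors often report a ring index directly and may have non-uniform beam spacing.

```python
import math
import random


def assign_beam_ids(points, num_beams):
    """Bin each (x, y, z) point into one of `num_beams` rings by elevation angle.

    Assumption: beams are spread uniformly between the lowest and highest
    elevation observed in the scan.
    """
    elevations = [math.atan2(z, math.hypot(x, y)) for x, y, z in points]
    lo, hi = min(elevations), max(elevations)
    span = (hi - lo) or 1.0  # avoid division by zero for a single-ring scan
    return [min(int((e - lo) / span * num_beams), num_beams - 1) for e in elevations]


def drop_random_beams(points, beam_ids, num_beams, keep_ratio, rng):
    """Keep a random subset of beams; return surviving points and kept beam ids."""
    num_keep = max(1, round(num_beams * keep_ratio))
    kept = set(rng.sample(range(num_beams), num_keep))
    survivors = [p for p, b in zip(points, beam_ids) if b in kept]
    return survivors, kept


def make_synthetic_scan(num_beams=8, points_per_beam=16, radius=10.0):
    """Build a toy scan: `num_beams` rings at evenly spaced elevation angles."""
    points = []
    for i in range(num_beams):
        elevation = -0.4 + 0.5 * i / (num_beams - 1)  # radians, hypothetical range
        for j in range(points_per_beam):
            azimuth = 2.0 * math.pi * j / points_per_beam
            points.append((radius * math.cos(azimuth),
                           radius * math.sin(azimuth),
                           radius * math.tan(elevation)))
    return points


if __name__ == "__main__":
    scan = make_synthetic_scan()
    ids = assign_beam_ids(scan, num_beams=8)
    sparse_scan, kept_beams = drop_random_beams(scan, ids, 8, 0.5, random.Random(0))
    print(len(scan), "points before,", len(sparse_scan), "points after")
```

Dropping whole beams rather than individual points keeps the ring structure of the scan intact, which is what makes the sparser scan resemble the output of a sensor with fewer beams.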