BaSAL: Size-Balanced Warm Start Active Learning for LiDAR Semantic Segmentation (2310.08035v2)
Abstract: Active learning strives to reduce the need for costly data annotation by repeatedly querying an annotator to label the most informative samples from a pool of unlabeled data and then training a model on these samples. We identify two problems with existing active learning methods for LiDAR semantic segmentation. First, they overlook the severe class imbalance inherent in LiDAR semantic segmentation datasets. Second, to bootstrap the active learning loop when no labeled data is available, they train their initial model on randomly selected samples, which yields poor performance; this is known as the cold start problem. To address these problems, we propose BaSAL, a size-balanced warm start active learning method, based on the observation that each object class has a characteristic size. By sampling object clusters according to their size, we create a dataset that is balanced by size and, in turn, more balanced by class. Furthermore, unlike established informativeness measures such as entropy or CoreSet, size-based sampling does not require a pretrained model, which effectively resolves the cold start problem. Our results show that this warm start improves the performance of the initial model by a large margin. Combining warm start and size-balanced sampling with established informativeness measures, our approach achieves performance comparable to training on the entire SemanticKITTI dataset while using only 5% of the annotations, and it outperforms existing active learning methods. We also match the existing state of the art in active learning on nuScenes. Our code is available at: https://github.com/Tony-WJR/BaSAL.
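Because size-based sampling needs neither labels nor a pretrained model, the selection step can be illustrated in a few lines. The sketch below is a minimal illustration of the idea under stated assumptions, not BaSAL's actual implementation: it assumes object clusters have already been extracted from a ground-removed scan (e.g., with HDBSCAN, cited below), uses the bounding-box diagonal as a hypothetical size proxy, and treats the bin edges and per-bin budget as placeholder values.

```python
import numpy as np
from collections import defaultdict

def cluster_size(points):
    """Size proxy: diagonal of the cluster's axis-aligned bounding box (metres)."""
    extent = points.max(axis=0) - points.min(axis=0)
    return float(np.linalg.norm(extent))

def size_balanced_sample(clusters, bin_edges=(0.5, 2.0, 8.0), per_bin=50, seed=0):
    """Select roughly the same number of object clusters from each size bin.

    clusters:  list of (N_i, 3) arrays of xyz points, e.g. HDBSCAN clusters
               of a ground-removed LiDAR scan.
    bin_edges: ascending size thresholds (illustrative values, not the
               paper's parameters); they induce len(bin_edges) + 1 bins.
    per_bin:   annotation budget per size bin, in clusters.
    Returns the indices of the selected clusters.
    """
    rng = np.random.default_rng(seed)
    bins = defaultdict(list)
    for idx, pts in enumerate(clusters):
        b = int(np.searchsorted(bin_edges, cluster_size(pts)))
        bins[b].append(idx)
    selected = []
    for members in bins.values():
        k = min(per_bin, len(members))  # a bin may hold fewer clusters than the budget
        selected.extend(rng.choice(members, size=k, replace=False).tolist())
    return selected
```

Since the selection depends only on raw geometry, it can pick the very first annotation batch before any model exists, which is what makes it a warm start for the active learning loop.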
- J. Behley, M. Garbade, A. Milioto, J. Quenzel, S. Behnke, C. Stachniss, and J. Gall, “SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences,” in ICCV, 2019, pp. 9297–9307.
- H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, “nuScenes: A multimodal dataset for autonomous driving,” in CVPR, 2020, pp. 11621–11631.
- W. K. Fong, R. Mohan, J. V. Hurtado, L. Zhou, H. Caesar, O. Beijbom, and A. Valada, “Panoptic nuScenes: A large-scale benchmark for LiDAR panoptic segmentation and tracking,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 3795–3802, 2022.
- B. Settles, “Active learning literature survey,” University of Wisconsin-Madison Department of Computer Sciences, 2009.
- T.-H. Wu, Y.-C. Liu, Y.-K. Huang, H.-Y. Lee, H.-T. Su, P.-C. Huang, and W. H. Hsu, “ReDAL: Region-based and diversity-aware active learning for point cloud semantic segmentation,” in ICCV, 2021, pp. 15510–15519.
- Z. Hu, X. Bai, R. Zhang, X. Wang, G. Sun, H. Fu, and C.-L. Tai, “LiDAL: Inter-frame uncertainty based active learning for 3D LiDAR semantic segmentation,” in ECCV, 2022, pp. 248–265.
- V. Nath, D. Yang, H. R. Roth, and D. Xu, “Warm start active learning with proxy labels and selection via semi-supervised fine-tuning,” in MICCAI, 2022.
- A. Holub, P. Perona, and M. C. Burl, “Entropy-based active learning for object recognition,” in CVPR Workshop. IEEE, 2008, pp. 1–8.
- O. Sener and S. Savarese, “Active learning for convolutional neural networks: A core-set approach,” arXiv preprint arXiv:1708.00489, 2017.
- W. H. Beluch, T. Genewein, A. Nürnberger, and J. M. Köhler, “The power of ensembles for active learning in image classification,” in CVPR, 2018, pp. 9368–9377.
- B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and scalable predictive uncertainty estimation using deep ensembles,” NeurIPS, vol. 30, 2017.
- K. S. Tan, H. Caesar, and O. Beijbom, “Cross-modality active learning for object detection,” U.S. Patent 11521010, Jan. 2024.
- Y. Gal, R. Islam, and Z. Ghahramani, “Deep Bayesian active learning with image data,” in ICML. PMLR, 2017, pp. 1183–1192.
- D. Yoo and I. S. Kweon, “Learning loss for active learning,” in CVPR, 2019, pp. 93–102.
- S. Sinha, S. Ebrahimi, and T. Darrell, “Variational adversarial active learning,” in ICCV, 2019, pp. 5972–5981.
- S. Tong and D. Koller, “Support vector machine active learning with applications to text classification,” Journal of Machine Learning Research, vol. 2, pp. 45–66, 2001.
- H. T. Nguyen and A. Smeulders, “Active learning using pre-clustering,” in ICML, 2004, p. 79.
- Y. Guo, “Active instance sampling via matrix partition,” NeurIPS, vol. 23, 2010.
- D. Gudovskiy, A. Hodgkinson, T. Yamaguchi, and S. Tsukizawa, “Deep active learning for biased datasets via fisher kernel self-supervision,” in CVPR, 2020, pp. 9041–9049.
- A. Kirsch, J. Van Amersfoort, and Y. Gal, “BatchBALD: Efficient and diverse batch acquisition for deep Bayesian active learning,” NeurIPS, vol. 32, 2019.
- J. T. Ash, C. Zhang, A. Krishnamurthy, J. Langford, and A. Agarwal, “Deep batch active learning by diverse, uncertain gradient lower bounds,” arXiv preprint arXiv:1906.03671, 2019.
- A. Ghita, B. Antoniussen, W. Zimmer, R. Greer, C. Creß, A. Møgelmose, M. M. Trivedi, and A. C. Knoll, “ActiveAnno3D: An active learning framework for multi-modal 3D object detection,” arXiv preprint arXiv:2402.03235, 2024.
- H. Liang, C. Jiang, D. Feng, X. Chen, H. Xu, X. Liang, W. Zhang, Z. Li, and L. Van Gool, “Exploring geometry-aware contrast and clustering harmonization for self-supervised 3D object detection,” in ICCV, 2021, pp. 3293–3302.
- S. Segal, N. Kumar, S. Casas, W. Zeng, M. Ren, J. Wang, and R. Urtasun, “Just label what you need: Fine-grained active selection for perception and prediction through partially labeled scenes,” arXiv preprint arXiv:2104.03956, 2021.
- J. Attenberg and Ş. Ertekin, “Class imbalance and active learning,” Imbalanced Learning: Foundations, Algorithms, and Applications, pp. 101–149, 2013.
- K. Tomanek and U. Hahn, “Reducing class imbalance during active learning for named entity annotation,” in Proceedings of the Fifth International Conference on Knowledge Capture, 2009, pp. 105–112.
- B. Settles and M. Craven, “An analysis of active learning strategies for sequence labeling tasks,” in Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2008, pp. 1070–1079.
- P. Donmez and J. G. Carbonell, “Paired-sampling in density-sensitive active learning,” Carnegie Mellon University, 2008.
- S. Ertekin, J. Huang, L. Bottou, and L. Giles, “Learning on the border: Active learning in imbalanced data classification,” in Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM), 2007, pp. 127–136.
- M. Bloodgood and K. Vijay-Shanker, “Taking into account the differences between actively and passively acquired data: The case of active learning with support vector machines for imbalanced datasets,” arXiv preprint arXiv:1409.4835, 2014.
- U. Aggarwal, A. Popescu, and C. Hudelot, “Active learning for imbalanced datasets,” in WACV, 2020, pp. 1428–1437.
- M. Yuan, H.-T. Lin, and J. Boyd-Graber, “Cold-start active learning through self-supervised language modeling,” arXiv preprint arXiv:2010.09535, 2020.
- K. Pourahmadi, P. Nooralinejad, and H. Pirsiavash, “A simple baseline for low-budget active learning,” arXiv preprint arXiv:2110.12033, 2021.
- L. Chen, Y. Bai, S. Huang, Y. Lu, B. Wen, A. Yuille, and Z. Zhou, “Making your first choice: to address cold start problem in medical active learning,” in Medical Imaging with Deep Learning. PMLR, 2024, pp. 496–525.
- G. Hacohen, A. Dekel, and D. Weinshall, “Active learning on a budget: Opposite strategies suit high and low budgets,” arXiv preprint arXiv:2202.02794, 2022.
- O. Yehuda, A. Dekel, G. Hacohen, and D. Weinshall, “Active learning through a covering lens,” NeurIPS, vol. 35, pp. 22354–22367, 2022.
- L. McInnes, J. Healy, and S. Astels, “hdbscan: Hierarchical density based clustering,” The Journal of Open Source Software, vol. 2, no. 11, p. 205, 2017.
- H. Tang, Z. Liu, S. Zhao, Y. Lin, J. Lin, H. Wang, and S. Han, “Searching efficient 3D architectures with sparse point-voxel convolution,” in ECCV, 2020, pp. 685–702.
- C. Choy, J. Gwak, and S. Savarese, “4D spatio-temporal ConvNets: Minkowski convolutional neural networks,” in CVPR, 2019, pp. 3075–3084.
- A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite,” in CVPR, 2012, pp. 3354–3361.
- D. Wang and Y. Shang, “A new active labeling method for deep learning,” in International Joint Conference on Neural Networks (IJCNN), 2014, pp. 112–119.
- Y. Gal and Z. Ghahramani, “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning,” in ICML. PMLR, 2016, pp. 1050–1059.
- Y. Lin, G. Vosselman, Y. Cao, and M. Yang, “Efficient training of semantic point cloud segmentation via active learning,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 2, pp. 243–250, 2020.
- S. Lee, H. Lim, and H. Myung, “Patchwork++: Fast and robust ground segmentation solving partial under-segmentation using 3D point cloud,” in IROS, 2022, pp. 13276–13283.
- Z. Liang, X. Xu, S. Deng, L. Cai, T. Jiang, and K. Jia, “Exploring diversity-based active learning for 3D object detection in autonomous driving,” arXiv preprint arXiv:2205.07708, 2022.
- Z. Liu, X. Qi, and C.-W. Fu, “One thing one click: A self-training approach for weakly supervised 3D semantic segmentation,” in CVPR, 2021, pp. 1726–1736.
- M. Liu, Y. Zhou, C. R. Qi, B. Gong, H. Su, and D. Anguelov, “LESS: Label-efficient semantic segmentation for LiDAR point clouds,” in ECCV, 2022, pp. 70–89.
- J. Papon, A. Abramov, M. Schoeler, and F. Wörgötter, “Voxel cloud connectivity segmentation - supervoxels for point clouds,” in CVPR, 2013, pp. 2027–2034.