Semi-Supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix (2312.08009v2)
Abstract: Class-agnostic motion prediction methods aim to comprehend motion within open-world scenarios, which is important for autonomous driving systems. However, training a high-performance model in a fully supervised manner typically requires substantial amounts of manually annotated data, which are expensive and time-consuming to obtain. To address this challenge, our study explores the potential of semi-supervised learning (SSL) for class-agnostic motion prediction. Our SSL framework adopts a consistency-based self-training paradigm, enabling the model to learn from unlabeled data by generating pseudo labels through test-time inference. To improve the quality of pseudo labels, we propose a novel motion selection and re-generation module that selects reliable pseudo labels and re-generates unreliable ones. Furthermore, we propose two data augmentation strategies, temporal sampling and BEVMix, which facilitate consistency regularization in SSL. Experiments on nuScenes demonstrate that our SSL method surpasses the self-supervised approach by a large margin while using only a tiny fraction of labeled data. Furthermore, our method performs comparably to weakly supervised and some fully supervised methods. These results highlight the ability of our method to strike a favorable balance between annotation cost and performance. Code will be available at https://github.com/kwwcv/SSMP.
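To make the consistency-based self-training idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a bird's-eye-view (BEV) grid in which every cell holds a 2D displacement, a left-right flip as a stand-in augmentation (temporal sampling and BEVMix are not modeled here), and a precomputed boolean reliability mask. Unreliable cells are simply ignored rather than re-generated. All names (`flip_x`, `flip_mask`, `masked_l1`, `unlabeled_loss`) are hypothetical.

```python
from typing import List, Tuple

Motion = Tuple[float, float]      # (dx, dy) displacement of one BEV cell
MotionMap = List[List[Motion]]    # rows x cols grid of per-cell motion
Mask = List[List[bool]]           # True where a pseudo label is trusted


def flip_x(motion: MotionMap) -> MotionMap:
    """Mirror a BEV motion map left-right; the x-displacement changes sign."""
    return [[(-dx, dy) for dx, dy in reversed(row)] for row in motion]


def flip_mask(mask: Mask) -> Mask:
    """Mirror a reliability mask the same way as the motion map."""
    return [list(reversed(row)) for row in mask]


def masked_l1(pred: MotionMap, target: MotionMap, mask: Mask) -> float:
    """Mean L1 error over the cells where mask is True (0.0 if none)."""
    total, count = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for (px, py), (tx, ty), keep in zip(p_row, t_row, m_row):
            if keep:
                total += abs(px - tx) + abs(py - ty)
                count += 1
    return total / count if count else 0.0


def unlabeled_loss(student_pred_on_flipped: MotionMap,
                   pseudo_label: MotionMap,
                   reliable: Mask) -> float:
    """Consistency loss: the prediction on the flipped input should match
    the flipped pseudo label, counted only on cells marked reliable."""
    target = flip_x(pseudo_label)
    return masked_l1(student_pred_on_flipped, target, flip_mask(reliable))


# Toy 1x2 grid: pseudo labels come from inference on the original view.
pseudo = [[(1.0, 0.0), (0.0, 2.0)]]
reliable = [[True, False]]               # only the first cell is trusted
student = [[(0.0, 2.0), (-0.5, 0.0)]]    # prediction on the flipped view
print(unlabeled_loss(student, pseudo, reliable))  # 0.5
```

In this toy case only the trusted cell contributes: after flipping, its target is (-1.0, 0.0), the prediction is (-0.5, 0.0), and the loss is 0.5. In a full pipeline this term would be added to a supervised loss on the labeled subset.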
- Kewei Wang
- Yizheng Wu
- Zhiyu Pan
- Xingyi Li
- Ke Xian
- Zhe Wang
- Zhiguo Cao
- Guosheng Lin