SeMoLi: What Moves Together Belongs Together (2402.19463v2)
Abstract: We tackle semi-supervised object detection based on motion cues. Recent results suggest that heuristic-based clustering methods, in conjunction with object trackers, can pseudo-label instances of moving objects, and that these pseudo-labels can serve as supervisory signals to train 3D object detectors on Lidar data without manual supervision. We rethink this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner. We leverage recent advances in scene flow estimation to obtain point trajectories, from which we extract long-term, class-agnostic motion patterns. Revisiting correlation clustering in the context of message passing networks, we learn to group those motion patterns to cluster points into object instances. By estimating the full extent of the objects, we obtain per-scan 3D bounding boxes that we use to supervise a Lidar object detection network. Our method not only outperforms prior heuristic-based approaches (57.5 AP, a +14 improvement over prior work); more importantly, we show that we can pseudo-label and train object detectors across datasets.
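The grouping idea in the abstract (points that move together are clustered into one object, which is then enclosed in a box) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper learns the grouping with a message passing network, whereas the sketch below assumes a hand-crafted affinity as a stand-in for learned edge scores, uses a greedy pivot heuristic instead of a proper correlation clustering solver, and fits a plain axis-aligned box rather than estimating the full object extent. All function names, parameters, and scale values are hypothetical.

```python
import numpy as np


def edge_scores(traj, pos_scale=2.0, vel_scale=0.5):
    """Hand-crafted stand-in for learned edge scores (an assumption).

    traj: array of shape (N, T, 3), one 3D trajectory per point.
    Returns an (N, N) matrix in (0, 1]; higher means "more likely same object".
    """
    start = traj[:, 0, :]            # (N, 3) positions in the first scan
    vel = np.diff(traj, axis=1)      # (N, T-1, 3) per-step displacements
    # Pairwise distance between starting positions.
    d_pos = np.linalg.norm(start[:, None] - start[None, :], axis=-1)
    # Pairwise distance between displacements, averaged over time steps.
    d_vel = np.linalg.norm(vel[:, None] - vel[None, :], axis=-1).mean(axis=-1)
    return np.exp(-d_pos / pos_scale - d_vel / vel_scale)


def pivot_cluster(scores, threshold=0.5):
    """Greedy pivot heuristic for correlation clustering.

    Edges with score > threshold count as "same object"; all others count
    as "different object". Each unassigned point in turn becomes a pivot
    and claims its unassigned positive neighbours.
    """
    n = scores.shape[0]
    labels = np.full(n, -1, dtype=int)
    next_label = 0
    for pivot in range(n):           # fixed order keeps the sketch deterministic
        if labels[pivot] != -1:
            continue
        members = (scores[pivot] > threshold) & (labels == -1)
        members[pivot] = True
        labels[members] = next_label
        next_label += 1
    return labels


def axis_aligned_box(points):
    """Return (center, size) of the axis-aligned box around (M, 3) points."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return (lo + hi) / 2.0, hi - lo


# Toy scene: two groups of three points with different motion.
t = np.arange(4)[None, :, None]                       # 4 time steps
offsets = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 0.5, 0.0]])
group_a = offsets[:, None, :] + t * np.array([1.0, 0.0, 0.0])
group_b = (offsets + np.array([10.0, 0.0, 0.0]))[:, None, :] \
    + t * np.array([0.0, 0.5, 0.0])
traj = np.concatenate([group_a, group_b], axis=0)     # (6, 4, 3)

labels = pivot_cluster(edge_scores(traj))
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]

# One pseudo-label box per cluster, fitted to the first scan only.
for k in np.unique(labels):
    center, size = axis_aligned_box(traj[labels == k, 0, :])
    print(k, center, size)
```

In this toy scene, each group shares a common displacement per step, so within-group scores stay above the threshold while cross-group scores fall far below it. A learned scoring function would replace `edge_scores`, leaving the clustering interface unchanged.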