Overview of PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection
The paper presents PseCo, a framework that enhances the semi-supervised object detection (SSOD) paradigm by integrating pseudo labeling with consistency training. Both techniques exploit unlabeled data to improve detection accuracy without incurring the high annotation costs of fully labeled datasets.
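As a concrete illustration of the pseudo-labeling side of this pipeline, the sketch below shows the standard teacher-student filtering step, where only high-confidence teacher detections become pseudo labels for the student. This is a minimal NumPy sketch; the function name and the 0.9 threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def filter_pseudo_labels(boxes, scores, score_thresh=0.9):
    """Keep only teacher detections whose classification confidence
    exceeds the threshold; these become pseudo labels for the student.

    boxes:  (N, 4) array of [x1, y1, x2, y2] coordinates
    scores: (N,) array of classification confidences
    """
    keep = scores >= score_thresh
    return boxes[keep], scores[keep]

# Two teacher detections: only the confident one survives filtering.
boxes = np.array([[10.0, 10.0, 50.0, 50.0],
                  [20.0, 20.0, 60.0, 60.0]])
scores = np.array([0.95, 0.40])
pseudo_boxes, pseudo_scores = filter_pseudo_labels(boxes, scores)
```

Score-only filtering of this kind is exactly the limitation the paper targets: a box can be confidently classified yet poorly localized, which motivates the methods below.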
Key Contributions and Techniques
The authors identify critical limitations in existing SSOD frameworks, where the focus predominantly lies on classification scores, often neglecting the localization precision of pseudo boxes and feature-level consistency. To address these gaps, the paper introduces two principal methods: Noisy Pseudo Box Learning (NPL) and Multi-view Scale-invariant Learning (MSL).
- Noisy Pseudo Box Learning (NPL):
- Prediction-guided Label Assignment (PLA): This strategy leverages model predictions rather than relying solely on Intersection-over-Union (IoU) thresholds. By doing so, it ensures robustness against inaccurately localized pseudo boxes.
- Positive-proposal Consistency Voting (PCV): PCV quantifies the localization quality of a pseudo box via the regression consistency among its positive proposals, and weights the regression loss accordingly.
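One plausible way to formalize PCV's quality score is to measure how tightly the positive proposals' regressed boxes agree with one another, e.g. their mean IoU against the average regressed box. The NumPy sketch below is a hypothetical formalization; the paper's exact scoring formula may differ.

```python
import numpy as np

def iou(a, b):
    """IoU between two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def pcv_weight(regressed_boxes):
    """Score a pseudo box by how consistently its positive proposals
    regress to the same location: mean IoU against the average box."""
    mean_box = np.mean(regressed_boxes, axis=0)
    return float(np.mean([iou(b, mean_box) for b in regressed_boxes]))

# Tight agreement among proposals -> weight near 1.
consistent = np.array([[10.0, 10.0, 50.0, 50.0],
                       [11.0, 10.0, 51.0, 50.0],
                       [10.0, 11.0, 50.0, 51.0]])
# Scattered regressions -> much lower weight.
scattered = np.array([[10.0, 10.0, 50.0, 50.0],
                      [30.0, 30.0, 90.0, 90.0]])
```

The resulting weight in [0, 1] can then scale the per-box regression loss, down-weighting pseudo boxes whose localization is ambiguous.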
- Multi-view Scale-invariant Learning (MSL):
- This approach enhances consistency training by integrating both label-level and feature-level consistency. Unlike traditional methods that only target label consistency, MSL also aligns features using multi-scale views, strengthening scale invariance.
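A minimal sketch of feature-level consistency under scale change: features from a full-resolution view, pooled down to the resolution of a half-scale view of the same image, should match the features extracted from that half-scale view. The names and the simple L2 objective below are assumptions for illustration; the paper's alignment mechanism is more involved.

```python
import numpy as np

def downsample2x(feat):
    """Average-pool a (C, H, W) feature map by a factor of 2."""
    c, h, w = feat.shape
    return feat.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def feature_consistency_loss(feat_full, feat_half):
    """L2 distance between pooled full-resolution features and the
    features extracted from a 0.5x-scaled view of the same image."""
    aligned = downsample2x(feat_full)
    return float(np.mean((aligned - feat_half) ** 2))

# Identical content at both scales incurs zero loss.
loss = feature_consistency_loss(np.ones((8, 4, 4)), np.ones((8, 2, 2)))
```

Minimizing such a loss alongside the label-level objective pushes the detector's features toward scale invariance, which is the stated goal of MSL.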
Experimental Validation
The efficacy of the proposed PseCo framework is demonstrated on the COCO benchmark. Notably, PseCo surpasses state-of-the-art methods, improving over the previous best method (Soft Teacher) by 2.0, 1.8, and 2.0 points under 1%, 5%, and 10% labeling ratios, respectively. Furthermore, PseCo halves the training time of competitive approaches, a significant advance in both accuracy and computational efficiency. For instance, when the full COCO training set is supplemented with an additional 123K unlabeled images, PseCo reaches 46.1% mAP, a substantial improvement over existing benchmarks.
Theoretical and Practical Implications
From a theoretical perspective, the proposed framework effectively integrates object detection nuances into semi-supervised learning strategies, offering a nuanced understanding of how pseudo labels and consistency can be adapted for object detection tasks. Practically, PseCo showcases the potential of leveraging less annotated data while maintaining high performance, a critical consideration for large-scale deployment where annotation costs are prohibitive.
Future Directions
While PseCo represents a significant advance, several avenues remain open for further exploration. The framework could be scaled to other detector architectures, or its principles extended to other domains within computer vision. Future research could also probe the dynamic interaction between pseudo labeling and feature alignment to push the boundaries of semi-supervised learning.
In summary, the paper provides a robust framework for semi-supervised object detection, challenging and extending traditional approaches to maximize the utility of unlabeled data through integrated pseudo-labeling and consistency-training techniques.