Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection (2209.01589v3)

Published 4 Sep 2022 in cs.CV

Abstract: In this study, we dive deep into the inconsistency of pseudo targets in semi-supervised object detection (SSOD). Our core observation is that the oscillating pseudo-targets undermine the training of an accurate detector. It injects noise into the student's training, leading to severe overfitting problems. Therefore, we propose a systematic solution, termed ConsistentTeacher, to reduce the inconsistency. First, adaptive anchor assignment (ASA) substitutes the static IoU-based strategy, which enables the student network to be resistant to noisy pseudo-bounding boxes. Then we calibrate the subtask predictions by designing a 3D feature alignment module (FAM-3D). It allows each classification feature to adaptively query the optimal feature vector for the regression task at arbitrary scales and locations. Lastly, a Gaussian Mixture Model (GMM) dynamically revises the score threshold of pseudo-bboxes, which stabilizes the number of ground truths at an early stage and remedies the unreliable supervision signal during training. ConsistentTeacher provides strong results on a large range of SSOD evaluations. It achieves 40.0 mAP with ResNet-50 backbone given only 10% of annotated MS-COCO data, which surpasses previous baselines using pseudo labels by around 3 mAP. When trained on fully annotated MS-COCO with additional unlabeled data, the performance further increases to 47.7 mAP. Our code is available at https://github.com/Adamdad/ConsistentTeacher.

Analysis of Consistent-Teacher in Semi-supervised Object Detection

The paper "Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection" addresses the problem of inconsistent pseudo-targets in semi-supervised object detection (SSOD). Semi-supervised learning aims to train models effectively from a large pool of unlabeled data plus a small labeled set. In object detection, this means generating pseudo labels (boxes and class scores) for objects in the unlabeled images and training a student detector on them. The paper identifies the oscillation of these pseudo labels over the course of training as a significant obstacle to detector accuracy, and introduces the Consistent-Teacher framework to mitigate it.

Main Contributions

The paper presents the Consistent-Teacher framework, which comprises several novel strategies:

Adaptive Anchor Assignment (ASA): The static IoU-based anchor assignment is replaced with an adaptive, cost-based assignment, which reduces sensitivity to the noise inherent in pseudo-bounding boxes. This adaptive sample assignment makes the student network more resilient to erroneous pseudo-bounding boxes, which limits overfitting and stabilizes training.
3D Feature Alignment Module (FAM-3D): This module calibrates the classification and regression subtasks of the detector. Each classification feature adaptively queries the optimal feature vector for the regression task at arbitrary scales and locations, which improves the alignment between the two subtasks, reduces label drifting, and makes the optimization objectives more consistent.
Gaussian Mixture Model (GMM)-based Thresholding: The GMM approach dynamically adjusts the score threshold for pseudo-bounding boxes, which is crucial during the early stages of training. This adaptability provides stability and reduces erratic label shifts caused by static thresholding, allowing for more coherent training of the SSOD models.
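
The cost-based assignment idea can be illustrated with a minimal sketch. It assumes the matching cost is a weighted sum of a classification term, an IoU-based regression term, and a center-distance term, and that the k lowest-cost anchors become positives. The function names, the weights, and the fixed k are hypothetical choices for illustration, not taken from the paper or its repository.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center(box):
    """Center point of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def assignment_cost(cls_score, pred_box, anchor_box, target_box,
                    reg_weight=2.0, dist_weight=0.5):
    """Assumed cost: classification + weighted regression + weighted distance."""
    cls_cost = -math.log(max(cls_score, 1e-9))      # low when the score is high
    reg_cost = 1.0 - iou(pred_box, target_box)      # low when the boxes overlap
    (ax, ay), (tx, ty) = center(anchor_box), center(target_box)
    diag = math.hypot(target_box[2] - target_box[0],
                      target_box[3] - target_box[1])
    dist_cost = math.hypot(ax - tx, ay - ty) / max(diag, 1e-9)
    return cls_cost + reg_weight * reg_cost + dist_weight * dist_cost


def assign_positives(cls_scores, pred_boxes, anchor_boxes, target_box, k=2):
    """Return the indices of the k lowest-cost anchors for one pseudo-box."""
    costs = [assignment_cost(s, p, a, target_box)
             for s, p, a in zip(cls_scores, pred_boxes, anchor_boxes)]
    order = sorted(range(len(costs)), key=costs.__getitem__)
    return order[:k], costs


# Toy data: three anchors and the student's current predictions for them.
scores = [0.9, 0.6, 0.2]
preds = [(10, 10, 50, 50), (15, 15, 55, 55), (100, 100, 140, 140)]
anchors = [(10, 10, 50, 50), (20, 20, 60, 60), (100, 100, 140, 140)]

clean_target = (10, 10, 50, 50)
noisy_target = (14, 12, 54, 52)   # the same pseudo-box, slightly jittered

positives_clean, _ = assign_positives(scores, preds, anchors, clean_target)
positives_noisy, _ = assign_positives(scores, preds, anchors, noisy_target)
print(positives_clean, positives_noisy)
```

In this toy case the same two anchors are selected for the clean and the jittered pseudo-box, because the selection depends on the ranking of costs rather than on whether each anchor crosses a fixed IoU cutoff.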
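
The GMM-based thresholding can likewise be sketched in a few lines. The sketch assumes that the confidence scores of one class are modeled as a two-component 1D Gaussian mixture (a low-score "negative" and a high-score "positive" component), and that the threshold is the smallest score whose posterior for the positive component exceeds 0.5. That selection rule, the initialization, and all names below are assumptions made for illustration; the paper's exact rule may differ.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(scores, iters=50, min_var=1e-4):
    """Plain EM for a two-component 1D mixture; index 0 = negative, 1 = positive."""
    lo, hi = min(scores), max(scores)
    means = [lo, hi]                              # start at the extremes
    init_var = max(((hi - lo) / 4.0) ** 2, min_var)
    variances = [init_var, init_var]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each score.
        resp = []
        for s in scores:
            p = [weights[c] * normal_pdf(s, means[c], variances[c]) for c in (0, 1)]
            total = p[0] + p[1]
            resp.append([0.5, 0.5] if total == 0.0 else [p[0] / total, p[1] / total])
        # M-step: re-estimate weight, mean, and variance of each component.
        for c in (0, 1):
            nc = sum(r[c] for r in resp)
            if nc < 1e-12:
                continue
            weights[c] = nc / len(scores)
            means[c] = sum(r[c] * s for r, s in zip(resp, scores)) / nc
            var = sum(r[c] * (s - means[c]) ** 2 for r, s in zip(resp, scores)) / nc
            variances[c] = max(var, min_var)      # floor to avoid collapse
    if means[0] > means[1]:                       # keep index 1 as the positive mode
        weights.reverse(); means.reverse(); variances.reverse()
    return weights, means, variances


def positive_posterior(score, weights, means, variances):
    """Posterior probability that a score belongs to the positive component."""
    p_neg = weights[0] * normal_pdf(score, means[0], variances[0])
    p_pos = weights[1] * normal_pdf(score, means[1], variances[1])
    total = p_neg + p_pos
    return p_pos / total if total > 0.0 else 0.0


def adaptive_threshold(scores, fallback=0.5):
    """Smallest score judged 'positive' by the fitted mixture (assumed rule)."""
    if len(scores) < 2 or max(scores) == min(scores):
        return fallback
    weights, means, variances = fit_two_component_gmm(scores)
    accepted = [s for s in scores
                if s > means[0]                   # ignore tail effects below the negative mode
                and positive_posterior(s, weights, means, variances) > 0.5]
    return min(accepted) if accepted else fallback


# Early in training the detector is under-confident; later the scores spread out.
early_scores = [0.02, 0.03, 0.05, 0.06, 0.3, 0.35, 0.4]
late_scores = [0.05, 0.08, 0.1, 0.12, 0.15, 0.8, 0.85, 0.9, 0.92]

early_thr = adaptive_threshold(early_scores)
late_thr = adaptive_threshold(late_scores)
print(early_thr, late_thr)
```

With these toy scores, a fixed cutoff such as 0.5 would discard every early prediction, whereas the fitted threshold follows the score distribution and keeps the high-score cluster in both batches.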

Experimental Results

The paper provides empirical evidence of Consistent-Teacher’s efficacy through evaluations on the MS-COCO and PASCAL VOC datasets. The framework outperforms other leading techniques across various labeling ratios, particularly when labeled training data is scarce. For instance, it attains 40.0 mAP when trained on just 10% of the annotated MS-COCO data with a ResNet-50 backbone, an improvement of approximately 3 mAP over previous pseudo-label baselines. When trained on the fully annotated MS-COCO set together with additional unlabeled data, performance increases further to 47.7 mAP. These results underscore Consistent-Teacher’s robustness and position it as a strong reference point for semi-supervised object detection.

Practical and Theoretical Implications

The practical implications are significant for building object detection systems in settings where labeled data is the bottleneck. By improving the consistency of pseudo-labels, Consistent-Teacher makes training more stable and thereby improves detector performance in applications such as surveillance, autonomous vehicles, and robotics, where large amounts of unlabeled data are available.

Theoretically, the approach broadens the understanding of task alignment and thresholding strategies in semi-supervised detection. Consistent-Teacher shows that aligning the classification and regression tasks and adapting the pseudo-label score threshold during training, rather than fixing it in advance, can significantly improve the training process.

Future Prospects

The advancement presented in this paper opens several avenues for future research. One potential direction involves extending adaptive assignment and feature alignment techniques to other machine learning domains beyond object detection. Another avenue is exploring novel ways to further minimize inconsistencies through end-to-end approaches without relying on pre-set heuristics. Integrating Consistent-Teacher’s functionality into cutting-edge detection architectures such as DETR could also yield beneficial insights and performance improvements.

In summary, the "Consistent-Teacher" framework represents a significant contribution to the field of semi-supervised object detection, with comprehensive results and insights that pave the way for continued advancements in utilizing both labeled and unlabeled data effectively.

Authors (9)

Xinjiang Wang (32 papers)
Xingyi Yang (45 papers)
Shilong Zhang (32 papers)
Yijiang Li (36 papers)
Litong Feng (22 papers)
Shijie Fang (11 papers)
Chengqi Lyu (13 papers)
Kai Chen (512 papers)
Wayne Zhang (42 papers)

Citations (40)

GitHub

GitHub - Adamdad/ConsistentTeacher: [CVPR2023 Highlight] Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection (289 stars)