Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision (2106.01226v2)

Published 2 Jun 2021 in cs.CV

Abstract: In this paper, we study the semi-supervised semantic segmentation problem via exploring both labeled data and extra unlabeled data. We propose a novel consistency regularization approach, called cross pseudo supervision (CPS). Our approach imposes the consistency on two segmentation networks perturbed with different initialization for the same input image. The pseudo one-hot label map, output from one perturbed segmentation network, is used to supervise the other segmentation network with the standard cross-entropy loss, and vice versa. The CPS consistency has two roles: encourage high similarity between the predictions of two perturbed networks for the same input image, and expand training data by using the unlabeled data with pseudo labels. Experiment results show that our approach achieves the state-of-the-art semi-supervised segmentation performance on Cityscapes and PASCAL VOC 2012. Code is available at https://git.io/CPS.

Authors (4)

Xiaokang Chen
Yuhui Yuan
Gang Zeng
Jingdong Wang

Citations (675)

Summary

The paper "Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision" addresses semi-supervised semantic segmentation, a computer vision task that depends on pixel-level annotations. Such annotations are far more labor-intensive to produce than the labels needed for object detection or image classification, so collecting enough labeled data is a practical challenge.

Cross Pseudo Supervision Approach

The central contribution of the paper is Cross Pseudo Supervision (CPS), a consistency regularization technique that trains on labeled and unlabeled data together, so available images can be used without exhaustive annotation. CPS uses two segmentation networks with the same architecture but different initializations. For a given input image, each network produces a pseudo one-hot label map, which then supervises the other network through the standard cross-entropy loss, and vice versa. This consistency plays two roles: it encourages the two networks to make similar predictions for the same image, and it enlarges the training set by using unlabeled images together with their pseudo labels.
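The cross supervision described above can be sketched on a toy example. This is a minimal illustration of the arithmetic only, not the authors' implementation: the function names (`pseudo_labels`, `cps_loss`) and the toy logits are assumptions made for this sketch, each "image" is a flat list of per-pixel class logits, and there are no real networks or gradients involved.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_labels(logits_map):
    """Hard (one-hot) pseudo label per pixel: index of the largest logit."""
    return [max(range(len(px)), key=px.__getitem__) for px in logits_map]

def cross_entropy(logits_map, labels):
    """Mean per-pixel cross-entropy between predictions and hard labels."""
    losses = [-math.log(softmax(px)[y]) for px, y in zip(logits_map, labels)]
    return sum(losses) / len(losses)

def cps_loss(logits_a, logits_b):
    """Each network is supervised by the other network's pseudo labels.

    In a real training loop the pseudo labels would be treated as fixed
    targets, so no gradient flows through the argmax.
    """
    labels_a = pseudo_labels(logits_a)
    labels_b = pseudo_labels(logits_b)
    return cross_entropy(logits_a, labels_b) + cross_entropy(logits_b, labels_a)

# Toy "image" with 3 pixels and 2 classes; the values are made up.
net_a = [[2.0, 0.0], [0.0, 1.0], [1.5, 0.5]]
net_b = [[1.0, 0.0], [0.5, 0.0], [2.0, 0.0]]

agree = cps_loss(net_a, net_a)     # identical predictions
disagree = cps_loss(net_a, net_b)  # the second pixel gets different labels
print(agree < disagree)
```

The loss is lower when the two sets of predictions agree, which is the sense in which CPS pushes the two networks toward consistent outputs. In practice this term would be added to the ordinary supervised loss on labeled images, typically scaled by a trade-off weight (a hypothetical hyperparameter here).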

Empirical Evidence

CPS is evaluated on two widely used benchmarks, Cityscapes and PASCAL VOC 2012, where it achieves state-of-the-art performance for semi-supervised semantic segmentation. Combining CPS with CutMix augmentation improves results further, particularly when labeled data is scarce. For example, CPS yielded improvements of up to 4.22% in mIoU (mean intersection over union) on Cityscapes with ResNet-101 in low-label settings, compared to baseline setups.
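For readers unfamiliar with the metric, mIoU (mean intersection over union) averages, over classes, the overlap between predicted and ground-truth pixels divided by their union. The sketch below is a simplified single-image version written for illustration; the function name `mean_iou` and the toy labels are assumptions, and benchmark evaluations usually accumulate counts over a whole dataset rather than one image.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Flattened 4-pixel label maps with 2 classes; the values are made up.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]

# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
score = mean_iou(pred, target, num_classes=2)
print(round(score, 4))
```

Because every class contributes equally to the average, small or rare classes influence the score as much as large ones.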

Theoretical and Practical Implications

The work has both theoretical and practical implications. Theoretically, it shows that pseudo-labeling between two differently initialized networks can act as an effective consistency regularizer, which opens a line of research in semi-supervised learning. Practically, it reduces the annotation effort needed to train segmentation networks, which benefits applications with large datasets where pixel-level labeling is resource-intensive.

Future Developments

Given these results, future work may explore other model architectures or the augmentation techniques used during training. Another possible direction is to study other forms of regularization that incorporate the CPS consistency mechanism, extending the range of semi-supervised strategies.

In summary, the paper makes a significant contribution to semi-supervised semantic segmentation. By reducing the dependence on large labeled datasets while maintaining strong performance, CPS is a step toward more efficient and cost-effective segmentation systems.
