
Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation (2111.12903v3)

Published 25 Nov 2021 in cs.CV

Abstract: Consistency learning using input image, feature, or network perturbations has shown remarkable results in semi-supervised semantic segmentation, but this approach can be seriously affected by inaccurate predictions of unlabelled training images. There are two consequences of these inaccurate predictions: 1) the training based on the "strict" cross-entropy (CE) loss can easily overfit prediction mistakes, leading to confirmation bias; and 2) the perturbations applied to these inaccurate predictions will use potentially erroneous predictions as training signals, degrading consistency learning. In this paper, we address the prediction accuracy problem of consistency learning methods with novel extensions of the mean-teacher (MT) model, which include a new auxiliary teacher, and the replacement of MT's mean square error (MSE) by a stricter confidence-weighted cross-entropy (Conf-CE) loss. The accurate prediction by this model allows us to use a challenging combination of network, input data and feature perturbations to improve the consistency learning generalisation, where the feature perturbations consist of a new adversarial perturbation. Results on public benchmarks show that our approach achieves remarkable improvements over the previous SOTA methods in the field. Our code is available at https://github.com/yyliu01/PS-MT.

Authors (6)
  1. Yuyuan Liu (26 papers)
  2. Yu Tian (250 papers)
  3. Yuanhong Chen (30 papers)
  4. Fengbei Liu (24 papers)
  5. Vasileios Belagiannis (58 papers)
  6. Gustavo Carneiro (129 papers)
Citations (160)

Summary

Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation

The paper "Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation" presents an advanced technique for semi-supervised learning in the domain of semantic segmentation. The research addresses the inherent challenges associated with leveraging unlabelled data, primarily focusing on mitigating the negative impact of incorrect predictions during training. This is especially pertinent in consistency learning methods where prediction errors can lead to confirmation bias or degrade the quality of learning signals.

Methodology

The authors propose several novel extensions to the existing Mean Teacher (MT) framework to enhance prediction accuracy. Key innovations include:

  • Auxiliary Teacher and Conf-CE Loss: An auxiliary teacher is introduced, and the conventional MT mean square error (MSE) loss is replaced with a stricter confidence-weighted cross-entropy (Conf-CE) loss. This adjustment aims to improve convergence and accuracy by down-weighting likely prediction mistakes.
  • Perturbation Strategy: The enhanced MT framework combines network, feature, and input-image perturbations to bolster generalisation. The feature perturbations include a new adversarial perturbation, improving robustness to distribution shifts.
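The two core mechanics above — a confidence-weighted pseudo-label loss and teachers updated as an exponential moving average (EMA) of the student — can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the confidence threshold, epsilon values, and function names are assumptions.

```python
import numpy as np

def softmax(logits, axis):
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conf_ce_loss(student_logits, teacher_logits, threshold=0.5):
    """Confidence-weighted cross-entropy on teacher pseudo-labels.

    Logits have shape (batch, classes, H, W). Each pixel's CE term is
    weighted by the teacher's confidence, and pixels below the
    confidence threshold are masked out entirely.
    """
    t_probs = softmax(teacher_logits, axis=1)
    conf = t_probs.max(axis=1)                       # (B, H, W) confidence
    pseudo = t_probs.argmax(axis=1)                  # (B, H, W) pseudo-labels
    s_log_probs = np.log(softmax(student_logits, axis=1) + 1e-12)
    # Student's negative log-likelihood of the teacher's pseudo-label, per pixel
    ce = -np.take_along_axis(s_log_probs, pseudo[:, None], axis=1).squeeze(1)
    weight = np.where(conf > threshold, conf, 0.0)   # drop uncertain pixels
    return (weight * ce).sum() / max(weight.sum(), 1e-12)

def ema_update(teacher_w, student_w, alpha=0.99):
    """Mean-teacher update: teacher weights track an EMA of the student's."""
    return alpha * teacher_w + (1 - alpha) * student_w
```

Masking low-confidence pixels is what keeps the "strict" CE loss from amplifying teacher mistakes: unreliable pseudo-labels contribute nothing, while confident ones contribute in proportion to the teacher's certainty.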

Experimental Results

Empirical evidence indicates that the proposed model outperforms existing state-of-the-art methods on established benchmarks such as Pascal VOC 2012 and Cityscapes, with improvements of up to 5% in some settings. These gains underline the efficacy of the auxiliary teacher and Conf-CE loss in extracting a more reliable training signal from unlabelled data.

Implications and Future Directions

Practically, the research provides a robust framework for deploying semi-supervised semantic segmentation solutions where labelled data is scarce. Theoretically, it advances the understanding of consistency learning in semi-supervised settings. Future research may extend the perturbation strategies further and address the potential for the stricter Conf-CE loss to overfit noisy pseudo-labels. Improving the robustness of these methods on high-resolution imagery and more heterogeneous datasets would also be beneficial.

In summary, this paper makes insightful contributions to the field of semi-supervised semantic segmentation by introducing highly effective techniques for improving the prediction accuracy and robustness of consistency learning methodologies.