Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation (1910.13049v2)

Published 29 Oct 2019 in cs.CV

Abstract: Unsupervised domain adaptation (UDA) aims to enhance the generalization capability of a certain model from a source domain to a target domain. UDA is of particular significance since no extra effort is devoted to annotating target domain samples. However, the different data distributions in the two domains, or domain shift/discrepancy, inevitably compromise the UDA performance. Although there has been progress in matching the marginal distributions between two domains, the classifier favors the source domain features and makes incorrect predictions on the target domain due to category-agnostic feature alignment. In this paper, we propose a novel category anchor-guided (CAG) UDA model for semantic segmentation, which explicitly enforces category-aware feature alignment to learn shared discriminative features and classifiers simultaneously. First, the category-wise centroids of the source domain features are used as guided anchors to identify the active features in the target domain and also assign them pseudo-labels. Then, we leverage an anchor-based pixel-level distance loss and a discriminative loss to drive the intra-category features closer and the inter-category features further apart, respectively. Finally, we devise a stagewise training mechanism to reduce the error accumulation and adapt the proposed model progressively. Experiments on both the GTA5 → Cityscapes and SYNTHIA → Cityscapes scenarios demonstrate the superiority of our CAG-UDA model over the state-of-the-art methods. The code is available at https://github.com/RogerZhangzz/CAG_UDA.

Overview of Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

The paper "Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation" by Qiming Zhang et al. addresses semantic segmentation under domain shift: a model trained on a labeled source domain must generalize to a target domain for which no labels are available. The proposed method, Category Anchor-Guided (CAG) Unsupervised Domain Adaptation (UDA), targets the performance drop that occurs when a model is trained on one domain (source) but tested on another (target).

Methodological Contributions

The paper's central idea is to use "category anchors" to guide unsupervised domain adaptation. These anchors serve as reference points for aligning features across domains category by category, addressing the weakness of purely category-agnostic alignment, which can leave the classifier biased toward source features. The CAG-UDA method comprises three components:

Category Anchor Construction (CAC): The method systematically computes centroids of source domain features for each category to serve as anchors. These centroids act as reference points during adaptation, facilitating the identification and alignment of corresponding features in the target domain.
Active Target Sample Identification (ATI): By computing the distances between target features and the category anchors, the method identifies the active samples in the target domain, i.e. those that can be reliably associated with a category and are therefore useful for alignment. Restricting alignment to these samples reduces errors.
Pseudo-Label Assignment (PLA): For target domain samples identified as active, the method assigns pseudo-labels based on their proximity to category anchors. This procedure is decoupled from the classifier, which may initially be biased toward the source domain data, and so offers a more stable form of supervision.
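
The three components above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `category_anchors` and `assign_pseudo_labels`, the `margin` parameter, and the specific rule for deciding whether a sample is active (the nearest anchor must beat the second-nearest by at least `margin`) are hypothetical choices made for this example. Features are small lists of floats standing in for per-pixel feature vectors.

```python
import math

def category_anchors(features, labels, num_classes):
    """CAC sketch: per-category centroid of labelled source features."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, cls in zip(features, labels):
        counts[cls] += 1
        for i, value in enumerate(feat):
            sums[cls][i] += value
    # A category with no source samples gets no anchor (None).
    return [[s / n for s in row] if n else None
            for row, n in zip(sums, counts)]

def assign_pseudo_labels(target_features, anchors, margin):
    """ATI + PLA sketch: label a target feature with its nearest anchor's
    category, but only if it is 'active' under the assumed margin rule.
    Inactive features get None."""
    pseudo = []
    for feat in target_features:
        dists = sorted((math.dist(feat, a), cls)
                       for cls, a in enumerate(anchors) if a is not None)
        nearest_dist, nearest_cls = dists[0]
        second_dist = dists[1][0] if len(dists) > 1 else math.inf
        is_active = (second_dist - nearest_dist) > margin  # assumed criterion
        pseudo.append(nearest_cls if is_active else None)
    return pseudo

# Toy source domain: two categories in a 2-D feature space.
source_features = [[0, 0], [0, 2], [10, 10], [10, 12]]
source_labels = [0, 0, 1, 1]
anchors = category_anchors(source_features, source_labels, num_classes=2)

# Toy target domain: the middle sample is equidistant from both anchors.
target_features = [[1, 1], [5, 6], [9, 11]]
pseudo = assign_pseudo_labels(target_features, anchors, margin=2.0)
print(anchors)  # [[0.0, 1.0], [10.0, 11.0]]
print(pseudo)   # [0, None, 1]
```

The ambiguous middle sample is left unlabelled, which illustrates why pseudo-labels derived from anchor distances need not depend on the source-biased classifier.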

These components collectively advance domain adaptation by fostering an explicit and structured category-aligned feature learning process.

Empirical Validation and Results

The empirical results cover two benchmark adaptation scenarios, GTA5 → Cityscapes and SYNTHIA → Cityscapes. In both, CAG-UDA is reported to outperform state-of-the-art methods in mean Intersection over Union (mIoU), with notable gains on small object categories.
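
For readers unfamiliar with the metric, the following sketch shows how mIoU is commonly computed from flattened prediction and ground-truth label maps. The function name `mean_iou` and the choice to skip categories that appear in neither map are assumptions of this example; benchmark toolkits may handle absent categories and ignore labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Average per-category intersection-over-union over flattened label maps."""
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:  # skip categories absent from both maps (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")

pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
# Per-category IoU: 1/3, 2/3, 1/2, so the mean is 0.5.
score = mean_iou(pred, target, num_classes=3)
```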

Significance and Implications

The approach addresses error accumulation from incorrect pseudo-labels and category imbalance by combining anchor-based loss functions with a stagewise training mechanism. These developments may inspire broader applications in autonomous driving, video surveillance, and other settings where accurate pixel-wise segmentation is required under domain shift.
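
To make the role of the two losses concrete, here is a hedged sketch of one possible formulation. The abstract states only that a distance loss pulls intra-category features closer and a discriminative loss pushes inter-category features apart; the exact forms below (a mean Euclidean distance to the own-category anchor, and a hinge penalty with a hypothetical `margin` on distances to other categories' anchors) are assumptions for illustration and may differ from the paper's definitions.

```python
import math

def distance_loss(features, labels, anchors):
    """Pull term: mean distance from each labelled feature to its own anchor.
    Features whose label is None (inactive samples) are ignored."""
    pairs = [(f, anchors[c]) for f, c in zip(features, labels) if c is not None]
    return sum(math.dist(f, a) for f, a in pairs) / len(pairs)

def discriminative_loss(features, labels, anchors, margin):
    """Push term: hinge penalty when a feature lies closer than `margin`
    to the anchor of a different category (assumed form)."""
    terms = []
    for feat, cls in zip(features, labels):
        if cls is None:
            continue
        for other, anchor in enumerate(anchors):
            if other != cls:
                terms.append(max(0.0, margin - math.dist(feat, anchor)))
    return sum(terms) / len(terms)

anchors = [[0.0, 0.0], [3.0, 4.0]]
features = [[0.0, 0.0], [3.0, 0.0], [9.0, 9.0]]
labels = [0, 1, None]  # the last sample is inactive and contributes nothing

pull = distance_loss(features, labels, anchors)                     # (0 + 4) / 2
push = discriminative_loss(features, labels, anchors, margin=4.0)   # (0 + 1) / 2
```

Minimizing the pull term tightens each category's cluster around its anchor, while the push term is zero once every feature sits at least `margin` away from all other anchors.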

By embedding category awareness through anchors, the model narrows the gap between the category-wise feature distributions of the source and target domains, a step toward more robust and less error-prone adaptation strategies in semantic segmentation.

Future Directions

The proposed model depends on a warm-up strategy to obtain reliable pseudo-labels, which points to room for improvement, in particular an end-to-end category-aligned adaptation that does not require pre-training. Integration with techniques such as style transfer could also be explored to strengthen category-based feature alignment and pseudo-label reliability.

This work contributes to domain adaptation research by demonstrating mechanisms for category-level feature alignment, setting the stage for further theoretical development and practical applications in semantic segmentation.

Authors (4)

Qiming Zhang (31 papers)
Jing Zhang (730 papers)
Wei Liu (1135 papers)
Dacheng Tao (826 papers)

Citations (272)
