Unsupervised Domain Adaptation in Semantic Segmentation: a Review (2005.10876v1)

Published 21 May 2020 in cs.CV, cs.LG, and eess.IV

Abstract: The aim of this paper is to give an overview of the recent advancements in the Unsupervised Domain Adaptation (UDA) of deep networks for semantic segmentation. This task is attracting a wide interest, since semantic segmentation models require a huge amount of labeled data and the lack of data fitting specific requirements is the main limitation in the deployment of these techniques. This problem has been recently explored and has rapidly grown with a large number of ad-hoc approaches. This motivates us to build a comprehensive overview of the proposed methodologies and to provide a clear categorization. In this paper, we start by introducing the problem, its formulation and the various scenarios that can be considered. Then, we introduce the different levels at which adaptation strategies may be applied: namely, at the input (image) level, at the internal features representation and at the output level. Furthermore, we present a detailed overview of the literature in the field, dividing previous methods based on the following (non mutually exclusive) categories: adversarial learning, generative-based, analysis of the classifier discrepancies, self-teaching, entropy minimization, curriculum learning and multi-task learning. Novel research directions are also briefly introduced to give a hint of interesting open problems in the field. Finally, a comparison of the performance of the various methods in the widely used autonomous driving scenario is presented.

Authors

Marco Toldo
Andrea Maracani
Umberto Michieli
Pietro Zanuttigh

Summary

The paper "Unsupervised Domain Adaptation in Semantic Segmentation: a Review" provides a comprehensive analysis of advancements in Unsupervised Domain Adaptation (UDA) for semantic segmentation. The authors, Marco Toldo, Andrea Maracani, Umberto Michieli, and Pietro Zanuttigh, focus on the increasing interest in UDA due to challenges associated with acquiring vast amounts of labeled data necessary for training robust semantic segmentation models. The paper aims to categorize, summarize, and evaluate the methodologies developed to address these challenges.

The overview begins by defining semantic segmentation, a task requiring pixel-level labeling of images, which poses a significant challenge due to the labor-intensive nature of generating extensive annotated datasets. UDA for semantic segmentation is presented as a means to leverage unlabeled data from a target domain to adapt models trained on a labeled source domain, despite statistical discrepancies between domains, a problem known as domain shift.

The authors categorize adaptation strategies by where in the network the adaptation is applied: at the input level, at the feature level, at the output level, and at network-specific locations. This framework clarifies where and how each strategy acts on the network's processing of the input images.

Input-Level Adaptation: Techniques in this category often employ generative models to translate source images into the style of the target domain while preserving their semantic content. Examples include CycleGAN-based methods adapted for semantic segmentation by enforcing geometric or semantic consistency constraints.
Feature-Level Adaptation: These methods align the latent feature representations of the source and target domains. Adversarial training is commonly used here: a domain discriminator pushes the network toward domain-invariant features while the segmentation loss preserves task-specific information.
Output-Level Adaptation: These methods align the distributions of the segmentation outputs, typically using an adversarial network so that predictions on source and target images become indistinguishable.
Network-Specific Space Adaptation: Methods in this category place adaptation modules at selected stages of the network rather than at a single fixed level.
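The adversarial alignment mentioned for the feature and output levels can be illustrated with the two competing losses involved. The following is a minimal sketch, not the formulation of any specific method from the paper: it assumes a discriminator that outputs the probability that its input came from the source domain, and the function names and example values are hypothetical.

```python
import math


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for a single probability and a 0/1 label."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(d_source, d_target):
    """Discriminator objective: predict 1 on source inputs, 0 on target inputs."""
    losses = [bce(p, 1) for p in d_source] + [bce(p, 0) for p in d_target]
    return sum(losses) / len(losses)


def adversarial_loss(d_target):
    """Segmentation-network objective: make target inputs look like source (label 1)."""
    return sum(bce(p, 1) for p in d_target) / len(d_target)


# Hypothetical discriminator outputs, P(input came from the source domain).
d_source = [0.9, 0.8]
d_target = [0.2, 0.1]
print("discriminator loss:", discriminator_loss(d_source, d_target))
print("adversarial loss:  ", adversarial_loss(d_target))
```

In a full training loop the two losses are minimized in alternation: the discriminator learns to separate the domains, while the segmentation network is updated so that its target-domain features or outputs fool the discriminator.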

The paper then reviews state-of-the-art methods, grouping them into (non mutually exclusive) families such as adversarial learning, generative approaches, classifier discrepancy analysis, self-training, entropy minimization, curriculum learning, and multi-task learning. For each family, it describes the structure and objectives of the methods and how they exploit unlabeled target data.
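Two of the techniques named above, entropy minimization and self-training, both operate on the network's predictions for unlabeled target images. The sketch below is an illustration under stated assumptions rather than any method from the paper: predictions are represented as a flat list of per-pixel class probabilities, the confidence threshold of 0.9 is an arbitrary choice, and the ignore value 255 is assumed as a marker for pixels left without a pseudo-label.

```python
import math

IGNORE = 255  # assumed marker for pixels that receive no pseudo-label


def softmax(logits):
    """Convert per-class scores for one pixel into probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_entropy(probs, eps=1e-12):
    """Shannon entropy of one pixel's class distribution."""
    return -sum(p * math.log(p + eps) for p in probs)


def entropy_loss(pred):
    """Entropy minimization: mean per-pixel entropy on target predictions."""
    return sum(pixel_entropy(p) for p in pred) / len(pred)


def pseudo_labels(pred, threshold=0.9):
    """Self-training: keep the argmax class only where the model is confident."""
    labels = []
    for probs in pred:
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else IGNORE)
    return labels


# Hypothetical logits for three target pixels and three classes.
pred = [softmax(z) for z in ([6.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.0, 5.0, 0.0])]
print("entropy loss:", entropy_loss(pred))
print("pseudo-labels:", pseudo_labels(pred))
```

Minimizing the entropy term pushes target predictions toward confident decisions, while the pseudo-labels can be used as training targets for the confident pixels only.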

The paper also compares the performance of the reviewed methods in the widely used autonomous driving scenario. For example, for adaptation from the synthetic GTA5 dataset to the real-world Cityscapes dataset, results are reported as mean Intersection over Union (mIoU), a common metric for semantic segmentation accuracy. These comparisons indicate the progress made by different UDA techniques.
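For readers unfamiliar with the metric, mIoU averages, over classes, the ratio between the intersection and the union of predicted and ground-truth pixels. The following is a minimal sketch on flat lists of labels; the function name, the ignore value 255, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions made for illustration, and benchmark implementations may differ in such details.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean Intersection over Union for flat lists of predicted and true labels."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # missed pixel of the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    # Average only over classes that occur in prediction or ground truth.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Hypothetical four-pixel example with two classes.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print("mIoU:", mean_iou(pred, target, num_classes=2))
```

In this example class 0 has an IoU of 1/2 and class 1 an IoU of 2/3, so the mean is 7/12.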

Practically, the research discussed ties UDA methods to real-world applications like autonomous driving, where rapid and cost-effective model adaptation to new driving environments is invaluable. Theoretically, the paper lays the groundwork for future exploration in category-specific adaptation and the handling of open-set scenarios where unseen classes may exist in target domains.

The authors suggest potential avenues for future research, emphasizing adaptations to scenarios where source domain classes differ from those in the target domain, and propose addressing continual learning challenges in semantic segmentation, where models must adapt incrementally over time.

In conclusion, this paper serves as an essential resource for understanding the current landscape of UDA in semantic segmentation. By systematically categorizing existing work and providing insights into their applicability and performance, it lays a foundation for further innovation and development in adaptive semantic understanding systems.
