
Unsupervised Intra-domain Adaptation for Semantic Segmentation through Self-Supervision (2004.07703v4)

Published 16 Apr 2020 in cs.CV, cs.LG, and cs.RO

Abstract: Convolutional neural network-based approaches have achieved remarkable progress in semantic segmentation. However, these approaches heavily rely on annotated data which are labor intensive. To cope with this limitation, automatically annotated data generated from graphic engines are used to train segmentation models. However, the models trained from synthetic data are difficult to transfer to real images. To tackle this issue, previous works have considered directly adapting models from the source data to the unlabeled target data (to reduce the inter-domain gap). Nonetheless, these techniques do not consider the large distribution gap among the target data itself (intra-domain gap). In this work, we propose a two-step self-supervised domain adaptation approach to minimize the inter-domain and intra-domain gap together. First, we conduct the inter-domain adaptation of the model; from this adaptation, we separate the target domain into an easy and hard split using an entropy-based ranking function. Finally, to decrease the intra-domain gap, we propose to employ a self-supervised adaptation technique from the easy to the hard split. Experimental results on numerous benchmark datasets highlight the effectiveness of our method against existing state-of-the-art approaches. The source code is available at https://github.com/feipan664/IntraDA.git.

Authors (5)
  1. Fei Pan (31 papers)
  2. Inkyu Shin (19 papers)
  3. Francois Rameau (23 papers)
  4. Seokju Lee (20 papers)
  5. In So Kweon (156 papers)
Citations (351)

Summary

Unsupervised Intra-domain Adaptation for Semantic Segmentation through Self-Supervision

The paper "Unsupervised Intra-domain Adaptation for Semantic Segmentation through Self-Supervision" addresses a significant challenge in semantic segmentation, a core task in computer vision whose objective is to assign a semantic label to every pixel in an image. Convolutional neural network (CNN)-based segmentation models have achieved strong results, but they depend heavily on large annotated datasets, which are costly and labor-intensive to obtain. This work proposes a two-step self-supervised approach to domain adaptation specifically for semantic segmentation.

Overview

The core proposition of this paper is to minimize both inter-domain and intra-domain discrepancies through a self-supervised domain adaptation approach. The inter-domain gap pertains to differences between synthetic training data and real-world images, while the intra-domain gap concerns variability within the target domain itself. The authors introduce a methodology involving both inter-domain adaptation and a novel intra-domain adaptation mechanism.

  1. Inter-domain Adaptation: This step is based on prevalent unsupervised domain adaptation (UDA) techniques aimed at reducing the distribution shift between the source (synthetic) and target (real-world) domains. It employs entropy maps derived from pixel-wise predictions to assess the alignment between domain features.
  2. Entropy-based Ranking and Intra-domain Adaptation: Leveraging an entropy-based ranking mechanism, the target domain is segmented into easy and hard subdomains based on confidence levels of predictions. This partitioning allows the model to perform intra-domain adaptation by using pseudo labels generated from the easy split to improve predictions on the hard split.
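The entropy-based ranking step above can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the paper's actual implementation (which is in the linked repository); the split ratio `lam` is a tunable hyperparameter, here set arbitrarily.

```python
import numpy as np

def entropy_map(probs, eps=1e-12):
    """Per-pixel Shannon entropy of softmax predictions.

    probs: array of shape (C, H, W), class probabilities summing to 1 over C.
    Low entropy means the model is confident at that pixel."""
    return -np.sum(probs * np.log(probs + eps), axis=0)

def split_easy_hard(prob_maps, lam=0.67):
    """Rank target images by mean prediction entropy and split them.

    lam is the fraction of images assigned to the easy (low-entropy)
    subdomain; the rest form the hard subdomain."""
    scores = [entropy_map(p).mean() for p in prob_maps]
    order = np.argsort(scores)            # ascending: most confident first
    n_easy = int(lam * len(prob_maps))
    return order[:n_easy], order[n_easy:]  # easy indices, hard indices
```

In the paper, images in the easy split then supply pseudo labels for adapting the model toward the hard split.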

Methodology

The proposed model comprises a generator-discriminator pair for both inter-domain and intra-domain adaptations. The methodology integrates:

  • Entropy Minimization: Entropy maps are used to rank the target images, which enables the separation of the target domain into easy and hard subdomains. A hyperparameter, lambda, determines the proportion of the data assigned to the easy subdomain.
  • Self-Supervised Learning: The segmentation predictions on the easy subdomain are treated as pseudo labels. The intra-domain adaptation leverages these pseudo labels to reduce the gap within the target domain by adapting the model from the easy subdomain to the hard one.

Experimental Results

The experimental evaluation of the proposed approach was conducted on benchmark datasets such as GTA5, SYNTHIA, and Synscapes as source domains, with Cityscapes as the target domain. The method outperformed several state-of-the-art models, including AdvEnt and AdaptSegNet, in terms of mean Intersection over Union (mIoU). Notably, enhancements were particularly marked in scenarios where image complexity introduced significant intra-domain variability.
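Mean Intersection over Union, the metric reported above, can be computed from a class confusion matrix; the sketch below is a standard formulation and is not tied to the paper's evaluation code.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """mIoU from flat prediction and ground-truth label arrays.

    IoU per class = intersection / union; mIoU averages over the
    classes that appear in either the prediction or the ground truth."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, g in zip(pred.ravel(), gt.ravel()):
        conf[g, p] += 1
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0                     # skip classes absent from both
    return (inter[valid] / union[valid]).mean()
```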

In addition to semantic segmentation, the method was adapted for digit classification across domain shifts in datasets like MNIST, USPS, and SVHN, achieving superior performance compared to notable existing techniques, such as CyCADA.

Implications and Future Directions

The proposed intra-domain adaptation via self-supervision introduces a paradigm shift in handling domain discrepancies in semantic segmentation. The methodology not only bridges the traditional inter-domain gap but also addresses the heretofore neglected intra-domain variances. This advance implies potential for improving model robustness in varied real-world applications, such as autonomous driving and robotic vision.

Future research could explore more sophisticated ranking and separation mechanisms, leveraging advanced clustering methods or confidence metrics beyond entropy. Another promising avenue is the extension of this framework to other vision tasks such as object detection, where domain shifts are also prevalent. Additionally, integrating advanced generative models to simulate subdomain variability could further enhance model performance.

The presented work is a substantial addition to the domain adaptation literature, providing a basis for further investigations into domain-specific adaptation techniques that recognize the heterogeneity within target domains.