- The paper introduces MADAN, which leverages adversarial training and dynamic image generation to significantly reduce domain gaps in semantic segmentation.
- It aggregates diverse source domains using sub-domain aggregation and cross-domain cycle discriminators to ensure consistent feature alignment.
- MADAN achieves up to a 15.6% increase in mIoU on synthetic-to-real tasks, underscoring its potential impact on autonomous driving applications.
Multi-source Domain Adaptation for Semantic Segmentation
The paper "Multi-source Domain Adaptation for Semantic Segmentation" addresses domain shift in semantic segmentation when labeled data is scarce in the target domain. The issue is particularly relevant in scenarios like autonomous driving, where models trained on synthetic data are routinely deployed in real-world environments. While traditional domain adaptation (DA) focuses on single-source scenarios, this work expands the scope to multi-source domain adaptation (MDA) by introducing the Multi-source Adversarial Domain Aggregation Network (MADAN), a framework that leverages multiple labeled source domains to improve adaptation to an unlabeled target domain.
Framework Overview
MADAN's architecture comprises three main components: Dynamic Adversarial Image Generation (DAIG), Adversarial Domain Aggregation (ADA), and Feature-aligned Semantic Segmentation (FSS). These components work together to align the distributions of the multiple source domains with that of the target domain. Notably, the framework combines pixel-level and feature-level adversarial adaptation with novel aggregation techniques that handle the discrepancies between the different source domains.
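Both the pixel-level and feature-level alignment steps rest on adversarial objectives. The sketch below is a rough illustration rather than the paper's exact formulation: it shows the least-squares GAN loss pair commonly used in CycleGAN-style translation, and `lsgan_d_loss` / `lsgan_g_loss` are illustrative names, not identifiers from the paper's code.

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push scores on real images
    toward 1 and scores on generated (translated) images toward 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Least-squares generator loss: push discriminator scores on
    generated images toward 1, i.e. try to fool the discriminator."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

# Toy discriminator scores on a batch of real target images and
# translated source images (e.g. sigmoid outputs of a PatchGAN head).
d_real = np.array([0.9, 0.8, 0.95])
d_fake = np.array([0.1, 0.2, 0.05])

print(lsgan_d_loss(d_real, d_fake))  # small: discriminator separates the two
print(lsgan_g_loss(d_fake))          # larger: generator is not yet fooling it
```

The same adversarial pattern applies at the feature level, with the discriminator reading segmentation features instead of pixels.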
- Dynamic Adversarial Image Generation (DAIG): This component uses Generative Adversarial Networks (GANs) to translate images from each source domain toward the target domain. A dynamic semantic consistency (DSC) loss ensures that semantic content is preserved during translation; unlike the fixed, pretrained consistency networks typical of earlier single-source pipelines, the segmentation network that supplies this consistency signal is updated dynamically as training progresses.
- Adversarial Domain Aggregation (ADA): ADA mitigates misalignment between the adapted source domains by employing sub-domain aggregation discriminators and cross-domain cycle discriminators. These discriminators pull the adapted images from the different sources into a single unified domain, reducing residual domain shifts and making the subsequent feature alignment more consistent.
- Feature-aligned Semantic Segmentation (FSS): After obtaining a unified domain through ADA, FSS trains the segmentation network with an additional feature-level alignment between the aggregated domain and the target domain. This alignment ensures that the learned feature representations are robust to domain variations, thus improving the model's generalization capabilities.
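The semantic-consistency idea behind DAIG can be illustrated with a toy calculation: a segmenter's per-pixel class distribution on a source image should match its distribution on the translated version of that image. The sketch below scores that agreement with a per-pixel KL divergence; the function name and the choice of divergence are illustrative assumptions, and in the paper the consistency signal comes from a dynamically updated segmentation network rather than the fixed logits used here.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def semantic_consistency_loss(logits_src, logits_adapted, eps=1e-8):
    """Mean per-pixel KL(p_src || p_adapted): the segmenter should predict
    the same classes before and after image translation.

    logits_* have shape (H, W, num_classes)."""
    p = softmax(logits_src)
    q = softmax(logits_adapted)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

# A 1x2 "image" with 2 classes: identical predictions give zero loss,
# diverging predictions are penalized.
logits = np.array([[[2.0, 0.0], [0.0, 2.0]]])
print(semantic_consistency_loss(logits, logits))                  # 0.0
print(semantic_consistency_loss(logits, np.zeros_like(logits)))   # > 0
```

Minimizing this term during image translation discourages the generator from repainting, say, road pixels into vegetation while chasing target-domain style.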
Experimental Results
The efficacy of MADAN is demonstrated through comprehensive experiments on synthetic-to-real adaptation, specifically from GTA and SYNTHIA to the Cityscapes and BDDS datasets. MADAN consistently outperforms single-source and source-combined DA baselines, improving mIoU by up to 15.6% over single-source adaptation and demonstrating the value of aggregating multiple source domains.
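mIoU, the metric quoted above, averages per-class intersection-over-union between predicted and ground-truth label maps. A minimal sketch (using one common convention: classes absent from both prediction and ground truth are skipped):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are integer label maps of the same shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                     # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([0, 0, 1, 1])
target = np.array([0, 1, 1, 1])
# class 0: 1/2, class 1: 2/3 -> mean = 7/12
print(mean_iou(pred, target, num_classes=2))
```

Benchmark implementations (e.g. the Cityscapes evaluation scripts) accumulate a confusion matrix over the whole dataset before computing per-class IoU, but the per-class formula is the same.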
Implications and Future Directions
The introduction of MADAN marks a substantial step forward in the field of unsupervised domain adaptation for semantic segmentation. By addressing the limitations inherent in single-source approaches and proposing a multi-faceted adaptation strategy, MADAN lays the groundwork for future advancements in MDA. This can catalyze further research into leveraging MDA for other challenging tasks in computer vision and beyond, potentially extending into multi-modal scenarios where data from varying sensor modalities could be integrated.
Future developments might explore optimized architectures for real-time applications, particularly in computationally constrained environments such as autonomous vehicles. Additionally, improving the diversity within synthetic datasets and examining the interplay between different types of domain shifts remains an open area for exploration, with the potential to yield even greater transferability and robustness in practical, real-world scenarios.