Cross-Domain Transformers for Unsupervised Domain Adaptation
The paper "CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation" introduces an innovative approach leveraging Transformer models for the task of Unsupervised Domain Adaptation (UDA). UDA is pivotal in machine learning for transferring knowledge from a labeled source domain to a different but related unlabeled target domain. A predominant challenge in UDA is aligning domain-invariant features amid domain shifts without adequate labeled data in the target domain.
Traditional UDA methods have relied heavily on convolutional neural networks (CNNs) to learn domain-invariant representations at either the domain level or the category level. A significant bottleneck for category-level UDA is the noise in the pseudo labels generated for the unlabeled target domain, which leads to inaccurate alignment and degrades overall performance.
The authors of the paper propose leveraging the robust cross-attention capabilities of Transformer models to address these challenges. The core contribution is CDTrans, a weight-sharing triple-branch transformer framework. This framework employs both self-attention and cross-attention modules to facilitate source/target feature learning and source-target domain alignment.
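To make the weight-sharing triple-branch design concrete, the following PyTorch-style sketch shows a single shared transformer block serving a source self-attention branch, a target self-attention branch, and a source-target cross-attention branch. It is an illustrative reconstruction rather than the authors' implementation; the class name, dimensions, and the simple pre-norm layout are assumptions.

```python
import torch.nn as nn


class CrossDomainBlock(nn.Module):
    """One transformer block whose weights are shared by all three branches.

    The same attention and MLP weights process (a) source tokens with
    self-attention, (b) target tokens with self-attention, and (c) a
    source-target pair with cross-attention, where queries come from the
    source tokens and keys/values from the target tokens.
    """

    def __init__(self, dim=768, num_heads=12):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def _attend(self, q, kv):
        # Pre-norm attention with a residual on the query stream, then an MLP.
        h, _ = self.attn(self.norm1(q), self.norm1(kv), self.norm1(kv))
        x = q + h
        return x + self.mlp(self.norm2(x))

    def forward(self, src_tokens, tgt_tokens):
        src_out = self._attend(src_tokens, src_tokens)    # source branch (self-attention)
        tgt_out = self._attend(tgt_tokens, tgt_tokens)    # target branch (self-attention)
        cross_out = self._attend(src_tokens, tgt_tokens)  # cross branch (source queries, target keys/values)
        return src_out, tgt_out, cross_out
```

In the paper, the outputs of the cross-attention branch are additionally used to guide the target branch during training; that supervision logic is omitted from this sketch for brevity.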
Key Contributions and Methodology
- Transformer-Based Alignment: CDTrans is among the first attempts to employ a pure transformer framework for UDA. The model exploits the robustness of cross-attention in transformers, which naturally aligns features from disparate domains despite noisy input pairs.
- Two-Way Center-Aware Pseudo Labeling: A novel labeling algorithm generates pseudo labels for the target domain while mitigating label noise. Class centers are estimated from the features and each target sample is assigned the label of its nearest center; a cross-domain similarity matrix then matches source and target samples in both directions, and a center-aware filtering step discards pairs whose source label and target pseudo label disagree, improving the quality of the labels used for training (a minimal sketch follows this list).
- Weight-Sharing Triple-Branch Framework: CDTrans integrates three branches where the source and target branches utilize self-attention for domain-specific learning, and a third branch utilizes cross-attention for source-target alignment. This design explicitly fosters concurrent learning of domain-specific and domain-invariant features.
- Experimental Validation: The framework demonstrates superior performance on several public UDA benchmarks, including VisDA-2017 and DomainNet, outperforming existing state-of-the-art methods by a substantial margin. The experiments also highlight the efficacy of the proposed two-way center-aware pseudo labeling and the robustness of cross-attention to mislabeled pairs.
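The pseudo-labeling step can be illustrated with a short Python sketch. It is a simplified reconstruction of the idea described above, not the authors' code: the function names, the use of cosine similarity, and the single center-refinement pass are assumptions, and it presumes every class appears at least once among the source samples.

```python
import torch
import torch.nn.functional as F


def center_aware_pseudo_labels(src_feats, src_labels, tgt_feats, num_classes):
    """Assign pseudo labels to target features via class centers.

    Centers are initialized from labeled source features, each target sample
    takes the label of its nearest center, and the centers are then
    re-estimated from the target samples themselves (one refinement pass).
    """
    src_feats = F.normalize(src_feats, dim=1)
    tgt_feats = F.normalize(tgt_feats, dim=1)

    # Initial class centers from the labeled source domain.
    centers = torch.stack([src_feats[src_labels == c].mean(0) for c in range(num_classes)])
    pseudo = (tgt_feats @ F.normalize(centers, dim=1).t()).argmax(1)

    # Refine: recompute centers from the target samples and relabel once.
    for c in range(num_classes):
        mask = pseudo == c
        if mask.any():
            centers[c] = tgt_feats[mask].mean(0)
    pseudo = (tgt_feats @ F.normalize(centers, dim=1).t()).argmax(1)
    return pseudo


def build_training_pairs(src_feats, src_labels, tgt_feats, pseudo):
    """Two-way matching with center-aware filtering of noisy pairs.

    Each source sample is paired with its most similar target sample and vice
    versa; a pair is kept only if the source label agrees with the target
    sample's center-based pseudo label.
    """
    sim = F.normalize(src_feats, dim=1) @ F.normalize(tgt_feats, dim=1).t()
    pairs = set()
    for i, j in enumerate(sim.argmax(1).tolist()):   # best target for each source sample
        pairs.add((i, j))
    for j, i in enumerate(sim.argmax(0).tolist()):   # best source for each target sample
        pairs.add((i, j))
    # Center-aware filtering: discard label-inconsistent pairs.
    return [(i, j) for (i, j) in pairs if src_labels[i].item() == pseudo[j].item()]
```

In a training loop, the surviving pairs would feed the cross-attention branch sketched earlier, with the pseudo labels providing supervision on the target side.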
Implications and Future Directions
The introduction of transformers into UDA tasks signals a paradigm shift from CNN-dominated approaches, suggesting potentially richer modeling capabilities. The superior performance of CDTrans on benchmark datasets illustrates the viability of transformers in this domain. This approach provides a new direction for continued investigation into the fusion of transformer architectures with UDA and the optimization of cross-domain learning via robust feature alignment.
Looking forward, the framework paves the way for further work on improving pseudo-label generation and on combining transformers with other neural network architectures to strengthen domain adaptation. Future work could also extend the methodology to more complex, multi-modal domain adaptation tasks, using the multi-head attention mechanism inherent in transformers to handle diverse data types concurrently.
In conclusion, the CDTrans framework demonstrates the potential of transformers for unsupervised domain adaptation, offering a promising avenue for bridging domain discrepancies and improving the generalization of machine learning models to unlabeled target domains.