
Moment Matching for Multi-Source Domain Adaptation (1812.01754v4)

Published 4 Dec 2018 in cs.CV

Abstract: Conventional unsupervised domain adaptation (UDA) assumes that training data are sampled from a single domain. This neglects the more practical scenario where training data are collected from multiple sources, requiring multi-source domain adaptation. We make three major contributions towards addressing this problem. First, we collect and annotate by far the largest UDA dataset, called DomainNet, which contains six domains and about 0.6 million images distributed among 345 categories, addressing the gap in data availability for multi-source UDA research. Second, we propose a new deep learning approach, Moment Matching for Multi-Source Domain Adaptation (M3SDA), which aims to transfer knowledge learned from multiple labeled source domains to an unlabeled target domain by dynamically aligning moments of their feature distributions. Third, we provide new theoretical insights specifically for moment matching approaches in both single and multiple source domain adaptation. Extensive experiments are conducted to demonstrate the power of our new dataset in benchmarking state-of-the-art multi-source domain adaptation methods, as well as the advantage of our proposed model. Dataset and Code are available at http://ai.bu.edu/M3SDA/.

Authors (6)
  1. Xingchao Peng (15 papers)
  2. Qinxun Bai (15 papers)
  3. Xide Xia (13 papers)
  4. Zijun Huang (3 papers)
  5. Kate Saenko (178 papers)
  6. Bo Wang (823 papers)
Citations (1,606)

Summary

Moment Matching for Multi-Source Domain Adaptation

The paper "Moment Matching for Multi-Source Domain Adaptation" addresses a key challenge in unsupervised domain adaptation (UDA) by moving beyond the conventional single-source assumption and focusing on scenarios where training data are collected from multiple distinct domains. This practical shift acknowledges the complexity of real-world applications, where labeled images often come from diverse sources, such as different lighting conditions, visual cues, or modalities. The authors present three significant contributions to tackle this problem.

Firstly, the paper introduces DomainNet, the largest UDA dataset to date, containing approximately 0.6 million images across 345 categories from six distinct domains. This dataset provides a much-needed resource to mitigate the limitations of existing small-scale UDA datasets, which often lead to performance saturation in current models.

Secondly, the authors propose a novel approach named Moment Matching for Multi-Source Domain Adaptation (M3SDA), designed to transfer knowledge from multiple labeled source domains to an unlabeled target domain by aligning the moments of their feature distributions dynamically. This approach contrasts with previous methods that typically focused solely on aligning feature distributions between a single source and a target domain.
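
To make the setup concrete, here is a minimal PyTorch sketch of how such a multi-source objective could be wired together, assuming a shared feature extractor and one classifier per labeled source domain. The class and parameter names are illustrative, the `moment_distance` helper is sketched after the equation below, and details of the authors' full model (such as the paired classifiers of M3SDA-β and the weighted combination of source predictions at test time) are omitted.

```python
# Minimal sketch of a multi-source objective in the spirit of M3SDA (assumption:
# shared feature extractor G plus one classifier per labeled source domain).
# `moment_distance` is sketched after the equation below; names are illustrative.
import torch.nn as nn
import torch.nn.functional as F

class M3SDASketch(nn.Module):
    def __init__(self, feature_extractor: nn.Module, num_sources: int,
                 feat_dim: int, num_classes: int):
        super().__init__()
        self.G = feature_extractor                          # shared backbone
        self.classifiers = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_sources)]
        )

    def forward(self, source_batches, source_labels, target_batch, md_weight=1.0):
        # Supervised classification loss on each labeled source domain.
        source_feats = [self.G(x) for x in source_batches]
        cls_loss = sum(
            F.cross_entropy(C(f), y)
            for C, f, y in zip(self.classifiers, source_feats, source_labels)
        ) / len(source_batches)

        # Moment-matching loss pulls all sources and the unlabeled target together.
        target_feats = self.G(target_batch)
        md_loss = moment_distance(source_feats, target_feats)

        return cls_loss + md_weight * md_loss
```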

The third contribution lies in providing new theoretical insights into moment matching approaches for both single and multiple source domain adaptation. This theoretical framework underpins the empirical success of the M3SDA model and offers a deeper understanding of why moment matching is effective in aligning feature distributions across multiple domains.

Theoretical Insights

The proposed moment matching approach operates by minimizing the Moment Distance (MD) between the source and target domains, defined as:

$$
MD^2(D_S, D_T) = \sum_{k=1}^{2}\Bigg( \frac{1}{N}\sum_{i=1}^{N} \big\| E(D_i^k) - E(D_T^k) \big\|_2 + \binom{N}{2}^{-1} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \big\| E(D_i^k) - E(D_j^k) \big\|_2 \Bigg)
$$

Here, N is the number of source domains, E(D_i^k) denotes the k-th order moment of domain i's feature distribution, and the model aligns the first two orders of moments (k = 1, 2) both between each source and the target and between every pair of sources.
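
A minimal PyTorch sketch of this moment distance follows, assuming E(D^k) is taken as the per-dimension k-th order moment of a batch of features; tensor shapes and the function name are illustrative rather than the authors' implementation.

```python
# Sketch of the moment distance above, assuming E(D^k) is the per-dimension
# k-th order moment of a batch of features; shapes and names are illustrative.
import itertools
import torch

def moment_distance(source_feats, target_feats):
    """source_feats: list of (B_i, d) feature tensors, one per source domain;
    target_feats: (B_T, d) feature tensor from the unlabeled target."""
    def moments(f):
        # First- and second-order moments, E[f] and E[f^2], per feature dimension.
        return [f.mean(dim=0), (f ** 2).mean(dim=0)]

    src = [moments(f) for f in source_feats]
    tgt = moments(target_feats)
    N = len(source_feats)
    pairs = list(itertools.combinations(range(N), 2))

    dist = 0.0
    for k in range(2):  # k = 1, 2 in the equation (0-indexed here)
        # Source-target term: average distance from each source to the target.
        dist = dist + sum(torch.norm(s[k] - tgt[k], p=2) for s in src) / N
        # Source-source term: average pairwise distance between sources.
        if pairs:
            dist = dist + sum(torch.norm(src[i][k] - src[j][k], p=2)
                              for i, j in pairs) / len(pairs)
    return dist
```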

The paper also extends the theoretical groundwork for moment-based UDA by introducing a bound dependent on cross-moment divergence between source and target domains. This bound offers a rigorous motivation for the proposed method and aligns well with empirical observations, demonstrating the necessity of aligning sources with each other to ensure robust domain adaptation.
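
For orientation only, the schematic below shows the general shape such multi-source bounds take; it is not the paper's exact theorem. The divergence d(·,·) stands in for the role played by the cross-moment divergence, the α_i are source weights, and λ is the usual joint-error constant.

```latex
% Schematic shape of a multi-source adaptation bound, NOT the paper's exact
% theorem: weighted source risks, per-source divergence terms (here the role
% played by the cross-moment divergence), and a joint-error constant lambda.
\epsilon_T(h) \;\le\; \sum_{i=1}^{N} \alpha_i\, \epsilon_i(h)
  \;+\; \sum_{i=1}^{N} \alpha_i\, d\!\left(\mathcal{D}_i, \mathcal{D}_T\right)
  \;+\; \lambda,
\qquad \alpha_i \ge 0,\quad \sum_{i=1}^{N} \alpha_i = 1.
```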

Dataset Creation and Analysis

To construct DomainNet, the authors collected and manually annotated images from six different domains, ensuring a broad variety of categories and visual styles. The dataset's scale and diversity make it a valuable benchmark for evaluating multi-source adaptation methods. Notably, the dataset covers domains such as Clipart, Infograph, Painting, Quickdraw, Real, and Sketch, each contributing unique characteristics that reflect the complexity of real-world data.
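
As a practical aside, per-domain loaders can be assembled in a few lines, assuming the common root/<domain>/<category>/<image> layout after unpacking the released archives; the paths, image size, and transforms below are illustrative choices, not prescribed by the paper.

```python
# Sketch of per-domain DomainNet loaders, assuming the layout
# root/<domain>/<category>/<image> after unpacking the released archives;
# paths, image size, and transforms are illustrative choices.
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

DOMAINS = ["clipart", "infograph", "painting", "quickdraw", "real", "sketch"]

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def domain_loaders(root, batch_size=32):
    """Return one DataLoader per domain, keyed by domain name."""
    return {
        d: DataLoader(datasets.ImageFolder(f"{root}/{d}", transform=transform),
                      batch_size=batch_size, shuffle=True)
        for d in DOMAINS
    }
```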

Empirical Evaluation

The paper conducts extensive experiments, benchmarking several state-of-the-art UDA methods on DomainNet and other datasets. The key findings indicate that:

  1. Conventional single-source UDA methods perform suboptimally in multi-source scenarios.
  2. The M3SDA model outperforms existing methods, including domain adversarial and discrepancy-based approaches, in multi-source settings.
  3. Aligning moments across both source-source and source-target pairs is crucial for improving model performance, as demonstrated by ablation studies.

Practical and Theoretical Implications

The contributions of this paper have significant practical and theoretical implications. Practically, the introduction of DomainNet provides a robust benchmark for future research in multi-source domain adaptation. The M3SDA model, with its dynamic moment alignment, sets a new bar for performance, applicable to various AI systems that must handle data from multiple sources.

Theoretically, the insights into moment matching extend beyond single-source UDA, encouraging the community to explore divergence-based measures in multi-source contexts. This theoretical framework can guide the development of more sophisticated models that explicitly account for interactions between multiple domains.

Future Directions

Given the promising results, future research could focus on further refining moment matching techniques and exploring higher-order moments for more nuanced domain alignment. Extending the theoretical analysis to include other divergence metrics and their empirical validation could also yield new insights. Additionally, leveraging advanced neural architectures to enhance the feature extractor component of M3SDA may further improve its adaptability and accuracy.
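
As one illustration of that direction, the moment term could be extended from orders k = 1, 2 to an arbitrary order K. The sketch below uses central moments for orders above one; this is a design choice of the sketch, not something proposed in the paper.

```python
# Sketch of one possible extension to higher-order alignment: match moments up
# to order K instead of K = 2. Using central moments for orders above one is a
# design choice of this sketch, not something proposed in the paper.
import itertools
import torch

def higher_order_moment_distance(source_feats, target_feats, K=3):
    def moments(f, K):
        mean = f.mean(dim=0)
        out = [mean]                                   # first-order moment
        for k in range(2, K + 1):
            out.append(((f - mean) ** k).mean(dim=0))  # k-th central moment
        return out

    src = [moments(f, K) for f in source_feats]
    tgt = moments(target_feats, K)
    N = len(source_feats)
    pairs = list(itertools.combinations(range(N), 2))

    dist = 0.0
    for k in range(K):
        dist = dist + sum(torch.norm(s[k] - tgt[k], p=2) for s in src) / N
        if pairs:
            dist = dist + sum(torch.norm(src[i][k] - src[j][k], p=2)
                              for i, j in pairs) / len(pairs)
    return dist
```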

In summary, the paper presents a comprehensive approach to multi-source domain adaptation through dynamic moment matching, supported by both theoretical analysis and extensive empirical validation. The authors' contributions ensure a significant step forward in the field of UDA, providing a robust foundation for future research and practical applications in AI.