
Domain Generalization Using a Mixture of Multiple Latent Domains (1911.07661v1)

Published 18 Nov 2019 in cs.CV and cs.LG

Abstract: When domains, which represent underlying data distributions, vary during training and testing processes, deep neural networks suffer a drop in their performance. Domain generalization allows improvements in the generalization performance for unseen target domains by using multiple source domains. Conventional methods assume that the domain to which each sample belongs is known in training. However, many datasets, such as those collected via web crawling, contain a mixture of multiple latent domains, in which the domain of each sample is unknown. This paper introduces domain generalization using a mixture of multiple latent domains as a novel and more realistic scenario, where we try to train a domain-generalized model without using domain labels. To address this scenario, we propose a method that iteratively divides samples into latent domains via clustering, and which trains the domain-invariant feature extractor shared among the divided latent domains via adversarial learning. We assume that the latent domain of images is reflected in their style, and thus, utilize style features for clustering. By using these features, our proposed method successfully discovers latent domains and achieves domain generalization even if the domain labels are not given. Experiments show that our proposed method can train a domain-generalized model without using domain labels. Moreover, it outperforms conventional domain generalization methods, including those that utilize domain labels.

Authors (2)
  1. Toshihiko Matsuura (2 papers)
  2. Tatsuya Harada (142 papers)
Citations (301)

Summary

  • The paper introduces a novel approach that clusters samples via style features to partition latent domains, addressing real-world domain shifts.
  • It employs adversarial learning to develop a domain-invariant feature extractor without relying on manual domain annotations.
  • Experimental results on PACS and VLCS benchmarks show improved adaptability and reduced annotation costs compared to conventional methods.

Domain Generalization Using a Mixture of Multiple Latent Domains

In computer vision, domain generalization has garnered significant interest, largely because deep neural networks (DNNs) are susceptible to domain shift. Domain shift occurs when there is a discrepancy between the training (source) and testing (target) data distributions, which is commonplace in real-world applications such as autonomous driving, where varying environmental conditions can drastically degrade model performance.

The paper by Toshihiko Matsuura and Tatsuya Harada presents a novel approach to domain generalization by introducing the concept of handling a mixture of multiple latent domains without the need for explicit domain labels. This paper moves away from the conventional assumption that the domain of each sample in a training dataset is known. Instead, it acknowledges the realistic scenario of datasets consisting of mixed latent domains, for which domain labels are unavailable, as is typical with datasets sourced through web crawling.

The authors propose a methodology that iteratively partitions samples into latent domains via clustering and trains a domain-invariant feature extractor using adversarial learning. The method clusters on style features, based on the assumption that domain characteristics are often encapsulated in image style. These domain-discriminative style features are computed from convolutional feature statistics, specifically the channel-wise mean and standard deviation, and are used to iteratively assign pseudo domain labels, enabling training of the feature extractor without manual domain annotations.
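The style-feature step can be illustrated with a minimal NumPy sketch: compute channel-wise mean and standard deviation over the spatial dimensions of convolutional activations, then cluster those vectors to obtain pseudo domain labels. This is a simplified stand-in, not the paper's exact pipeline (the paper applies standard k-means to statistics stacked from multiple lower convolutional layers); all function names here are illustrative.

```python
import numpy as np

def style_features(fmaps):
    """Channel-wise statistics of conv activations as a style vector.

    fmaps: (N, C, H, W) array of feature maps.
    Returns an (N, 2*C) array of [mean, std] per channel.
    """
    mu = fmaps.mean(axis=(2, 3))
    sigma = fmaps.std(axis=(2, 3))
    return np.concatenate([mu, sigma], axis=1)

def kmeans_pseudo_labels(feats, k, iters=50, seed=0):
    """Minimal k-means assigning each sample a pseudo domain label in [0, k)."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)].astype(float)
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(iters):
        # squared Euclidean distance of every sample to every center
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(axis=0)
    return labels
```

In the paper's iterative scheme, these pseudo labels are refreshed periodically as the feature extractor (and hence the style statistics) evolves during training.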

Experimental evaluation on the PACS and VLCS benchmark datasets demonstrates the efficacy of the proposed method. Notably, it outperforms conventional domain generalization techniques that rely on domain labels, underscoring the potential of adversarial learning combined with pseudo-labeling through unsupervised clustering.
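The adversarial component is commonly realized with a gradient reversal layer: a domain discriminator is trained on the pseudo domain labels, while the reversed gradient pushes the feature extractor toward domain-invariant representations. The sketch below shows one way to implement gradient reversal in PyTorch; it is a generic construction (in the spirit of Ganin and Lempitsky's DANN), not a verbatim reproduction of the authors' code.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lam in the backward pass.

    Placing this between the feature extractor and the domain discriminator
    makes the extractor *maximize* the discriminator's loss, encouraging
    features that do not reveal the (pseudo) domain.
    """

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse and scale the gradient flowing into the feature extractor;
        # lam itself receives no gradient.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```

During training, the discriminator's classification loss is computed on `grad_reverse(features)`, so a single backward pass updates the discriminator normally while adversarially updating the extractor.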

Key experimental insights include the method's robustness to the choice of the number of pseudo-domain clusters: results remain consistent even when the number of pseudo domains does not match the number of original domains. Improvements over baseline methods hold not only on PACS, with its stylistically distinct domains, but also on the more challenging VLCS benchmark, where domain variation exists within photographic images.

From a theoretical perspective, the paper suggests that focusing on image styles as proxies for domain-defining features could be a generalizable strategy across various data types and tasks. Practically, the ability to train models in the absence of explicit domain labels offers a significant reduction in the resource-intensive process of annotating domain information, thus broadening the applicability of domain generalization methods.

Looking forward, this research opens avenues in exploring how other latent factors beyond style could contribute to domain definition and how these can be efficiently exploited in training more robust models. Additionally, further advancements could include integrating this approach within broader AI systems that need to adapt quickly to changing inputs without predefined guidelines, such as those encountered in autonomous systems.

Overall, this paper contributes an insightful perspective on domain generalization, framing it within a realistic and resource-efficient context that promises heightened applicability for DNNs in dynamic, real-world environments.