Learning to Diversify for Single Domain Generalization
The paper "Learning to Diversify for Single Domain Generalization" addresses the challenging problem of enhancing a model's generalization capability from a single source domain to multiple unseen target domains, a scenario known as Single Domain Generalization (Single-DG). Traditional domain generalization relies on multiple source domains to learn a robust representation that can generalize to new, unseen domains. However, the Single-DG setting allows only one available source domain, making the task considerably more challenging due to limited diversity.
Methodology Overview
The paper introduces an approach termed "Learning to Diversify" (L2D), which generates diversified training samples through a style-complement module. This module creates samples with styles that do not exist in the original source distribution, thereby enriching the training set. The generator is coupled with a min-max mutual information (MI) optimization strategy (both objectives are sketched in code after the list below):
- MI Minimization: Minimizing a tractable upper bound of the MI between generated and source samples pushes the generated samples to diverge from the source in the latent feature space.
- MI Maximization: Simultaneously, the MI between samples of the same semantic category is maximized, so the task model learns discriminative features that are invariant to style.
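The paper defines its own MI bounds; the following is a minimal PyTorch sketch using common stand-ins rather than the paper's exact estimators: a CLUB-style variational upper bound for the minimization term and a supervised InfoNCE-style lower bound for the maximization term. The class and function names, and the diagonal-Gaussian form of the variational distribution, are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClubUpperBound(nn.Module):
    """CLUB-style variational upper bound on I(z_src; z_aug).
    Illustrative stand-in for the paper's tractable MI upper bound."""
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.logvar = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, dim), nn.Tanh())

    def log_likelihood(self, z_src, z_aug):
        # log q(z_aug | z_src) under a diagonal Gaussian (up to constants)
        mu, logvar = self.mu(z_src), self.logvar(z_src)
        return (-(z_aug - mu) ** 2 / logvar.exp() - logvar).sum(dim=-1)

    def forward(self, z_src, z_aug):
        # matched pairs vs. shuffled (marginal) pairs; the gap upper-bounds the MI
        positive = self.log_likelihood(z_src, z_aug)
        perm = torch.randperm(z_aug.size(0), device=z_aug.device)
        negative = self.log_likelihood(z_src, z_aug[perm])
        return (positive - negative).mean()


def same_class_infonce(features, labels, temperature=0.1):
    """InfoNCE-style lower bound: pull features of the same class together
    (across original and stylized views) and push different classes apart."""
    z = F.normalize(features, dim=-1)
    sim = z @ z.t() / temperature
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))      # exclude self-pairs
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_per_anchor = pos_mask.sum(dim=1).clamp(min=1)
    return -(log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_per_anchor).mean()
```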
These two objectives are optimized adversarially, yielding more robust and generalizable representations; a sketch of the generator and the alternating update follows.
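To make the adversarial interplay concrete, the sketch below pairs a hypothetical AdaIN-style perturbation module (standing in for the style-complement generator) with one alternating update, reusing ClubUpperBound and same_class_infonce from the previous sketch. Names such as StylePerturbation, task_model.features, task_model.classifier, the optimizers, and the weight lam are assumptions for illustration; the actual architecture and loss weights are those defined in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StylePerturbation(nn.Module):
    """Hypothetical style-complement generator: re-normalizes per-channel
    statistics and applies a noise-conditioned scale/shift (AdaIN-style).
    A sketch of the idea, not the paper's actual architecture."""
    def __init__(self, channels, noise_dim=16):
        super().__init__()
        self.to_gamma = nn.Linear(noise_dim, channels)
        self.to_beta = nn.Linear(noise_dim, channels)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c = x.size(0), x.size(1)
        mu = x.mean(dim=(2, 3), keepdim=True)
        sigma = x.std(dim=(2, 3), keepdim=True) + 1e-6
        noise = torch.randn(b, self.to_gamma.in_features, device=x.device)
        gamma = self.to_gamma(noise).view(b, c, 1, 1)
        beta = self.to_beta(noise).view(b, c, 1, 1)
        # strip the source style, then apply a generated one
        return (1 + gamma) * (x - mu) / sigma + mu + beta


def train_step(x, y, task_model, generator, club, opt_gen, opt_task, lam=0.1):
    """One alternating min-max update (sketch; omits fitting the CLUB
    variational network, which in practice is trained on matched pairs)."""
    # (1) Generator step: keep semantics (classify correctly) while
    # minimizing the MI upper bound so the generated style diverges
    # from the source style.
    x_aug = generator(x)
    with torch.no_grad():
        z_src = task_model.features(x)
    z_aug = task_model.features(x_aug)
    gen_loss = F.cross_entropy(task_model.classifier(z_aug), y) + lam * club(z_src, z_aug)
    opt_gen.zero_grad(); gen_loss.backward(); opt_gen.step()

    # (2) Task-model step: classify both views and maximize MI between
    # same-class samples so the learned features become style-invariant.
    with torch.no_grad():
        x_aug = generator(x)
    z = task_model.features(torch.cat([x, x_aug]))
    labels = torch.cat([y, y])
    task_loss = (F.cross_entropy(task_model.classifier(z), labels)
                 + lam * same_class_infonce(z, labels))
    opt_task.zero_grad(); task_loss.backward(); opt_task.step()
```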
Numerical Results and Comparative Performance
The proposed approach was empirically validated on three benchmarks: Digits, Corrupted CIFAR-10, and PACS. On Digits, the method outperformed previous state-of-the-art Single-DG methods by up to 25.14%. On Corrupted CIFAR-10, it reported significant gains as well, particularly under noise corruptions and at higher corruption severities. On PACS, the approach performed strongly both in single domain generalization and under the standard leave-one-domain-out protocol.
Implications and Future Directions
Practically, this research has substantial implications for settings where only a single domain of data is available during training yet the deployed model must generalize across unseen environments. Theoretically, the style-complement module and the mutual information optimization strategy add a novel perspective to the domain generalization literature. Future work may refine the style-complement module to handle more complex variations in unseen domains, and potentially extend the approach to modalities beyond images, such as text or audio.
In conclusion, "Learning to Diversify for Single Domain Generalization" contributes a thoughtful approach to Single-DG by innovatively augmenting model training with diversified data samples. The strategic optimization framework employed showcases promising directions for increasing model robustness and generalization power in single domain constraints, paving the way for future exploration and application in broader contexts of AI research.