
Feature-Critic Networks for Heterogeneous Domain Generalization (1901.11448v3)

Published 31 Jan 2019 in cs.LG and stat.ML

Abstract: The well known domain shift issue causes model performance to degrade when deployed to a new target domain with different statistics to training. Domain adaptation techniques alleviate this, but need some instances from the target domain to drive adaptation. Domain generalisation is the recently topical problem of learning a model that generalises to unseen domains out of the box, and various approaches aim to train a domain-invariant feature extractor, typically by adding some manually designed losses. In this work, we propose a learning to learn approach, where the auxiliary loss that helps generalisation is itself learned. Beyond conventional domain generalisation, we consider a more challenging setting of heterogeneous domain generalisation, where the unseen domains do not share label space with the seen ones, and the goal is to train a feature representation that is useful off-the-shelf for novel data and novel categories. Experimental evaluation demonstrates that our method outperforms state-of-the-art solutions in both settings.

Authors (4)
  1. Yiying Li (12 papers)
  2. Yongxin Yang (73 papers)
  3. Wei Zhou (311 papers)
  4. Timothy M. Hospedales (69 papers)
Citations (245)

Summary

Evaluation of "Feature-Critic Networks for Heterogeneous Domain Generalization"

The paper "Feature-Critic Networks for Heterogeneous Domain Generalization" addresses the significant challenge of domain shift in unsupervised domain adaptation (UDA) and domain generalisation (DG). While traditional methods focus on adapting models to a specific target domain using unlabeled data, or assume consistent label spaces across domains, this work extends the problem to heterogeneous domain generalisation, where unseen target domains possess disjoint label spaces.

Core Contributions and Methodology

The authors propose a meta-learning framework that leverages a feature-critic network to improve the robustness of feature extractors in the face of domain shift. This novel approach is designed to create domain-invariant feature representations that are applicable "out-of-the-box" to unseen domains without fine-tuning. The paper extends conventional DG practices to address more complex scenarios where the target domain not only differs in data distribution but also in label categories.

  1. Meta-Learning Strategy:
    • The framework utilizes episodic training, inspired by techniques in few-shot learning, to simulate domain shift between source (meta-train) and virtual testing (meta-test) domains. This design encourages models trained on multiple source domains to generalise to novel, unseen domains.
  2. Feature-Critic Network:
    • The feature-critic network critiques the robustness of features concerning simulated domain shifts, providing an auxiliary loss that complements the standard classification loss. The goal is to train a feature extractor that can be applied across domains with varying data statistics and label spaces.
  3. Evaluation on Benchmarks:
    • This approach is evaluated on standard DG benchmarks, such as Rotated MNIST and PACS, as well as the more challenging Visual Decathlon (VD) benchmark, which offers a comprehensive and diverse set of domains.
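The episodic scheme above can be sketched in a few dozen lines. The following NumPy toy is an illustrative simplification, not the paper's implementation: the feature extractor is linear, the supervised loss is MSE (the paper uses cross-entropy on deep features), the critic is a linear score passed through softplus, and all gradients are taken by finite differences to keep the sketch dependency-free. The key structure is the paper's: each episode splits source domains into meta-train and meta-test, the critic's learned auxiliary loss is rewarded (via a tanh of the loss difference) when the extractor it updates beats a plain update on the held-out domain, and the extractor is then trained with the supervised loss plus the learned auxiliary loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two "source domains" with shifted input statistics, shared task.
def make_domain(shift, n=64, d=4):
    X = rng.normal(shift, 1.0, size=(n, d))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(float)
    Y = np.stack([1 - y, y], axis=1)  # one-hot labels
    return X, Y

domains = [make_domain(0.0), make_domain(1.5)]

k = 3                                # feature dimension
W = rng.normal(0, 0.1, size=(4, k))  # feature extractor (linear stand-in)
V = rng.normal(0, 0.1, size=(k, 2))  # classifier head
w_c = rng.normal(0, 0.1, size=k)     # feature-critic parameters

def task_loss(W, X, Y):
    return np.mean((X @ W @ V - Y) ** 2)   # supervised loss (CE in the paper)

def critic_loss(W, X):
    s = np.mean(X @ W @ w_c)
    return np.log1p(np.exp(s))             # learned auxiliary loss, >= 0

def num_grad(f, p, eps=1e-5):
    """Finite-difference gradient of scalar f with respect to array p."""
    g = np.zeros_like(p)
    for i in np.ndindex(p.shape):
        old = p[i]
        p[i] = old + eps; hi = f(p)
        p[i] = old - eps; lo = f(p)
        p[i] = old
        g[i] = (hi - lo) / (2.0 * eps)
    return g

lr, meta_lr = 0.05, 0.05
for step in range(50):
    # Episode: one source domain is meta-train, the other meta-test.
    (Xtr, Ytr), (Xte, Yte) = domains[step % 2], domains[(step + 1) % 2]

    # Plain inner update (no auxiliary loss), used as the baseline.
    W_plain = W - lr * num_grad(lambda p: task_loss(p, Xtr, Ytr), W.copy())

    # Critic update: the meta-objective prefers critics whose auxiliary
    # loss yields an update that beats the plain one on the meta-test domain.
    def meta_obj(wc):
        g = num_grad(lambda p: task_loss(p, Xtr, Ytr)
                     + np.log1p(np.exp(np.mean(Xtr @ p @ wc))), W.copy())
        W_aux = W - lr * g
        return np.tanh(task_loss(W_aux, Xte, Yte) - task_loss(W_plain, Xte, Yte))

    w_c = w_c - meta_lr * num_grad(meta_obj, w_c.copy())

    # Base-model update with the (current) learned auxiliary loss.
    W = W - lr * num_grad(lambda p: task_loss(p, Xtr, Ytr)
                          + critic_loss(p, Xtr), W.copy())
    V = V - lr * num_grad(lambda p: np.mean((Xtr @ W @ p - Ytr) ** 2), V.copy())
```

In the paper the inner update and the critic's meta-gradient are taken with automatic differentiation through the updated parameters; finite differences here just make the two nested optimisation levels explicit.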

Results and Implications

The paper's experimental results demonstrate competitive, if not superior, performance compared to state-of-the-art methods across both heterogeneous and homogeneous DG scenarios.

  • Visual Decathlon: The method shows substantial improvements over naive baseline methods (e.g., ImageNet pre-trained features) and other DG methods like CrossGrad and MetaReg. In particular, it delivers strong results across both SVM and KNN classifiers, highlighting its adaptability to novel target domains, especially under varying target data availability.
  • Rotated MNIST and PACS: The feature-critic approach also proves its effectiveness in more conventional DG settings by outperforming established techniques in these benchmarks. This suggests that the meta-learned feature extractor provides robust features that generalize well beyond the training domains.
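In the heterogeneous evaluations above, the meta-learned extractor is frozen and a simple classifier (SVM or KNN in the paper) is fit on the novel-category target data. A minimal sketch of the KNN variant, using a 1-nearest-neighbour rule in plain NumPy: the linear `extract` function and the synthetic two-cluster data are placeholders for the learned network and a Visual Decathlon target domain, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def extract(X, W):
    """Frozen feature extractor: stands in for the meta-learned network."""
    return X @ W

def one_nn_predict(train_Z, train_y, test_Z):
    # 1-nearest-neighbour in feature space (squared Euclidean distance).
    d2 = ((test_Z[:, None, :] - train_Z[None, :, :]) ** 2).sum(axis=-1)
    return train_y[d2.argmin(axis=1)]

# Synthetic "novel domain, novel categories": two well-separated clusters.
W = rng.normal(size=(5, 3))  # pretend these are the meta-learned weights
X_train = np.vstack([rng.normal(0, 0.3, (20, 5)), rng.normal(3, 0.3, (20, 5))])
y_train = np.array([0] * 20 + [1] * 20)
X_test = np.vstack([rng.normal(0, 0.3, (10, 5)), rng.normal(3, 0.3, (10, 5))])
y_test = np.array([0] * 10 + [1] * 10)

# No fine-tuning of W: only the cheap classifier sees target data.
pred = one_nn_predict(extract(X_train, W), y_train, extract(X_test, W))
accuracy = (pred == y_test).mean()
```

The point of the protocol is that the extractor never sees the target categories; only the lightweight classifier on top is fit per target domain, which is why feature robustness dominates the results.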

Future Directions

The implications of this research are significant for areas requiring robust feature extraction without domain-specific tuning, such as automated classification systems in dynamic environments or limited data scenarios. Future research could explore extending the feature-critic concept to reinforcement learning (RL) and other areas requiring domain adaptation.

Additionally, while this paper focuses on image-based tasks, these strategies could plausibly be adapted to other areas where domain shift is prevalent, such as natural language processing or cross-modal tasks.

Overall, the paper contributes to the growing body of work in domain generalisation by offering a scalable solution to heterogeneous label spaces and domain shifts, charting a path for further advancements in model robustness across variable environments.