Deeper, Broader and Artier Domain Generalization (1710.03077v1)

Published 9 Oct 2017 in cs.CV

Abstract: The problem of domain generalization is to learn from multiple training domains, and extract a domain-agnostic model that can then be applied to an unseen domain. Domain generalization (DG) has a clear motivation in contexts where there are target domains with distinct characteristics, yet sparse data for training. For example recognition in sketch images, which are distinctly more abstract and rarer than photos. Nevertheless, DG methods have primarily been evaluated on photo-only benchmarks focusing on alleviating the dataset bias where both problems of domain distinctiveness and data sparsity can be minimal. We argue that these benchmarks are overly straightforward, and show that simple deep learning baselines perform surprisingly well on them. In this paper, we make two main contributions: Firstly, we build upon the favorable domain shift-robust properties of deep learning methods, and develop a low-rank parameterized CNN model for end-to-end DG learning. Secondly, we develop a DG benchmark dataset covering photo, sketch, cartoon and painting domains. This is both more practically relevant, and harder (bigger domain shift) than existing benchmarks. The results show that our method outperforms existing DG alternatives, and our dataset provides a more significant DG challenge to drive future research.

Deeper, Broader, and Artier Domain Generalization

The paper "Deeper, Broader, and Artier Domain Generalization" addresses the persistent challenge of domain generalization (DG), which aims to develop models capable of performing well on unseen domains by leveraging knowledge from multiple training domains. This problem is particularly pronounced in fields like sketch recognition, where training data may be sparse and inherently different from more conventional domains such as photo imagery.

Contributions and Methodology

The authors make two principal contributions to the domain generalization literature. First, they present a low-rank parameterized convolutional neural network (CNN) model for end-to-end DG learning. Second, they introduce a new benchmark dataset, PACS, which covers the Photo (P), Art painting (A), Cartoon (C), and Sketch (S) domains. PACS spans a larger domain shift than existing photo-only benchmarks, which the authors argue makes it both harder and more practically relevant.

Low-Rank Parameterized CNN Model:

The proposed CNN uses a low-rank parameterization that limits the growth in the number of parameters and so mitigates overfitting. Rather than storing a separate set of weights for every domain, each layer's parameters are generated by a function of a binary vector that encodes the domain. This design balances domain-specific customization against domain-agnostic generalization, yielding both features and classifiers that can be applied to an unseen domain.

A Tucker decomposition of each layer's weight tensor controls parameter complexity. Because the factors are learned end to end, the model discovers on its own how much sharing between domains each CNN layer needs. The method builds on the robustness to domain shift that deep learning methods already exhibit.
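The idea of generating a layer's weights from a domain code through a Tucker-factorized tensor can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a fully connected layer instead of a convolutional one, the function names (`make_tucker_factors`, `domain_code`, `generate_weights`), the ranks, and the random initialization are all hypothetical, and the specific encoding (a one-hot domain indicator plus an always-on shared entry) is an assumed reading of the "binary encoding of domains" described above.

```python
import numpy as np

def make_tucker_factors(d_in, d_out, n_domains, ranks, seed=0):
    """Create a core tensor and one factor matrix per mode (assumed shapes)."""
    rng = np.random.default_rng(seed)
    r_in, r_out, r_dom = ranks
    core = rng.standard_normal((r_in, r_out, r_dom)) * 0.1
    u_in = rng.standard_normal((d_in, r_in)) * 0.1
    u_out = rng.standard_normal((d_out, r_out)) * 0.1
    # One row per training domain, plus one shared (domain-agnostic) row.
    u_dom = rng.standard_normal((n_domains + 1, r_dom)) * 0.1
    return core, u_in, u_out, u_dom

def domain_code(domain_index, n_domains):
    """Binary domain vector; pass None for an unseen domain (assumption)."""
    z = np.zeros(n_domains + 1)
    if domain_index is not None:
        z[domain_index] = 1.0
    z[-1] = 1.0  # shared entry is always on
    return z

def generate_weights(factors, z):
    """Contract the Tucker factors with the domain code to get a (d_in, d_out) matrix."""
    core, u_in, u_out, u_dom = factors
    dom_mix = z @ u_dom  # shape (r_dom,)
    return np.einsum("abc,ia,jb,c->ij", core, u_in, u_out, dom_mix)

def parameter_counts(factors):
    """Return (low-rank parameter count, size of the equivalent full tensor)."""
    core, u_in, u_out, u_dom = factors
    low_rank = core.size + u_in.size + u_out.size + u_dom.size
    full = u_in.shape[0] * u_out.shape[0] * u_dom.shape[0]
    return low_rank, full

factors = make_tucker_factors(d_in=8, d_out=4, n_domains=3, ranks=(3, 2, 2))
w_seen = generate_weights(factors, domain_code(0, 3))       # a training domain
w_unseen = generate_weights(factors, domain_code(None, 3))  # shared part only
print(w_seen.shape, w_unseen.shape, parameter_counts(factors))
```

In this sketch the ranks set how much capacity each mode has; a small domain rank forces the domains to share most of their weights, while a larger one allows more domain-specific variation.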

Experimental Results

The effectiveness of the proposed model is validated on the PACS dataset and the established VLCS dataset. The findings can be summarized as follows:

PACS Benchmark:

The PACS dataset presents the same object categories, such as dogs, elephants, and houses, as photos, art paintings, cartoons, and sketches, which introduces varied domain shifts. The reported results show the proposed model outperforming both the baselines and the DG methods it is compared against. For instance, the low-rank parameterized CNN achieves higher average accuracy on the DG tasks than a straightforward fine-tuning baseline.

VLCS Benchmark:

On the VLCS dataset, which comprises the Pascal VOC 2007 (V), LabelMe (L), Caltech (C), and SUN09 (S) domains, the proposed model also shows favorable results. It consistently surpasses earlier methods such as Undo-Bias, uDICA, UML, LRE-SVM, and MTAE+1HNN in multi-class classification accuracy.
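The setting described in the abstract, training on several domains and testing on an unseen one, is commonly run as a leave-one-domain-out protocol. The sketch below enumerates such splits for the four PACS domains and averages per-domain accuracies. It is an illustration only: the helper names and domain strings are hypothetical, no accuracy values from the paper are reproduced, and the summary above does not state that this exact protocol was used.

```python
def leave_one_domain_out(domains):
    """Return (training_domains, held_out_domain) pairs, one per domain."""
    splits = []
    for held_out in domains:
        train = [d for d in domains if d != held_out]
        splits.append((train, held_out))
    return splits

def average_accuracy(per_domain_accuracy):
    """Mean of the held-out accuracies, keyed by held-out domain name."""
    return sum(per_domain_accuracy.values()) / len(per_domain_accuracy)

# Domain names are illustrative labels for the four PACS domains.
PACS_DOMAINS = ["photo", "art_painting", "cartoon", "sketch"]

for train_domains, test_domain in leave_one_domain_out(PACS_DOMAINS):
    print(f"train on {train_domains}, test on {test_domain}")
```

Each model is trained without ever seeing the held-out domain, so the average over the four splits measures generalization rather than adaptation.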

Implications and Future Directions

The performance of the low-rank parameterized CNN on diverse and challenging DG tasks is promising for the field. Its ability to generalize to highly abstract visual domains could matter most in applications where collecting extensive domain-specific data is impractical.

The PACS dataset itself is a valuable contribution, setting a new standard for DG benchmarks. Its broader applicability and more substantial domain shifts encourage the development of more sophisticated DG methodologies.

Future research could expand PACS with more categories and additional domains to further test the limits of DG models. Combining this approach with advances in meta-learning and semi-supervised learning might also yield models that handle unseen domains more robustly.

Conclusion

The paper advances domain generalization through a new CNN parameterization technique and a more demanding benchmark dataset. Together they address limitations of previous studies, in particular the reliance on photo-only benchmarks with small domain shift, and give future work a harder target to measure against.

Authors (4)

Da Li (96 papers)
Yongxin Yang (73 papers)
Yi-Zhe Song (120 papers)
Timothy M. Hospedales (69 papers)

Citations (1,308)
