A Survey on Deep Transfer Learning
The paper "A Survey on Deep Transfer Learning" by Chuanqi Tan et al. offers a comprehensive review of the advancements in deep transfer learning and its various applications. Deep learning, while highly effective, is notoriously dependent on vast amounts of labeled data. This dependency introduces significant challenges in domains where data collection is complex and expensive. Transfer learning provides a promising solution by allowing models to leverage knowledge from related domains with more ample data, thus circumventing the need for extensive labeled data in the target domain.
Definition and Categories of Deep Transfer Learning
The authors define deep transfer learning and categorize it into four primary techniques:
- Instances-based Deep Transfer Learning: This technique assigns weights to instances from the source domain to integrate them into the target domain's training dataset. This approach assumes that some instances in the source domain are sufficiently similar to those in the target domain and can be used effectively with appropriate weighting.
- Mapping-based Deep Transfer Learning: This method maps instances from both source and target domains into a new data space where similarities are enhanced. This unified data space facilitates the application of a single deep neural network to both domains.
- Network-based Deep Transfer Learning: Here, parts of a network pre-trained on the source domain are reused in the target domain, exploiting the observation that the initial layers of a network act as general feature extractors that transfer well across domains.
- Adversarial-based Deep Transfer Learning: This approach uses adversarial training techniques to extract features that are effective for both domains. The adversarial network attempts to distinguish between source and target domain features, and the main network is trained to generate indistinguishable features to improve transferability.
Key Contributions and Notable Techniques
Instances-based Techniques
TrAdaBoost and its variants, such as ExpBoost.R2 and TrAdaBoost.R2, are prominent examples. At each boosting iteration, TrAdaBoost down-weights source instances that conflict with the target data while up-weighting hard target instances, so that the ensemble's error on the target domain shrinks over rounds. For regression tasks, the adaptations of TrAdaBoost have shown efficacy, indicating the versatility of instance-based methods.
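To make the weighting scheme concrete, below is a minimal NumPy/scikit-learn sketch of the TrAdaBoost weight update; the weak learner (a decision stump), variable names, and iteration count are illustrative choices, not prescribed by the survey.

```python
# Minimal TrAdaBoost sketch for binary labels in {0, 1}; the weak learner
# and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(Xs, ys, Xt, yt, n_iters=10):
    """Reweight source instances so they help, rather than hurt, the target task."""
    n, m = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(n + m) / (n + m)               # uniform initial weights
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / n_iters))
    learners, betas_t = [], []
    for _ in range(n_iters):
        p = w / w.sum()
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=p)
        err = np.abs(h.predict(X) - y)         # 0/1 loss per instance
        # The error that drives the update is measured on the target portion only.
        eps = np.sum(p[n:] * err[n:]) / p[n:].sum()
        eps = min(max(eps, 1e-10), 0.499)      # keep the update well-defined
        beta_t = eps / (1.0 - eps)
        # Down-weight misclassified source instances, up-weight hard target ones.
        w[:n] *= beta ** err[:n]
        w[n:] *= beta_t ** -err[n:]
        learners.append(h)
        betas_t.append(beta_t)
    # The full algorithm aggregates the later-round learners for prediction.
    return learners, betas_t
```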
Mapping-based Techniques
One notable advancement is the extension of Transfer Component Analysis (TCA) to deep learning. Techniques such as those proposed by Tzeng et al. (2014) and Long et al. (2015, 2016) use Maximum Mean Discrepancy (MMD) and its variants as a training objective that measures, and is minimized to reduce, the distribution difference between domains. Conceptually, features from both domains are embedded in a Reproducing Kernel Hilbert Space (RKHS), and the network is optimized to shrink the distance between the source and target mean embeddings.
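As an illustration of the objective these methods minimize, here is a minimal PyTorch sketch of a (biased) RBF-kernel MMD² estimate between source and target feature batches; the kernel bandwidth and the loss weighting are illustrative assumptions, not values from the cited papers.

```python
# Minimal sketch of a biased RBF-kernel MMD^2 estimate between source and
# target feature batches; the bandwidth sigma is an illustrative choice.
import torch

def mmd2_rbf(fs, ft, sigma=1.0):
    """MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)] in an RKHS."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2            # pairwise squared distances
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(fs, fs).mean() + k(ft, ft).mean() - 2 * k(fs, ft).mean()

# Training would then minimize: task_loss + lambda_mmd * mmd2_rbf(fs, ft),
# where fs and ft are features of source and target batches from a shared network.
```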
Network-based Techniques
Network reuse has proven effective in multiple domains, from image recognition to language processing. Work by Huang et al. (2013) and Oquab et al. (2014) demonstrates how pre-trained layers can be fine-tuned for new tasks, significantly cutting training time and improving model efficacy. Yosinski et al. (2014) further showed that features in early layers are largely general while deeper layers grow increasingly task-specific, so the choice of which layers to transfer, and how much to fine-tune them, strongly affects transfer performance.
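A minimal PyTorch/torchvision sketch of this reuse pattern follows: a pre-trained network's layers are frozen as a feature extractor and only a new task-specific head is trained. The backbone, class count, and hyperparameters are placeholders, not choices made in the survey.

```python
# Minimal fine-tuning sketch: reuse pre-trained layers as a feature extractor
# and retrain only the classification head. Class count and learning rate are
# illustrative placeholders.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():                     # freeze the transferred layers
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 10)   # new task-specific head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
# Per batch: loss = criterion(model(x), y); loss.backward(); optimizer.step()
```

Unfreezing some of the later backbone layers with a smaller learning rate is a common middle ground when the target domain differs more from the source.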
Adversarial-based Techniques
Inspired by Generative Adversarial Networks (GANs), adversarial domain adaptation has introduced robust mechanisms for transfer. Techniques proposed by Ajakan et al. (2014) and Ganin et al. (2014) train a domain discriminator against the feature extractor, leveraging adversarial training to minimize domain discrepancies and showcasing the potential of adversarial-based methods in achieving high transferability.
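The sketch below illustrates the core trick behind domain-adversarial training in the spirit of Ganin et al.'s DANN: a gradient reversal layer lets a single backward pass train the domain discriminator to separate domains while pushing the feature extractor toward domain-invariant features. All layer sizes and the weighting factor lam are illustrative assumptions.

```python
# Minimal sketch of domain-adversarial training via a gradient reversal layer;
# network sizes and the weighting lam are illustrative.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None   # flip gradients flowing to the feature net

feature_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
label_head = nn.Linear(64, 10)         # task classifier (uses source labels)
domain_head = nn.Linear(64, 2)         # tries to tell source from target apart

def losses(xs, ys, xt, lam=1.0):
    ce = nn.CrossEntropyLoss()
    fs, ft = feature_net(xs), feature_net(xt)
    task_loss = ce(label_head(fs), ys)
    f_all = torch.cat([fs, ft])
    d_lab = torch.cat([torch.zeros(len(fs)), torch.ones(len(ft))]).long()
    dom_loss = ce(domain_head(GradReverse.apply(f_all, lam)), d_lab)
    # Minimized jointly: the reversal makes features domain-indistinguishable
    # even as the domain head learns to discriminate.
    return task_loss + dom_loss
```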
Implications and Future Directions
The survey underlines that most current research in the field has focused on supervised learning. However, the potential for deep transfer learning in unsupervised and semi-supervised settings is vast and currently underexplored. Future research could focus on mitigating negative transfer, where transferred knowledge actually degrades target performance, and on developing robust measures of transferability. Furthermore, interdisciplinary collaborations involving physicists, neuroscientists, and computer scientists could yield deeper insights into the underlying mechanisms of transfer learning, enhancing its theoretical and practical robustness.
In conclusion, deep transfer learning, with its defined categories and notable methodologies, has shown the potential to revolutionize how we tackle data-scarcity challenges across numerous domains. As the field evolves, we anticipate seeing wider applications and more innovative techniques emerging, which will solidify deep transfer learning as a cornerstone in the landscape of artificial intelligence.