Understanding the Role of Invariance in Transfer Learning (2407.04325v1)

Published 5 Jul 2024 in cs.LG

Abstract: Transfer learning is a powerful technique for knowledge-sharing between different tasks. Recent work has found that the representations of models with certain invariances, such as to adversarial input perturbations, achieve higher performance on downstream tasks. These findings suggest that invariance may be an important property in the context of transfer learning. However, the relationship between invariance and transfer performance is not yet fully understood, and a number of questions remain. For instance, how important is invariance compared to other factors of the pretraining task? How transferable is learned invariance? In this work, we systematically investigate the importance of representational invariance for transfer learning, as well as how it interacts with other parameters during pretraining. To do so, we introduce a family of synthetic datasets that allow us to precisely control factors of variation both in training and test data. Using these datasets, we a) show that for learning representations with high transfer performance, invariance to the right transformations is as important as, or often more important than, most other factors such as the number of training samples, the model architecture, and the identity of the pretraining classes, b) show conditions under which invariance can harm the ability to transfer representations, and c) explore how transferable invariance is between tasks. The code is available at https://github.com/tillspeicher/representation-invariance-transfer.
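The central quantity in the abstract, representational invariance to a transformation, can be made concrete with a short sketch. The following is a minimal illustration, not the paper's implementation: it assumes a PyTorch feature extractor (here called `encoder`) and scores invariance as the mean cosine similarity between representations of inputs and their transformed counterparts; the paper's actual metric and experimental setup may differ.

```python
# Minimal sketch of measuring representational invariance.
# Assumptions (not from the paper): a PyTorch encoder mapping a batch of
# inputs to (N, D) feature vectors, and cosine similarity as the
# invariance measure.
import torch
import torch.nn.functional as F

def invariance_score(encoder, images, transform):
    """Mean cosine similarity between representations of `images` and
    their transformed counterparts; a score of 1.0 means the encoder is
    fully invariant to `transform`."""
    encoder.eval()
    with torch.no_grad():
        z_orig = encoder(images)              # (N, D) features of originals
        z_trans = encoder(transform(images))  # features after the transform
    return F.cosine_similarity(z_orig, z_trans, dim=-1).mean().item()

# Example usage with a hypothetical model and batch:
# flip = lambda x: torch.flip(x, dims=[-1])   # horizontal flip
# score = invariance_score(model, batch, flip)
```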

Authors (3)
  1. Till Speicher (10 papers)
  2. Vedant Nanda (16 papers)
  3. Krishna P. Gummadi (68 papers)
Citations (1)

