Adaptive Consistency Regularization for Semi-Supervised Transfer Learning (2103.02193v2)

Published 3 Mar 2021 in cs.CV and cs.LG

Abstract: While recent studies on semi-supervised learning have shown remarkable progress in leveraging both labeled and unlabeled data, most of them presume a basic setting of the model is randomly initialized. In this work, we consider semi-supervised learning and transfer learning jointly, leading to a more practical and competitive paradigm that can utilize both powerful pre-trained models from source domain as well as labeled/unlabeled data in the target domain. To better exploit the value of both pre-trained weights and unlabeled target examples, we introduce adaptive consistency regularization that consists of two complementary components: Adaptive Knowledge Consistency (AKC) on the examples between the source and target model, and Adaptive Representation Consistency (ARC) on the target model between labeled and unlabeled examples. Examples involved in the consistency regularization are adaptively selected according to their potential contributions to the target task. We conduct extensive experiments on popular benchmarks including CIFAR-10, CUB-200, and MURA, by fine-tuning the ImageNet pre-trained ResNet-50 model. Results show that our proposed adaptive consistency regularization outperforms state-of-the-art semi-supervised learning techniques such as Pseudo Label, Mean Teacher, and FixMatch. Moreover, our algorithm is orthogonal to existing methods and thus able to gain additional improvements on top of MixMatch and FixMatch. Our code is available at https://github.com/SHI-Labs/Semi-Supervised-Transfer-Learning.

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

The paper "Adaptive Consistency Regularization for Semi-Supervised Transfer Learning" proposes a novel approach extending semi-supervised learning (SSL) techniques to leverage the strengths of transfer learning. It introduces a framework that synergizes the principles of consistency regularization in SSL with the inductive transfer capabilities of pre-trained models. This methodological fusion results in improved performance on downstream tasks, exploiting both labeled and unlabeled data in the target domain.

The core contribution of this work is Adaptive Consistency Regularization (ACR), which integrates two components: Adaptive Knowledge Consistency (AKC) and Adaptive Representation Consistency (ARC). AKC distills knowledge from the pre-trained source model into the target model through their feature representations, with an entropy-based gating mechanism that selects only the samples judged relevant to the target task; this mitigates the negative transfer that can arise from source-target task discrepancies. ARC, in turn, uses Maximum Mean Discrepancy (MMD) to align the target model's representations of labeled and unlabeled examples in a shared feature space, again relying on adaptive sample selection to filter out unhelpful data points.
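The following is a minimal PyTorch sketch of how these two loss terms could look, assuming a frozen ImageNet pre-trained source backbone and a target backbone being fine-tuned. The function names, the entropy threshold `tau`, and the RBF-kernel MMD estimator are illustrative assumptions, not taken from the authors' repository.

```python
# Hedged sketch of the two ACR terms (AKC and ARC), assuming PyTorch.
import torch
import torch.nn.functional as F

def entropy(p, eps=1e-8):
    """Per-example Shannon entropy of a probability distribution."""
    return -(p * (p + eps).log()).sum(dim=1)

def akc_loss(src_logits, tgt_feats, src_feats, tau):
    """Adaptive Knowledge Consistency (sketch).

    Keep the target backbone's features close to the frozen source
    backbone's features, but only on examples the source model is
    confident about (entropy below tau), to limit negative transfer.
    """
    with torch.no_grad():
        probs = F.softmax(src_logits, dim=1)
        mask = (entropy(probs) < tau).float()          # adaptive gate
    per_example = F.mse_loss(tgt_feats, src_feats.detach(),
                             reduction="none").mean(dim=1)
    return (mask * per_example).sum() / mask.sum().clamp(min=1.0)

def rbf_mmd(x, y, sigma=1.0):
    """Squared MMD between two feature sets with an RBF kernel."""
    def k(a, b):
        d = torch.cdist(a, b) ** 2
        return torch.exp(-d / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def arc_loss(labeled_feats, unlabeled_feats, unlabeled_logits, tau):
    """Adaptive Representation Consistency (sketch).

    Align the target model's representations of labeled and unlabeled
    data via MMD, restricted to confidently predicted unlabeled examples.
    """
    with torch.no_grad():
        probs = F.softmax(unlabeled_logits, dim=1)
        keep = entropy(probs) < tau                    # adaptive selection
    if keep.sum() < 2:
        return labeled_feats.new_zeros(())
    return rbf_mmd(labeled_feats, unlabeled_feats[keep])
```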

The experimental evaluations underscore the efficacy of ACR in enhancing the performance of state-of-the-art SSL methods. Testing was performed on multiple benchmarks including CIFAR-10, CUB-200-2011, and MURA, with results indicating superior performance when ACR is applied, particularly in settings with limited labeled data. Noteworthy is ACR's ability to amplify the capabilities of existing SSL strategies such as MixMatch and FixMatch, revealing its orthogonality to these methods and the potential for additional performance gains.
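Because AKC and ARC are additive loss terms rather than a standalone training procedure, combining them with an existing SSL objective largely amounts to summing losses. A hedged sketch of such a combined objective, with placeholder weights `lambda_akc` and `lambda_arc` rather than the paper's reported hyperparameters:

```python
# Illustrative combined objective: a FixMatch-style unsupervised loss
# plus the two ACR terms sketched above. Weights are placeholders.
def total_loss(sup_ce, fixmatch_unsup, akc, arc,
               lambda_akc=1.0, lambda_arc=1.0):
    return sup_ce + fixmatch_unsup + lambda_akc * akc + lambda_arc * arc
```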

From a practical standpoint, ACR's implications are significant for scenarios where fine-tuning pre-trained models to domain-specific tasks with limited labeled data is essential. The adaptive nature of ACR addresses the imbalance typically observed in semi-supervised transfer learning, where generalization is a challenge due to sample scarcity or domain shift.

Theoretically, this paper advances our understanding of how consistency regularization can be effectively combined with transfer learning, pointing toward training regimes that do not rely heavily on massive labeled datasets. Through careful selection of influential examples across domains, the framework achieves both robustness and flexibility, driving progress at the intersection of SSL and transfer learning.

Future research might explore extending the adaptive principles of ACR to broader transfer learning setups, including few-shot learning scenarios or integrating with unsupervised pre-training methods. Furthermore, employing this framework in non-vision tasks could verify its versatility and assess its potential to generalize across different types of data.

In conclusion, this paper provides a valuable bridge between two rapidly evolving areas of machine learning. It opens avenues for exploiting pre-trained models when labeled data is scarce, thereby circumventing some of the major obstacles in traditional approaches. As the field shifts toward increasingly efficient use of labeled resources, the insights from this work underscore the importance of adaptivity and careful sample selection in learning paradigms.

Authors (5)
  1. Abulikemu Abuduweili (19 papers)
  2. Xingjian Li (49 papers)
  3. Humphrey Shi (97 papers)
  4. Cheng-Zhong Xu (45 papers)
  5. Dejing Dou (112 papers)
Citations (70)