An Analysis of the EnAET Framework for Semi-Supervised and Supervised Learning
The field of semi-supervised learning has garnered significant attention for its potential to mitigate the substantial annotation requirements of deep learning. The paper "EnAET: A Self-Trained Framework for Semi-Supervised and Supervised Learning with Ensemble Transformations" by Wang et al. introduces EnAET, a self-trained framework that integrates self-supervised representation learning with semi-supervised methods to improve learning performance.
Overview of Methodological Innovations
The EnAET framework exploits the synergy between self-supervised and semi-supervised learning by applying an ensemble of auto-encoding transformations. This self-training framework builds on the state-of-the-art semi-supervised method MixMatch, adding self-supervised signals derived from ensembles of spatial and non-spatial transformations. The central idea is to learn representations in a self-supervised way by decoding the transformation parameters from pairs of original and transformed images, a novel approach in semi-supervised settings.
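To make the overall objective concrete, the sketch below shows one way a total training loss could combine the MixMatch semi-supervised loss with one AET term per transformation in the ensemble and a prediction-consistency term. The function name, the per-term weights, and the placeholder arguments are illustrative assumptions, not the authors' implementation.

```python
# Illustrative composition of an EnAET-style objective. The arguments stand in
# for per-component losses computed elsewhere; the weights are placeholders.
def total_loss(mixmatch_loss, aet_losses, consistency_loss,
               lambdas=(1.0, 1.0, 1.0, 1.0, 1.0), gamma=1.0):
    # One AET regression loss per transformation in the ensemble
    # (e.g., projective, affine, similarity, Euclidean, non-spatial).
    aet_term = sum(lam * loss for lam, loss in zip(lambdas, aet_losses))
    return mixmatch_loss + aet_term + gamma * consistency_loss
```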
Key Contributions:
- Ensemble Transformations: The paper employs an array of spatial (e.g., projective, affine, similarity, and Euclidean) and non-spatial (e.g., color, contrast, brightness, sharpness) transformations to generate self-supervised training signals for the semi-supervised classifier. Notably, this strategy requires no labeled data, making it attractive for extracting informative features when labels are scarce.
- Auto-Encoding Transformation (AET) Loss: The AET loss trains the encoder to produce features from which the transformation parameters can be reconstructed. Because these parameters are known by construction, they serve as free self-supervised labels, and the AET loss acts as a regularization term that is pivotal to the EnAET model's effectiveness.
- Consistency Loss: The framework also incorporates a consistency loss that penalizes discrepancies between the model's predictions on an image and on its transformed counterparts, further enhancing robustness. (A sketch of how these components fit together follows this list.)
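To illustrate how the pieces above could fit together, here is a minimal PyTorch-style sketch under stated assumptions: a single randomly parameterized spatial transformation (rotation, isotropic scale, translation) stands in for the full ensemble, and encoder, classifier, and param_decoder are assumed network modules supplied by the caller. The AET term regresses the known transformation parameters from the feature pair, and the consistency term penalizes divergence between predictions on the original and transformed images. This is an illustration of the mechanism, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def random_transform_params(batch_size, device):
    """Sample a random rotation/scale/translation and return both the parameter
    vector (the AET regression target) and the 2x3 matrices for grid_sample."""
    angle = (torch.rand(batch_size, device=device) - 0.5) * 1.0       # ~±0.5 rad
    scale = 1.0 + (torch.rand(batch_size, device=device) - 0.5) * 0.4
    tx = (torch.rand(batch_size, device=device) - 0.5) * 0.4
    ty = (torch.rand(batch_size, device=device) - 0.5) * 0.4
    cos, sin = torch.cos(angle) * scale, torch.sin(angle) * scale
    theta = torch.stack(
        [torch.stack([cos, -sin, tx], dim=1),
         torch.stack([sin,  cos, ty], dim=1)], dim=1)                 # (B, 2, 3)
    params = torch.stack([angle, scale, tx, ty], dim=1)               # (B, 4)
    return params, theta

def apply_transform(x, theta):
    """Warp a batch of images with the sampled transformation matrices."""
    grid = F.affine_grid(theta, x.size(), align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

def aet_and_consistency_losses(encoder, classifier, param_decoder, x,
                               lam_aet=1.0, lam_cons=1.0):
    """Compute the two unlabeled-data terms for one batch.
    encoder: image -> feature; classifier: feature -> logits;
    param_decoder: concatenated (original, transformed) features -> params."""
    params, theta = random_transform_params(x.size(0), x.device)
    x_t = apply_transform(x, theta)
    f, f_t = encoder(x), encoder(x_t)

    # AET loss: regress the known transformation parameters from the feature pair.
    pred_params = param_decoder(torch.cat([f, f_t], dim=1))
    aet_loss = F.mse_loss(pred_params, params)

    # Consistency loss: predictions on the transformed image should match those
    # on the original (original predictions treated as the fixed target).
    p = F.softmax(classifier(f), dim=1).detach()
    log_p_t = F.log_softmax(classifier(f_t), dim=1)
    cons_loss = F.kl_div(log_p_t, p, reduction="batchmean")

    return lam_aet * aet_loss + lam_cons * cons_loss
```

In practice this term would be added to the supervised (or MixMatch-style) loss on the labeled batch, as in the objective sketch above; the weights lam_aet and lam_cons are placeholders rather than values from the paper.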
Numerical Results
The paper evaluates the proposed framework extensively across several datasets, including CIFAR-10, CIFAR-100, STL-10, and SVHN. The results demonstrate substantial improvements over baseline models, with marked reductions in error rates. For instance, on CIFAR-10 with only 250 labeled examples, EnAET achieves an error rate of 7.6%, compared to 11.08% for the MixMatch baseline. The robust performance is consistent across diverse experimental conditions, indicating the method's generalizability.
Implications and Future Directions
The implications of EnAET are multifaceted, extending both the theory of semi-supervised learning and its practical application in contexts where labeled data are sparse or expensive to obtain. The introduction of a self-trained regularization component built on parameterized transformations could spur further study of other transformation types and their potential contributions to learning algorithms.
Future research might optimize the ensemble of transformations or integrate the framework with other self-supervised learning methods. Such work could propel developments in areas such as computer vision, natural language processing, and beyond, particularly in settings constrained by data availability.
In sum, the EnAET framework represents a significant stride in semi-supervised and self-supervised learning. It leverages ensemble transformations to create a robust framework, effectively harnessing unlabeled data to bridge the performance gap with fully supervised models. This capability makes it a relevant topic for ongoing research and application in AI-driven fields.