An Expert Overview of "Adversarial AutoAugment"
The research paper "Adversarial AutoAugment" by Xinyu Zhang, Qiang Wang, Jian Zhang, and Zhao Zhong addresses a fundamental challenge in training neural networks: finding effective data augmentation policies efficiently. Building on the foundations laid by AutoAugment, this work introduces an adversarial approach that reduces the computational cost of policy search while enhancing model generalization on large-scale datasets such as ImageNet.
Core Contributions
This paper presents several advancements over previous data augmentation techniques:
- Adversarial Framework: Unlike traditional methods where augmentation policies are manually designed or fixed, this work employs an adversarial approach. An augmentation policy network (the adversary) tries to maximize the training loss of a target network by generating challenging augmented examples; the target network, in turn, learns more robust features by confronting these harder instances, which improves generalization. A simplified sketch of this min-max interplay follows this list.
- Efficiency Improvements: By reusing the computation of target network training to evaluate policies, the method sharply reduces the cost and time of policy search. This departs from techniques like AutoAugment, whose search requires training models from scratch to evaluate each candidate policy. The paper reports roughly a 12x reduction in computational cost and an 11x reduction in time overhead compared to AutoAugment when training models like ResNet-50 on ImageNet.
- Empirical Superiority: The proposed method delivers strong results on CIFAR-10, CIFAR-100, and ImageNet. Notably, it achieves a top-1 test error of 1.36% on CIFAR-10 with PyramidNet+ShakeDrop, surpassing previous state-of-the-art results, and a top-1 accuracy of 79.40% on ImageNet with ResNet-50, demonstrating that it can train competitive large-scale models without requiring extra data.
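To make the adversarial framework concrete, the sketch below is a deliberately simplified, hypothetical illustration of the min-max interplay: instead of the learned policy network used in the paper, a brute-force "adversary" evaluates a handful of candidate augmentations and picks the one that currently maximizes the target network's loss, and the target network then trains on that hardest batch. The model, data, and operations are toy placeholders, not the authors' implementation; the paper's actual REINFORCE-based policy update is sketched under Methodology.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy target network and optimizer (stand-ins for e.g. a ResNet on CIFAR).
target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(target.parameters(), lr=0.1)

# A handful of candidate augmentations; the paper's search space is far richer.
candidate_ops = [
    lambda t: t,                                 # identity
    lambda t: torch.flip(t, dims=[3]),           # horizontal flip
    lambda t: t + 0.1 * torch.randn_like(t),     # mild Gaussian noise
    lambda t: t * 0.7,                           # darken (brightness-style op)
]

for step in range(20):
    x = torch.randn(16, 3, 32, 32)               # stand-in for a CIFAR batch
    y = torch.randint(0, 10, (16,))

    # Adversary's move: pick the augmentation that currently hurts the most.
    with torch.no_grad():
        losses = torch.stack(
            [F.cross_entropy(target(op(x)), y) for op in candidate_ops])
        worst = int(losses.argmax())

    # Target's move: minimize the loss on exactly those hardest examples.
    loss = F.cross_entropy(target(candidate_ops[worst](x)), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```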
Methodology
The method frames training as a min-max game between the augmentation policy network and the target network. Because the augmentation policy adapts dynamically throughout training, the result is a more efficient training regimen:
- Search Space and Dynamics: The method adopts AutoAugment's predefined search space of image operations and magnitudes, but learns over it dynamically. Per-operation application probabilities are dropped from the search space; since policies are re-sampled as training progresses, the augmentation responds more directly to the current state of the target network (see the sampling sketch after this list).
- Joint Optimization: The augmentation policy network is updated with the REINFORCE algorithm, using the target network's training losses as its reward signal. Combined with an enlarged-batch strategy in which each instance is augmented several times by different sampled policies, this improves both convergence speed and model robustness; a sketch of this update also follows the list.
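The following sketch illustrates the structure of an AutoAugment-style search space as described above: a sub-policy is a short sequence of (operation, magnitude) pairs, sampled without a per-operation application probability. The operations below are toy stand-ins (the real space builds on AutoAugment's PIL-based image operations with discrete magnitudes), and the uniform sampling stands in for the learned policy network.

```python
import random
import torch

def cutout(x, size):
    """Zero out a random square patch (a simplified Cutout)."""
    _, h, w = x.shape
    cy, cx = random.randrange(h), random.randrange(w)
    out = x.clone()
    out[:, max(0, cy - size // 2):cy + size // 2,
           max(0, cx - size // 2):cx + size // 2] = 0.0
    return out

# name -> callable(image, magnitude in 0..9); toy stand-ins for the real ops.
OPS = {
    "brightness": lambda x, m: x * (0.5 + 0.1 * m),
    "flip":       lambda x, m: torch.flip(x, dims=[-1]),
    "noise":      lambda x, m: x + 0.02 * m * torch.randn_like(x),
    "cutout":     lambda x, m: cutout(x, size=2 * (m + 1)),
}

def sample_sub_policy(num_ops=2):
    """A sub-policy is a short sequence of (operation, magnitude) pairs.

    There is no per-operation application probability: once sampled, every
    operation in the sub-policy is applied.
    """
    return [(random.choice(list(OPS)), random.randrange(10))
            for _ in range(num_ops)]

def apply_sub_policy(img, sub_policy):
    for name, magnitude in sub_policy:
        img = OPS[name](img, magnitude)
    return img

img = torch.rand(3, 32, 32)                # stand-in for a CIFAR image
sub_policy = sample_sub_policy()           # uniform here; learned in the paper
print(sub_policy, apply_sub_policy(img, sub_policy).shape)
```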
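And below is a hedged sketch of the joint optimization step: each image in a mini-batch is augmented by M policies sampled from a small controller, the target network trains on the enlarged batch, and each sampled policy's reward is the loss it induced, so the REINFORCE update pushes the controller toward harder augmentations. The controller, operations, and sizes here are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
M, NUM_OPS, BATCH = 4, 3, 8                      # M policies sampled per image

# Toy target network (theta) and a minimal controller (phi): learnable logits
# over candidate operations stand in for the paper's policy network.
target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
controller_logits = torch.zeros(NUM_OPS, requires_grad=True)
opt_target = torch.optim.SGD(target.parameters(), lr=0.1)
opt_policy = torch.optim.Adam([controller_logits], lr=0.05)

def apply_op(x, op):                             # toy augmentation operations
    ops = [lambda t: torch.flip(t, dims=[3]),
           lambda t: t + 0.1 * torch.randn_like(t),
           lambda t: t * 0.8]
    return ops[op](x)

x = torch.randn(BATCH, 3, 32, 32)                # stand-in mini-batch
y = torch.randint(0, 10, (BATCH,))

dist = torch.distributions.Categorical(logits=controller_logits)
ops = dist.sample((M,))                          # M sampled "policies"

# Enlarged batch: every image appears M times, once per sampled policy.
aug = torch.cat([apply_op(x, int(op)) for op in ops])
labels = y.repeat(M)
losses = F.cross_entropy(target(aug), labels, reduction="none").view(M, BATCH)

# Target update (theta): minimize the mean loss over all M*BATCH instances.
opt_target.zero_grad()
losses.mean().backward()
opt_target.step()

# Policy update (phi): each policy's reward is the loss it induced, so the
# REINFORCE step pushes the controller toward harder augmentations.
rewards = losses.detach().mean(dim=1)
advantage = rewards - rewards.mean()             # simple baseline
policy_loss = -(advantage * dist.log_prob(ops)).mean()
opt_policy.zero_grad()
policy_loss.backward()
opt_policy.step()
```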
Implications and Future Directions
The implications are both practical and theoretical. For practitioners, near real-time policy adaptation at a fraction of the previous computational cost makes the methodology attractive wherever model generalization is the bottleneck. Theoretically, pitting the policy network against the target network could spur further exploration of adversarial learning paradigms beyond data augmentation.
The paper also addresses a common challenge, the limited transferability of augmentation policies across datasets: preliminary experiments show competitive results even when learned policies are transferred directly to other datasets and architectures, underlining the method's robustness and adaptability.
In conclusion, "Adversarial AutoAugment" advances automatic data augmentation by introducing an adversarial formulation that emphasizes efficiency and robustness when training large-scale neural networks. Its framework and substantial reductions in resource consumption provide a basis for future work, suggesting that similar adversarial strategies could be extended to other facets of machine learning model optimization.