Balanced One-shot Neural Architecture Optimization (1909.10815v2)

Published 24 Sep 2019 in cs.LG, cs.NE, and stat.ML

Abstract: The ability to rank candidate architectures is the key to the performance of neural architecture search (NAS). One-shot NAS is proposed to reduce the expense but shows inferior performance against conventional NAS and is not adequately stable. We investigate into this and find that the ranking correlation between architectures under one-shot training and the ones under stand-alone full training is poor, which misleads the algorithm to discover better architectures. Further, we show that the training of architectures of different sizes under the current one-shot method is imbalanced, which causes the evaluated performances of the architectures to be less predictable of their ground-truth performances and affects the ranking correlation heavily. Consequently, we propose Balanced NAO where we introduce balanced training of the supernet during the search procedure to encourage more updates for large architectures than small architectures by sampling architectures in proportion to their model sizes. Comprehensive experiments verify that our proposed method is effective and robust which leads to a more stable search. The final discovered architecture shows significant improvements against baselines with a test error rate of 2.60% on CIFAR-10 and top-1 accuracy of 74.4% on ImageNet under the mobile setting. Code and model checkpoints will be publicly available. The code is available at github.com/renqianluo/NAO_pytorch.

Citations (14)

Summary

  • The paper introduces Balanced NAO, a method that improves ranking correlation in one-shot NAS through balanced training.
  • It addresses insufficient optimization and imbalanced training by sampling architectures proportionally to their model sizes.
  • Experimental results show improved performance with a 2.60% error on CIFAR-10 and 74.4% top-1 accuracy on ImageNet.

Balanced One-shot Neural Architecture Optimization

The paper "Balanced One-shot Neural Architecture Optimization" by Renqian Luo, Tao Qin, and Enhong Chen addresses critical challenges in Neural Architecture Search (NAS), with a particular focus on the shortcomings of one-shot NAS methods. It highlights the inadequacies observed in the ranking correlation of candidate architectures during one-shot training as compared to their fully trained stand-alone counterparts.

The authors identify two primary issues affecting one-shot NAS: insufficient optimization and imbalanced training of architectures. Insufficient optimization arises because the supernet encompasses so many candidate architectures that each one receives too little training time, on average, for its shared-weight performance to rank it accurately. The paper presents empirical evidence that the ranking correlation between candidate architectures under one-shot training and under stand-alone training is weak, undermining the search algorithm's ability to identify truly optimal architectures.
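
To make the notion of ranking correlation concrete, the degree to which supernet-based estimates preserve the true ranking can be quantified with a rank statistic such as Kendall's tau. The following is a minimal sketch using scipy; the accuracy values are hypothetical placeholders, not numbers from the paper.

```python
# A minimal sketch (not from the paper's codebase): measure how well one-shot
# supernet accuracies preserve the ranking given by stand-alone training.
# The accuracy values below are hypothetical placeholders.
from scipy.stats import kendalltau

# Validation accuracies of the same five candidate architectures under the two regimes.
one_shot_acc    = [0.71, 0.69, 0.73, 0.68, 0.70]        # evaluated with shared supernet weights
stand_alone_acc = [0.954, 0.961, 0.949, 0.957, 0.943]   # after full stand-alone training

tau, p_value = kendalltau(one_shot_acc, stand_alone_acc)
print(f"Kendall tau = {tau:.3f}  (1.0 = identical ranking, 0 = uncorrelated)")
```

A tau close to zero would indicate that selecting the top architecture by one-shot accuracy says little about its true stand-alone quality, which is precisely the failure mode the paper describes.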

Furthermore, the paper discusses imbalanced training, where smaller architectures tend to be favored over larger ones due to their ease of optimization within shorter training epochs. This imbalance leads to misleading evaluations, where smaller architectures may appear superior in the one-shot setting but do not generalize well when fully trained.

To address these issues, the authors propose Balanced NAO (Neural Architecture Optimization), an enhancement of one-shot NAS that incorporates a balanced training scheme. During the search procedure, Balanced NAO samples architectures in proportion to their model sizes, so that larger architectures receive updates commensurate with their complexity, mitigating the imbalance introduced by conventional one-shot training.
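
A minimal sketch of this size-proportional sampling idea is shown below; it assumes a simple list of candidates with known parameter counts (the encodings and counts are made up for illustration) rather than the paper's actual supernet implementation.

```python
# Sketch of size-proportional architecture sampling (hypothetical data, not the
# paper's implementation): each candidate is drawn with probability proportional
# to its parameter count, so larger sub-networks are trained more often.
import random

# Each candidate: (architecture encoding, parameter count). Values are illustrative.
candidates = [
    ("arch_small",  1.2e6),
    ("arch_medium", 2.8e6),
    ("arch_large",  4.5e6),
]

def sample_architecture(cands):
    """Draw one architecture with probability proportional to its model size."""
    archs, sizes = zip(*cands)
    return random.choices(archs, weights=sizes, k=1)[0]

# Schematic supernet training step: sample one sub-network per mini-batch and
# update only its weights (uniform sampling would be the unbalanced baseline).
# arch = sample_architecture(candidates)
# loss = supernet_forward(batch, arch); loss.backward(); optimizer.step()
```

Under uniform sampling each candidate would be trained equally often; weighting by size shifts updates toward the larger, slower-to-converge architectures, which is the balancing effect the paper argues is needed for reliable ranking.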

The experimental results presented in the paper validate the efficacy and robustness of Balanced NAO. The method demonstrates substantial improvements in performance and stability compared to traditional one-shot NAS techniques such as ENAS and DARTS. Specifically, the paper reports a noteworthy test error rate of 2.60% on the CIFAR-10 dataset and top-1 accuracy of 74.4% on the ImageNet dataset under the mobile setting. These results highlight not only the practical impact of improving one-shot NAS approaches but also the potential for future advancements in NAS methodologies.

Balanced NAO offers a significant contribution towards more efficient and reliable NAS processes. By optimizing the training balance, it advances our capability to discover better neural architectures without the excessive computational costs associated with exhaustive search methods. Future research can extend this approach by refining the sampling techniques or integrating it with other emerging NAS strategies to further enhance the effectiveness and efficiency in automatic architecture design. Additionally, exploring its adaptability to various domains beyond image classification, such as natural language processing or reinforcement learning, could broaden its applicability.

In summary, the Balanced NAO method directly addresses the skewed training dynamics of one-shot NAS and demonstrates improved reliability in architecture discovery. It provides a foundation for the community to build upon in the pursuit of more optimized and innovative neural architectures.
