- The paper introduces Balanced NAO, a method that improves ranking correlation in one-shot NAS through balanced training.
- It addresses insufficient optimization and imbalanced training by sampling architectures in proportion to their model sizes during supernet training.
- Experimental results show improved performance, with a 2.60% test error on CIFAR-10 and 74.4% top-1 accuracy on ImageNet under the mobile setting.
Balanced One-shot Neural Architecture Optimization
The paper "Balanced One-shot Neural Architecture Optimization" by Renqian Luo, Tao Qin, and Enhong Chen addresses critical challenges in Neural Architecture Search (NAS), with a particular focus on the shortcomings of one-shot NAS methods. It highlights the inadequacies observed in the ranking correlation of candidate architectures during one-shot training as compared to their fully trained stand-alone counterparts.
The authors identify two primary issues affecting one-shot NAS: insufficient optimization and imbalanced training of architectures. Insufficient optimization arises because the supernet contains a very large number of candidate architectures, so the average training time allotted to each individual architecture is too short to yield an accurate performance ranking; for example, a supernet covering thousands of candidates trained within a fixed budget leaves each candidate with only a small fraction of the updates it would receive if trained alone. The paper presents empirical evidence that the ranking correlation between architectures evaluated under one-shot training and under stand-alone training is weak, which undermines the search algorithm's ability to identify truly optimal architectures.
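This diagnostic can be made concrete by computing a rank-correlation coefficient, such as Kendall's tau, between the validation accuracies that the same candidate architectures obtain with supernet-inherited weights and those they reach after stand-alone training. The sketch below illustrates the idea; the accuracy values are hypothetical placeholders, not numbers from the paper.

```python
# Minimal sketch of the ranking-correlation diagnostic; the accuracy values
# below are illustrative placeholders, not results reported in the paper.
from scipy.stats import kendalltau

# Validation accuracy of the same five candidate architectures, evaluated
# (a) with weights inherited from the one-shot supernet, and
# (b) after full stand-alone training from scratch.
one_shot_acc    = [0.71, 0.68, 0.74, 0.66, 0.72]
stand_alone_acc = [0.93, 0.94, 0.95, 0.92, 0.96]

tau, p_value = kendalltau(one_shot_acc, stand_alone_acc)
print(f"Kendall tau between one-shot and stand-alone rankings: {tau:.3f}")
# A tau near 1.0 would mean the supernet ranks architectures reliably; the
# paper's observation is that in practice this correlation is weak.
```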
The paper further discusses imbalanced training: smaller architectures tend to be favored over larger ones because they are easier to optimize within a limited number of training epochs. This imbalance leads to misleading evaluations, in which smaller architectures appear superior in the one-shot setting but do not remain superior when fully trained.
To address these issues, the authors propose Balanced NAO (Neural Architecture Optimization), an extension of one-shot NAS that incorporates a balanced training strategy. Balanced NAO samples architectures in proportion to their model sizes during the search procedure, so that larger architectures receive training updates commensurate with their complexity, mitigating the bias introduced by conventional one-shot training.
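A minimal sketch of this size-proportional sampling idea is shown below, assuming each candidate is summarized by its parameter count. The architecture names, sizes, and the commented `train_one_step` hook are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical candidate pool: architecture identifiers paired with parameter
# counts that stand in for "model size" (names and sizes are illustrative).
candidates = {"arch_small": 1.2e6, "arch_medium": 2.8e6, "arch_large": 4.5e6}

names = list(candidates)
sizes = np.array([candidates[n] for n in names], dtype=float)

# Size-proportional sampling: larger architectures, which are slower to
# converge, are drawn more often than smaller ones during supernet training.
probs = sizes / sizes.sum()

rng = np.random.default_rng(seed=0)
for step in range(5):
    arch = rng.choice(names, p=probs)
    # train_one_step(supernet, arch, batch)  # hypothetical shared-weight update
    print(f"step {step}: sampled {arch} (p={probs[names.index(arch)]:.2f})")
```

Under uniform sampling, by contrast, every candidate would be drawn with equal probability regardless of size, which is precisely the imbalance the balanced training strategy aims to correct.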
The experimental results presented in the paper support the effectiveness and robustness of Balanced NAO. The method demonstrates substantial improvements in performance and stability compared to one-shot NAS techniques such as ENAS and DARTS. Specifically, the paper reports a test error rate of 2.60% on the CIFAR-10 dataset and a top-1 accuracy of 74.4% on the ImageNet dataset under the mobile setting. These results highlight both the practical impact of improving one-shot NAS and the potential for further advances in NAS methodology.
Balanced NAO offers a significant contribution towards more efficient and reliable NAS processes. By optimizing the training balance, it advances our capability to discover better neural architectures without the excessive computational costs associated with exhaustive search methods. Future research can extend this approach by refining the sampling techniques or integrating it with other emerging NAS strategies to further enhance the effectiveness and efficiency in automatic architecture design. Additionally, exploring its adaptability to various domains beyond image classification, such as natural language processing or reinforcement learning, could broaden its applicability.
In summary, the Balanced NAO method directly addresses the skewed training dynamics of one-shot NAS and demonstrates improved reliability in architecture discovery. It provides a foundation for the community to build upon in the pursuit of better-optimized neural architectures.