Neural Predictor for Neural Architecture Search (1912.00848v1)

Published 2 Dec 2019 in cs.LG and stat.ML

Abstract: Neural Architecture Search methods are effective but often use complex algorithms to come up with the best architecture. We propose an approach with three basic steps that is conceptually much simpler. First we train N random architectures to generate N (architecture, validation accuracy) pairs and use them to train a regression model that predicts accuracy based on the architecture. Next, we use this regression model to predict the validation accuracies of a large number of random architectures. Finally, we train the top-K predicted architectures and deploy the model with the best validation result. While this approach seems simple, it is more than 20 times as sample efficient as Regularized Evolution on the NASBench-101 benchmark and can compete on ImageNet with more complex approaches based on weight sharing, such as ProxylessNAS.

Citations (188)

Summary

  • The paper introduces a novel three-step Neural Predictor that minimizes computational demands while identifying high-performing architectures.
  • The method trains a regression model on random architectures to predict validation accuracy, streamlining the search for optimal candidates.
  • Empirical results demonstrate over 20x sample efficiency improvements on NASBench-101 and competitive performance on ImageNet.

Overview of "Neural Predictor for Neural Architecture Search"

The paper presents a novel methodology aimed at optimizing Neural Architecture Search (NAS) using a technique referred to as the Neural Predictor. Traditional NAS methods, although effective in improving model accuracy, suffer from high computational demands due to their complex algorithmic designs. This research addresses these limitations by offering a simpler three-step procedure that achieves competitive results with significantly reduced resource consumption.

Methodology

The proposed approach is structured into a series of interconnected steps:

  1. Training and Data Collection: A set of N random architectures is trained to create a dataset of pairs, each comprising an architecture and its corresponding validation accuracy. This dataset is then used to train a regression model that predicts the accuracy of a given architecture.
  2. Prediction and Selection: Using the regression model, validation accuracies are predicted for a large pool of random architectures, efficiently identifying the most promising candidates without training them.
  3. Final Validation: The top K architectures, ranked by predicted accuracy, are trained and evaluated explicitly, and the best-performing one is selected for deployment; a minimal code sketch of these three steps follows below.
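
Below is a minimal Python sketch of the three-step procedure. It uses a generic scikit-learn regressor as a stand-in for the paper's GCN-based predictor, and the helpers sample_random_architecture, encode_architecture, and train_and_evaluate are hypothetical placeholders for the surrounding NAS codebase; the default values for N, the candidate-pool size, and K are illustrative rather than the settings reported in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def neural_predictor_search(sample_random_architecture,  # () -> architecture spec (hypothetical helper)
                            encode_architecture,          # arch -> 1-D feature vector (hypothetical helper)
                            train_and_evaluate,           # arch -> validation accuracy (hypothetical helper)
                            n_train=100,                  # N: architectures trained to fit the predictor
                            n_candidates=10_000,          # size of the random candidate pool
                            top_k=10):                    # K: candidates trained for final validation
    # Step 1: train N random architectures and fit a regression model on
    # (architecture encoding, validation accuracy) pairs.
    train_archs = [sample_random_architecture() for _ in range(n_train)]
    X = np.stack([encode_architecture(a) for a in train_archs])
    y = np.array([train_and_evaluate(a) for a in train_archs])
    predictor = GradientBoostingRegressor().fit(X, y)

    # Step 2: predict validation accuracy for a large pool of random architectures.
    pool = [sample_random_architecture() for _ in range(n_candidates)]
    preds = predictor.predict(np.stack([encode_architecture(a) for a in pool]))

    # Step 3: fully train the top-K predicted architectures and deploy the best one.
    top_idx = np.argsort(preds)[-top_k:]
    candidates = [(pool[i], train_and_evaluate(pool[i])) for i in top_idx]
    return max(candidates, key=lambda pair: pair[1])
```

Because every call to train_and_evaluate in steps 1 and 3 is independent, the expensive parts of the procedure are embarrassingly parallel, which is part of what makes the method attractive in high-performance computing environments.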

This streamlined method is demonstrated to be over 20 times more sample-efficient than traditional Regularized Evolution on the NASBench-101 benchmark, with competitive results on ImageNet compared to more intricate weight-sharing approaches like ProxylessNAS.

Results and Implications

The empirical evaluation of the Neural Predictor leverages benchmarks such as NASBench-101 and ImageNet to validate its efficacy. Key findings include:

  • NASBench-101 Benchmark: The Neural Predictor is remarkably sample-efficient, outperforming state-of-the-art methods such as Regularized Evolution in both accuracy and computational cost, with more than 20 times the sample efficiency.
  • ImageNet: On this larger, practical search space, the Neural Predictor identifies high-quality architectures comparable to those discovered by weight-sharing methods like ProxylessNAS.

Theoretical and Practical Implications

The Neural Predictor's framework introduces a pragmatic approach to NAS, capitalizing on two powerful machine learning tools—random sampling and supervised learning. Unlike other NAS approaches, this methodology avoids reliance on reinforcement learning, weight sharing, or Bayesian optimization, thus simplifying the implementation while maintaining robustness in model selection.

Practically, this efficiency allows for the utilization of NAS by smaller research groups and practitioners limited by computational budgets, thereby democratizing access to advanced NAS tools. The approach's parallelizable nature further enhances its appeal for deployments in high-performance computing environments.

Future Directions

The paper paves the way for future research extending Neural Predictors beyond the search spaces addressed here, facilitated by the customizable Graph Convolutional Network (GCN) architecture used within the predictor. Given the demonstrated generalizability and efficiency, further exploration of hybrid models that integrate other NAS techniques could refine predictive performance and extend applicability to diverse domains within artificial intelligence.
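
To make the GCN-based predictor concrete, the following is a minimal NumPy sketch of a forward pass that maps a cell's adjacency matrix and one-hot node operations to a predicted accuracy. The two-layer design, the symmetric normalization, and the mean-pooled readout are simplifying assumptions for exposition, not the exact predictor architecture used in the paper, and training of the weights is omitted.

```python
import numpy as np

def gcn_predict_accuracy(adjacency, node_ops, weights):
    """Forward pass of a tiny GCN regressor (illustrative sketch only).

    adjacency : (n, n) binary adjacency matrix of the cell's DAG
    node_ops  : (n, d) one-hot encoding of each node's operation
    weights   : dict with 'W1' of shape (d, h), 'W2' of shape (h, h),
                and 'w_out' of shape (h,), assumed already trained
    """
    # Add self-loops and apply symmetric normalization (standard GCN propagation).
    a_hat = adjacency + np.eye(adjacency.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt

    # Two graph-convolution layers with ReLU activations.
    h = np.maximum(a_norm @ node_ops @ weights["W1"], 0.0)
    h = np.maximum(a_norm @ h @ weights["W2"], 0.0)

    # Mean-pool node embeddings into a graph embedding, then regress to a scalar accuracy.
    graph_embedding = h.mean(axis=0)
    return float(graph_embedding @ weights["w_out"])
```

Because the predictor is just a function of the architecture's graph, scoring tens of thousands of candidates in the prediction step costs only a cheap forward pass each, which is what enables the large candidate pools described above.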

In summary, the Neural Predictor represents a substantial advancement in the field of neural architecture search, providing both practical tools for efficient model selection and a foundation for future research into more complex, adaptive prediction models.
