Analyzing the "BO + Neural Predictor" Framework for Neural Architecture Search
The paper "BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search" examines the integration of Bayesian optimization with neural predictors, a promising framework for automating the search for optimal neural architectures. The paper dissects this framework into five key components: architecture encoding, neural predictor, uncertainty calibration method, acquisition function, and acquisition optimization strategy. The authors isolate the influence of each component on the framework's overall performance, departing from previous work that typically evaluated such frameworks as monolithic wholes.
Core Contributions and Methodologies
One of the standout contributions is the introduction of a novel path-based encoding scheme for neural architectures. This encoding captures unique paths from input to output nodes within the architecture's cell, which the authors argue provides a more scalable representation than traditional adjacency matrix encodings. Theoretical and empirical evidence presented in the paper supports this assertion, showing that the path-based encoding improves prediction accuracy of neural predictors in the Bayesian optimization (BO) loop.
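To make the path-based encoding concrete, the sketch below enumerates all input-to-output paths in a toy cell and maps each path's sequence of operations to one entry of a binary feature vector. The graph representation (a successor dictionary with "in" and "out" sentinel nodes) and the function name are illustrative assumptions, not the paper's implementation; the paper also discusses truncating this vector, which is omitted here for brevity.

```python
def path_encoding(adjacency, ops, num_op_types):
    """Illustrative sketch of a path-based cell encoding.

    adjacency: dict mapping each node to its successor nodes (a DAG
               with sentinel nodes "in" and "out").
    ops:       dict mapping each intermediate node to an op index
               in [0, num_op_types).
    Returns a binary vector with one entry per possible op-sequence.
    """
    # Enumerate every path from "in" to "out" via depth-first search,
    # recording the sequence of ops encountered along the way.
    paths = []

    def dfs(node, trail):
        if node == "out":
            paths.append(tuple(trail))
            return
        for nxt in adjacency.get(node, []):
            dfs(nxt, trail + ([ops[nxt]] if nxt in ops else []))

    dfs("in", [])

    # Index each op-sequence into a fixed-length binary vector:
    # paths of length L occupy a contiguous block of num_op_types ** L slots.
    max_len = len(ops)
    total = sum(num_op_types ** L for L in range(max_len + 1))
    vec = [0] * total
    for p in paths:
        offset = sum(num_op_types ** L for L in range(len(p)))
        idx = 0
        for op in p:
            idx = idx * num_op_types + op
        vec[offset + idx] = 1
    return vec
```

Because the encoding depends only on which op-sequences appear, two architectures whose adjacency matrices differ but whose paths coincide receive identical encodings, which is part of the scalability argument.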
In their experimental setup, the authors evaluate each component in controlled comparisons, varying one choice at a time while holding the others fixed. They compare several neural predictors, including feedforward neural networks, graph convolutional networks (GCNs), and variational autoencoder (VAE)-based networks. Encouragingly, the results indicate that simple feedforward networks with the novel path encoding achieve competitive predictive performance. To extend this comparison to the full BO framework, they incorporate both Bayesian neural networks and ensembling to generate the uncertainty estimates needed to balance exploration and exploitation in NAS.
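The ensemble route to uncertainty can be sketched in a few lines: the ensemble's mean prediction serves as the point estimate and the spread across members as the uncertainty. The stand-in lambda "models" below are hypothetical; in the paper the ensemble members would be trained feedforward networks operating on path encodings.

```python
import numpy as np

def ensemble_predict(models, encoding):
    """Mean and std of predictions across an ensemble of predictors.

    Each model is any callable mapping an architecture encoding to a
    scalar predicted validation error; disagreement across the
    ensemble serves as the uncertainty estimate fed to the
    acquisition function.
    """
    preds = np.array([m(encoding) for m in models])
    return float(preds.mean()), float(preds.std())

# Toy usage with stand-in "models" (real ones would be trained
# neural predictors, e.g. from different weight initializations):
models = [lambda x: 1.0, lambda x: 3.0, lambda x: 2.0]
mu, sigma = ensemble_predict(models, encoding=None)
```

Training members from different random initializations (or on bootstrapped data) is what makes the spread a meaningful, if uncalibrated, proxy for predictive uncertainty.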
Experimental Insights
One compelling result is the performance of the authors' BANANAS algorithm, which consistently reaches state-of-the-art results on the NASBench-101 and NASBench-201 benchmarks. BANANAS combines the top-performing choice for each component: an ensemble of feedforward networks with the path encoding, independent Thompson sampling (ITS) as the acquisition function, and mutation for acquisition optimization. On both small (NASBench-201) and large (DARTS) search spaces, BANANAS demonstrates robust performance, reinforcing the adaptability and efficacy of the presented framework.
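The mutation step of acquisition optimization can be sketched as follows: rather than scoring the acquisition function over the entire search space, a candidate pool is produced by randomly perturbing the best architectures found so far. The flat list-of-ops architecture and the helper names are simplifying assumptions for illustration; real mutations would also edit cell edges.

```python
import random

def mutate_once(arch, num_op_types, rng):
    """Toy mutation: re-sample one op in a flat list-of-ops architecture."""
    child = list(arch)
    i = rng.randrange(len(child))
    child[i] = rng.randrange(num_op_types)
    return child

def mutation_candidates(best_archs, num_op_types, pool_size, rng=random):
    """Build a candidate pool for acquisition optimization (sketch).

    Each candidate is a randomly mutated copy of one of the best
    architectures evaluated so far; the acquisition function is then
    scored only over this pool.
    """
    pool = []
    while len(pool) < pool_size:
        parent = rng.choice(best_archs)
        pool.append(mutate_once(parent, num_op_types, rng))
    return pool
```

Restricting acquisition optimization to a mutation neighborhood keeps the per-iteration cost independent of the search-space size, which matters in spaces as large as DARTS.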
The authors' treatment of the acquisition function and its optimization combines established and novel methods, notably Thompson sampling variants that lend themselves to parallelization across computational resources, a key consideration in practical NAS applications.
Implications and Future Directions
The research presented has significant implications for both theoretical and practical advancements in NAS. The meticulous decomposition of the BO + neural predictor framework and subsequent empirical inquiry provide a clearer roadmap for enhancing each component. The path encoding, in particular, stands as a novel contribution that, with its demonstrated scalability benefits, could influence future developments in encoding strategies.
Practically, the work points towards more efficient NAS algorithms capable of optimizing architectures in a diverse range of search spaces without extensive tuning. This positions BANANAS and its methodological insights as valuable tools for both academia and industry efforts focusing on automated machine learning (AutoML).
Looking forward, there is room to explore multi-fidelity extensions of BANANAS, potentially incorporating early stopping or other efficiency strategies to further reduce time-to-solution in NAS tasks. Aligning these developments with ongoing shifts in architecture families (e.g., transformer models, few-shot learning frameworks) remains a promising avenue for exploration.
In summary, the paper offers a rigorous assessment of the BO + neural predictor strategy and charts a path toward effective NAS through careful component evaluation and novel encoding methodologies, contributing markedly to the automated design of high-performing deep learning architectures.