Analysis of NAS-Bench-101: Towards Reproducible Neural Architecture Search
The paper "NAS-Bench-101: Towards Reproducible Neural Architecture Search" presents a notable contribution to the field of Neural Architecture Search (NAS) by introducing the NAS-Bench-101 dataset. This dataset is a public benchmark explicitly constructed to address significant challenges in NAS research such as reproducibility and computational resource demands. The paper outlines the creation of an architecture search space, evaluates a comprehensive set of models within this space, and demonstrates various use cases of the dataset.
Key Contributions
The paper highlights several essential contributions:
- Introduction of a Large-Scale Dataset: NAS-Bench-101 is the first public dataset designed for NAS research, containing the results of training over 5 million convolutional neural network models on CIFAR-10. Each unique architecture is trained multiple times (three repeats at each of four epoch budgets: 4, 12, 36, and 108 epochs), yielding robust performance statistics and enabling reproducible experimentation without extensive computational resources.
- Search Space Definition: The search space is cell-based: each architecture is a directed acyclic graph with at most 7 vertices and 9 edges, where interior vertices are labeled with one of three operations (3x3 convolution, 1x1 convolution, or 3x3 max-pooling). By pruning graph isomorphisms, the authors identify roughly 423k unique CNN architectures, enough to cover a wide variety of structures while remaining exhaustively evaluable.
- Publicly Accessible Resources: The dataset, along with the training and evaluation protocols, is made publicly available, promoting transparency and reproducibility of NAS results.
- Benchmarking Capabilities: NAS-Bench-101 allows for fast evaluations via dataset queries instead of costly model training, reducing the cost of evaluating an architecture from hours of GPU time to a near-instant lookup. It thereby provides a benchmark for comparing NAS optimization algorithms, including evolutionary methods, random search, and Bayesian optimization; a query sketch follows this list.
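To make the query workflow concrete, here is a minimal sketch using the Python API of the open-source `nasbench` package released alongside the paper. The adjacency matrix, operation labels, and record keys follow that repository's documented interface as best I recall it; the dataset path points at the downloadable record file and should be adjusted locally.

```python
# Minimal sketch: querying NAS-Bench-101 via the public `nasbench` package.
from nasbench import api

# Load the tabulated results (path to the downloaded .tfrecord file).
nasbench = api.NASBench('nasbench_only108.tfrecord')

# A cell is a DAG over at most 7 vertices with at most 9 edges. Here:
# input -> conv3x3 -> conv1x1 -> maxpool3x3 -> output, plus a skip
# connection from input directly to output.
cell = api.ModelSpec(
    matrix=[[0, 1, 0, 0, 1],
            [0, 0, 1, 0, 0],
            [0, 0, 0, 1, 0],
            [0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0]],
    ops=['input', 'conv3x3-bn-relu', 'conv1x1-bn-relu',
         'maxpool3x3', 'output'])

# A query returns the stored metrics of one training run in milliseconds,
# in place of hours of actual training.
data = nasbench.query(cell)
print('test accuracy:     ', data['test_accuracy'])
print('training time (s): ', data['training_time'])
```

Each call to `query` draws one of the stored training runs for that architecture, so repeated queries reflect the training noise recorded in the dataset.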
Numerical Results and Analysis
The dataset reveals several insights:
- Architecture Performance: The single best architecture in the dataset achieves a mean test accuracy of 94.32% on CIFAR-10, while typical models reach above 90%. Because every architecture is trained multiple times, the dataset exposes not only the mean but also the variance of each architecture's performance.
- Training Time and Complexity Correlation: Both the number of trainable parameters and the training time correlate positively with final accuracy, suggesting that model capacity accounts for a substantial share of the performance differences (see the correlation sketch after this list).
- Effect of Architectural Changes: Replacing 3x3 convolutions with 1x1 convolutions or max-pooling operations generally lowers accuracy, confirming the central role larger convolutions play in model efficacy.
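The correlation claim can be checked directly against the dataset. The sketch below iterates over all stored architectures with the `hash_iterator`/`get_metrics_from_hash` interface from the same `nasbench` package and computes Spearman rank correlations; the metric field names (`trainable_parameters`, `final_training_time`, `final_test_accuracy`) follow my reading of that repository's format and should be treated as assumptions.

```python
# Sketch: rank-correlate parameter count and training time with accuracy.
import numpy as np
from scipy.stats import spearmanr
from nasbench import api

nasbench = api.NASBench('nasbench_only108.tfrecord')

params, times, accs = [], [], []
for h in nasbench.hash_iterator():
    fixed, computed = nasbench.get_metrics_from_hash(h)
    runs = computed[108]  # metrics of the three 108-epoch training runs
    params.append(fixed['trainable_parameters'])
    times.append(np.mean([r['final_training_time'] for r in runs]))
    accs.append(np.mean([r['final_test_accuracy'] for r in runs]))

print('params vs. accuracy:', spearmanr(params, accs).correlation)
print('time   vs. accuracy:', spearmanr(times, accs).correlation)
```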
Implications and Future Directions
The implications of NAS-Bench-101 extend across both practical and theoretical dimensions. Practically, this dataset lowers barriers to entry in NAS research by minimizing the computational cost typically associated with architecture search. Theoretically, the comprehensive analysis of neural architectures fosters a deeper understanding of how specific architectural properties influence neural network performance.
Moreover, by establishing a reproducible benchmark, NAS-Bench-101 sets a precedent for future benchmarks to evaluate NAS algorithms under standardized conditions. Researchers are encouraged to leverage this dataset to experiment with new NAS methods and compare results against a common baseline.
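As one illustration of benchmarking an optimizer against the dataset, the sketch below runs a naive random-search baseline: it samples random cells, keeps the one with the best validation accuracy, and reports that cell's test accuracy at the end (selecting on validation data and reporting test accuracy avoids overfitting the benchmark). `is_valid`, `query`, and `ModelSpec` follow the public `nasbench` API; the sampling helper is an illustrative assumption of mine.

```python
# Sketch: a random-search baseline over NAS-Bench-101.
import numpy as np
from nasbench import api

OPS = ['conv3x3-bn-relu', 'conv1x1-bn-relu', 'maxpool3x3']

def random_spec(rng, num_vertices=7):
    """Sample a random upper-triangular DAG with random interior ops."""
    matrix = np.triu(rng.integers(0, 2, (num_vertices, num_vertices)), k=1)
    ops = ['input'] + list(rng.choice(OPS, num_vertices - 2)) + ['output']
    return api.ModelSpec(matrix=matrix.astype(int).tolist(), ops=ops)

nasbench = api.NASBench('nasbench_only108.tfrecord')
rng = np.random.default_rng(0)

best_valid, best_spec = 0.0, None
for _ in range(500):
    spec = random_spec(rng)
    if not nasbench.is_valid(spec):  # rejects cells over the 9-edge budget, etc.
        continue
    valid_acc = nasbench.query(spec)['validation_accuracy']
    if valid_acc > best_valid:
        best_valid, best_spec = valid_acc, spec

# Report the test accuracy of the model chosen purely on validation data.
print('best validation accuracy:', best_valid)
print('its test accuracy:       ', nasbench.query(best_spec)['test_accuracy'])
```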
Future Developments
Future NAS benchmarks can be expected to cover larger and more diverse search spaces, potentially spanning tasks beyond image classification on CIFAR-10. Incorporating more advanced regularization and training techniques could also raise the performance ceiling of the tabulated models. The exploration of novel NAS algorithms, tuned through rapid feedback from NAS-Bench-101, remains an intriguing prospect, promising advances in both search efficiency and model accuracy.
In conclusion, NAS-Bench-101 serves as a critical step toward making neural architecture search more accessible, transparent, and reproducible, while providing an invaluable resource for benchmarking and the exploration of NAS algorithms in automated machine learning (AutoML).