A Study on Encodings for Neural Architecture Search (2007.04965v2)

Published 9 Jul 2020 in cs.LG, cs.NE, and stat.ML

Abstract: Neural architecture search (NAS) has been extensively studied in the past few years. A popular approach is to represent each neural architecture in the search space as a directed acyclic graph (DAG), and then search over all DAGs by encoding the adjacency matrix and list of operations as a set of hyperparameters. Recent work has demonstrated that even small changes to the way each architecture is encoded can have a significant effect on the performance of NAS algorithms. In this work, we present the first formal study on the effect of architecture encodings for NAS, including a theoretical grounding and an empirical study. First we formally define architecture encodings and give a theoretical characterization on the scalability of the encodings we study. Then we identify the main encoding-dependent subroutines which NAS algorithms employ, running experiments to show which encodings work best with each subroutine for many popular algorithms. The experiments act as an ablation study for prior work, disentangling the algorithmic and encoding-based contributions, as well as a guideline for future work. Our results demonstrate that NAS encodings are an important design decision which can have a significant impact on overall performance. Our code is available at https://github.com/naszilla/nas-encodings.

Authors (4)
  1. Colin White (34 papers)
  2. Willie Neiswanger (68 papers)
  3. Sam Nolen (2 papers)
  4. Yash Savani (10 papers)
Citations (69)

Summary

An Analysis of Encodings for Neural Architecture Search

The paper presents a comprehensive study of the role of encodings in Neural Architecture Search (NAS), an important facet of designing effective NAS algorithms. The authors offer a formal framework, both theoretical and empirical, for analyzing the impact of encodings of architectures, which are commonly represented as directed acyclic graphs (DAGs). The encodings examined capture the structure of these DAGs and consequently influence the performance of NAS algorithms.

In this work, the authors define architecture encodings as mappings from neural architectures to real-valued tensors. Encodings fall into two paradigms: adjacency matrix-based and path-based. Adjacency matrix encodings come in one-hot, categorical, and continuous variants, and typically consist of a flattened adjacency matrix coupled with a list of operations. Path-based encodings instead describe an architecture by its input-to-output paths through the DAG, and likewise come in one-hot, categorical, and continuous variants, each of which admits a truncated version.
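To make the two paradigms concrete, here is a minimal sketch (not the authors' code; the toy operation set and cell layout are assumptions) of a one-hot adjacency encoding and a one-hot path encoding for a small cell DAG:

```python
# Minimal sketch of the two encoding families for a toy cell DAG.
# Node 0 is the input, node n-1 the output; `ops` labels the
# intermediate nodes. The operation set below is an assumption.
import numpy as np
from itertools import product

OPS = ["conv3x3", "conv1x1", "maxpool"]

def adjacency_one_hot(adj, ops):
    """Flatten the upper-triangular adjacency matrix, then one-hot the ops."""
    n = adj.shape[0]
    edge_bits = [float(adj[i, j]) for i in range(n) for j in range(i + 1, n)]
    op_bits = [float(op == cand) for op in ops for cand in OPS]
    return np.array(edge_bits + op_bits)

def input_output_paths(adj, ops):
    """Enumerate the operation sequence along every input->output path."""
    n = adj.shape[0]
    paths = []
    def dfs(node, seq):
        if node == n - 1:
            paths.append(tuple(seq))
            return
        for nxt in range(node + 1, n):
            if adj[node, nxt]:
                dfs(nxt, seq + ([] if nxt == n - 1 else [ops[nxt - 1]]))
    dfs(0, [])
    return paths

def path_one_hot(adj, ops, max_len=5):
    """One bit per possible op sequence of length 0..max_len."""
    universe = [p for L in range(max_len + 1) for p in product(OPS, repeat=L)]
    present = set(input_output_paths(adj, ops))
    return np.array([float(p in present) for p in universe])
```

Note the structural difference: the adjacency encoding's length is fixed by the number of nodes, while the path encoding's length is fixed by the number of possible operation sequences, which grows geometrically with depth.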

The core investigation centers on the scalability and performance of these encodings within NAS algorithms. The authors argue that encodings must be evaluated at the subroutine level, identifying three major encoding-dependent subroutines: sampling random architectures, perturbing architectures, and training predictor models. Their results indicate that the choice of encoding meaningfully affects NAS outcomes, underscoring the importance of matching encodings to specific subroutines.
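A hedged sketch of how these three subroutines consume an encoding; the `search_space` object, `encode` function, and `Predictor` class here are hypothetical placeholders, not the paper's API:

```python
# Sketch of the three encoding-dependent subroutines; `search_space`,
# `encode`, and `Predictor` are hypothetical stand-ins.
def sample_random(search_space, encode):
    """Random sampling: draw an architecture and encode it."""
    arch = search_space.random_architecture()
    return arch, encode(arch)

def perturb(arch, search_space, encode):
    """Perturbation: move to a neighbor (flip an edge or swap an op)."""
    neighbor = search_space.random_neighbor(arch)
    return neighbor, encode(neighbor)

def fit_predictor(history, encode, Predictor):
    """Prediction: regress validation accuracy on architecture encodings."""
    X = [encode(arch) for arch, acc in history]
    y = [acc for arch, acc in history]
    model = Predictor()
    model.fit(X, y)
    return model
```

The point of the paper's ablation is that the best choice of `encode` differs across these three functions, so a single NAS algorithm may want a different encoding per subroutine.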

Key experimental findings are derived from the NAS-Bench-101 and NAS-Bench-201 benchmarks. Path-based encodings perform best with the predictor subroutine, whereas adjacency matrix-based encodings excel in the sampling and perturbation subroutines. The experiments also show that path-based encodings can be truncated with minimal information loss, an insight with significant implications for the efficiency of NAS algorithms.
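The truncation idea is simple to state in code. A minimal sketch, reusing `path_one_hot` from the earlier snippet: because that sketch enumerates candidate paths shortest-first, keeping only the first k entries discards the longest, and therefore rarest, paths.

```python
def truncated_path_one_hot(adj, ops, k):
    """Keep only the first k bits of the path encoding; the universe in
    path_one_hot is ordered shortest-first, so truncation drops the
    longest (least common) paths."""
    return path_one_hot(adj, ops)[:k]
```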

The theoretical analysis characterizes the scalability of the path encoding, identifying a phase transition at a specific truncation length: below that threshold, path encodings retain their effectiveness despite truncation. Adjacency matrix encodings, by contrast, are shown to lose information when truncated, highlighting a trade-off between encoding compactness and informativeness.
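As a worked example of why truncation matters, consider a NAS-Bench-101-style cell (up to 5 intermediate nodes, 3 candidate operations; these parameters are assumptions for illustration). The full path encoding has one bit per possible operation sequence, so its length grows geometrically with depth:

```python
# Length of the full path encoding: one bit per op sequence of length 0..n.
q, n = 3, 5                                    # 3 operations, up to 5 intermediate nodes
full_length = sum(q**i for i in range(n + 1))  # 3^0 + 3^1 + ... + 3^5
print(full_length)                             # 364; grows as O(q^n), hence truncation
```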

This combined theoretical and empirical framework allows the authors to draw clearer distinctions between algorithmic and encoding-based contributions in NAS research. The decoupling offers practical guidance for choosing encoding schemes suited to specific NAS subroutines.

Looking ahead, these insights open avenues for further research focusing on designing novel encoding paradigms, potentially leveraging the demonstrated benefits of encoding-specific enhancements. Moreover, the implications of these findings could extend to other domains where graph-structured data and architecture search are pertinent.

In conclusion, the paper clarifies the complexities involved in NAS encodings, advancing the research community's collective understanding and providing robust guidelines for developing and evaluating future NAS methodologies.
