
A Survey on Neural Architecture Search

Published 4 May 2019 in cs.LG, cs.CV, cs.NE, and stat.ML | (1905.01392v2)

Abstract: The growing interest in both the automation of machine learning and deep learning has inevitably led to the development of a wide variety of automated methods for neural architecture search. The choice of the network architecture has proven to be critical, and many advances in deep learning spring from its immediate improvements. However, deep learning techniques are computationally intensive and their application requires a high level of domain knowledge. Therefore, even partial automation of this process helps to make deep learning more accessible to both researchers and practitioners. With this survey, we provide a formalism which unifies and categorizes the landscape of existing methods along with a detailed analysis that compares and contrasts the different approaches. We achieve this via a comprehensive discussion of the commonly adopted architecture search spaces and architecture optimization algorithms based on principles of reinforcement learning and evolutionary algorithms along with approaches that incorporate surrogate and one-shot models. Additionally, we address the new research directions which include constrained and multi-objective architecture search as well as automated data augmentation, optimizer and activation function search.

Summary

  • The paper provides a comprehensive review of NAS methods, detailing the categorization of global and cell-based search spaces.
  • The paper evaluates optimization techniques such as reinforcement learning, evolutionary algorithms, surrogate models, and one-shot search.
  • The paper highlights how automating neural network design reduces computational costs while enhancing model performance for practical applications.

Overview of "A Survey on Neural Architecture Search"

The paper "A Survey on Neural Architecture Search" authored by Martin Wistuba, Ambrish Rawat, and Tejaswini Pedapati provides a comprehensive review of the research efforts in neural architecture search (NAS). This exploration serves as a critical examination of the automation in designing neural network structures, addressing computational efficiency and optimization challenges across various methodologies. The paper categorizes existing NAS methods, examines the commonly adopted architecture search spaces, and details optimization algorithms based on reinforcement learning and evolutionary models.

Key Insights and Methodologies

Architecture Search Spaces: The survey identifies the design of search spaces as pivotal to NAS, categorizing them primarily into global and cell-based search spaces. Initial efforts focused on chain-structured architectures but have since moved toward more sophisticated layouts, including branched architectures that have been shown to outperform simple sequential networks.

  • Global Search Space: Allows for comprehensive freedom in the arrangement of operations but can be computationally demanding. Techniques that use template-constrained spaces to simplify the search are highlighted.
  • Cell-Based Search Space: This involves repeating a specific structure (cell/block) across the network and has gained popularity because discovered cells transfer easily across datasets and tasks. The NASNet architecture is a significant example of this category, demonstrating the efficiency of the cell-based design.
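The cell-based idea described above can be made concrete with a small sketch. The operation names and node counts below are illustrative assumptions loosely modeled on NASNet-style spaces, not the exact encoding used by any particular method: each node in a cell picks two earlier nodes as inputs and an operation for each, and the network is the same searched cell repeated.

```python
import random

# Hypothetical operation set, loosely inspired by NASNet-style cells.
OPS = ["sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3", "identity"]

def sample_cell(num_nodes=4, seed=None):
    """Sample a cell: each intermediate node chooses two predecessor
    nodes as inputs and one operation to apply to each input."""
    rng = random.Random(seed)
    cell = []
    for node in range(2, num_nodes + 2):  # nodes 0 and 1 are the cell's inputs
        inputs = [rng.randrange(node) for _ in range(2)]  # any earlier node
        ops = [rng.choice(OPS) for _ in range(2)]
        cell.append((inputs, ops))
    return cell

def stack_cells(cell, num_repeats=3):
    """The network is the same cell repeated; only the cell is searched,
    which is what makes the result transferable to deeper stacks."""
    return [cell] * num_repeats

cell = sample_cell(seed=0)
network = stack_cells(cell, num_repeats=3)
```

Because only the cell is optimized, the same discovered structure can be stacked more times (or with wider channels) when moving from a small proxy dataset to a larger target task.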

Optimization Methods: The paper delineates four main approaches to NAS optimization: reinforcement learning, evolutionary algorithms, surrogate model-based, and one-shot architecture search.

  • Reinforcement Learning (RL): Primarily employed at the inception of NAS research, leveraging agents that learn policies to construct architectures via sequential decision-making. While methods using RL, like Q-learning and policy gradients, have been successful, their computational cost remains a drawback.
  • Evolutionary Algorithms (EA): These mimic natural selection processes for optimizing neural architectures. EA-based methods explore search spaces by using mutations and selections, albeit at the cost of significant computational resources.
  • Surrogate Models: Significantly reduce search duration by approximating the performance of architectures without exhaustive training. Surrogates are often used in conjunction with Bayesian optimization frameworks, providing a balance between exploration and exploitation.
  • One-Shot Architecture Search: Addresses computational inefficiencies by training an over-parameterized network that represents the entire search space, allowing for efficient architecture evaluation through weight sharing.
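Of the four families above, the evolutionary approach is the simplest to sketch. The toy loop below follows the general mutate-and-select pattern the survey describes (with an aging-style replacement of the oldest individual); the architecture encoding, population size, and `fitness` stand-in are all illustrative assumptions, since in real NAS the fitness would be validation accuracy after (at least partially) training each candidate network:

```python
import random

NUM_OPS = 4  # hypothetical number of candidate operations per position

def mutate(arch, rng):
    """Flip one gene: an architecture here is just a list of op indices."""
    child = list(arch)
    child[rng.randrange(len(child))] = rng.randrange(NUM_OPS)
    return child

def evolve(fitness, arch_len=6, pop_size=10, steps=50, seed=0):
    """Toy evolutionary loop: tournament-select a parent, mutate it,
    and replace the oldest individual (the population is a queue)."""
    rng = random.Random(seed)
    population = [[rng.randrange(NUM_OPS) for _ in range(arch_len)]
                  for _ in range(pop_size)]
    for _ in range(steps):
        sample = rng.sample(population, 3)   # tournament selection
        parent = max(sample, key=fitness)
        population.append(mutate(parent, rng))
        population.pop(0)                    # age out the oldest individual
    return max(population, key=fitness)

# Stand-in fitness for demonstration only; real NAS would train and
# evaluate the decoded network here.
best = evolve(fitness=sum)
```

The expensive step hidden behind `fitness` is exactly what surrogate models and one-shot weight sharing try to cheapen: a surrogate predicts the score without training, while a one-shot supernet lets every candidate inherit shared weights instead of training from scratch.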

Practical and Theoretical Implications

The implications of NAS are profound, extending beyond theoretical interest to practical applications. The methodologies surveyed provide a toolkit for automating and optimizing neural network design, which is crucial for deploying AI models efficiently across domains such as image recognition and language processing. NAS techniques aim to reduce computational costs while maintaining model performance, a requirement for real-world applications, especially in resource-constrained environments.

Future Directions

The landscape of NAS continues to evolve with promising avenues such as the incorporation of transfer learning, which aims to leverage pre-trained architectures for diverse tasks, and multi-objective optimization, considering constraints like memory and inference time. There is a growing need for integrated solutions that encompass the entire machine learning pipeline, from data preprocessing through to model deployment.
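Multi-objective search of the kind mentioned above is often framed in terms of Pareto optimality: an architecture is worth keeping if no other candidate is at least as good on every objective and strictly better on one. The sketch below illustrates this with hypothetical (accuracy, negated latency) scores; the numbers are made up for the example:

```python
def dominates(a, b):
    """a dominates b if a is no worse on every objective and strictly
    better on at least one (all objectives treated as maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Keep the candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

# Hypothetical (accuracy, -latency_ms) scores for four candidate networks:
# latency is negated so that both objectives are maximized.
models = [(0.92, -15.0), (0.94, -40.0), (0.90, -10.0), (0.91, -40.0)]
front = pareto_front(models)  # (0.91, -40.0) is dominated by (0.94, -40.0)
```

A constrained search can then be phrased as filtering this front (e.g. keeping only models under a memory or inference-time budget) rather than optimizing accuracy alone.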

Furthermore, addressing the gap between automated search efficiency and model interpretability is a crucial challenge. Future research is expected to advance toward architectures that not only perform well but also enhance our understanding of learning processes and neural computations.

In conclusion, this survey paper orients both new and seasoned researchers in the ongoing development of NAS, advocating for continued innovation to bridge existing gaps and extend the applicability of deep learning models across technological and industrial frontiers.
