
Sample-Efficient Neural Architecture Search by Learning Action Space (1906.06832v2)

Published 17 Jun 2019 in cs.LG, cs.CV, and stat.ML

Abstract: Neural Architecture Search (NAS) has emerged as a promising technique for automatic neural network design. However, existing MCTS based NAS approaches often utilize manually designed action space, which is not directly related to the performance metric to be optimized (e.g., accuracy), leading to sample-inefficient explorations of architectures. To improve the sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns actions to recursively partition the search space into good or bad regions that contain networks with similar performance metrics. During the search phase, as different action sequences lead to regions with different performance, the search efficiency can be significantly improved by biasing towards the good regions. On three NAS tasks, empirical results demonstrate that LaNAS is at least an order more sample efficient than baseline methods including evolutionary algorithms, Bayesian optimizations, and random search. When applied in practice, both one-shot and regular LaNAS consistently outperform existing results. Particularly, LaNAS achieves 99.0% accuracy on CIFAR-10 and 80.8% top1 accuracy at 600 MFLOPS on ImageNet in only 800 samples, significantly outperforming AmoebaNet with 33x fewer samples. Our code is publicly available at https://github.com/facebookresearch/LaMCTS.

Authors (5)
  1. Linnan Wang (18 papers)
  2. Saining Xie (60 papers)
  3. Teng Li (83 papers)
  4. Rodrigo Fonseca (23 papers)
  5. Yuandong Tian (128 papers)
Citations (44)

Summary

  • The paper introduces LaNAS, a method that learns latent actions to partition the search space and improve sample efficiency in NAS.
  • It achieves notable performance with 99.0% accuracy on CIFAR-10 and 80.8% top-1 accuracy on ImageNet using only 800 samples.
  • The approach integrates a learning phase with MCTS to dynamically balance exploration and exploitation in neural architecture search.

Sample-Efficient Neural Architecture Search by Learning Actions for Monte Carlo Tree Search

The paper introduces Latent Action Neural Architecture Search (LaNAS), an approach aimed at improving the sample efficiency of Neural Architecture Search (NAS) with Monte Carlo Tree Search (MCTS). Unlike previous methods that depend on manually crafted action spaces, LaNAS learns actions that recursively partition the search space into regions of differing performance. This addresses the inefficiency of conventional exploration strategies, whose action spaces are not directly tied to the performance metric being optimized.

Key Contributions

The core contribution of this work is a learning-based action space that improves search efficiency within MCTS. The method partitions the search space Ω into regions Ω_j that group neural networks with similar performance metrics. This partitioning is performed recursively, allowing the search to bias towards the most promising regions early in the process and thereby significantly improving sample efficiency.
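
To make the recursive good/bad split concrete, the sketch below shows one way such a partitioning could be implemented. The function name, the depth and size stopping rules, and the use of scikit-learn's LinearRegression are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of recursive search-space partitioning (not the authors' code).
# Each node fits a linear model on (architecture encoding, accuracy) pairs and
# splits its region into a "good" child (predicted accuracy above the node's mean
# prediction) and a "bad" child (below it).
import numpy as np
from sklearn.linear_model import LinearRegression

def partition(encodings, accuracies, depth=0, max_depth=4, min_size=8):
    """Recursively split a region of architectures into good/bad sub-regions."""
    node = {"model": None, "threshold": None, "children": None,
            "samples": (encodings, accuracies)}
    if depth >= max_depth or len(accuracies) < min_size:
        return node                      # leaf: region too small or tree deep enough
    reg = LinearRegression().fit(encodings, accuracies)
    preds = reg.predict(encodings)
    threshold = preds.mean()             # latent action: is predicted accuracy >= mean?
    good = preds >= threshold
    if good.all() or (~good).all():
        return node                      # degenerate split, keep as a leaf
    node["model"], node["threshold"] = reg, threshold
    node["children"] = (
        partition(encodings[good], accuracies[good], depth + 1, max_depth, min_size),
        partition(encodings[~good], accuracies[~good], depth + 1, max_depth, min_size),
    )
    return node

# Example usage with synthetic data: 200 random 16-dimensional encodings
# and noisy surrogate "accuracies".
X = np.random.rand(200, 16)
y = X @ np.random.rand(16) + 0.05 * np.random.randn(200)
tree = partition(X, y)
```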

Strong Numerical Results

The empirical results are notable. LaNAS achieves 99.0% accuracy on CIFAR-10 and 80.8% top-1 accuracy at 600 MFLOPS on ImageNet using only 800 samples, outperforming established methods such as AmoebaNet while using 33× fewer samples. These results underscore the improved sample efficiency and the ability to reach state-of-the-art accuracy with reduced computational resources.

Methodological Insights

LaNAS iterates between a learning phase and a search phase. In the learning phase, linear regressors define latent actions that separate the search space into high-performing and low-performing regions. Applied recursively, this produces a hierarchical tree structure in which the most promising regions lie along the leftmost paths. The search phase then uses MCTS to sample architectures adaptively, combining exploitation of known good regions with exploration of less-sampled ones.
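
A minimal sketch of how such an adaptive selection rule might look is given below. The UCB form, the exploration constant c, and the node fields (visits, value_sum, children) are assumptions chosen for illustration rather than details taken from the paper.

```python
# Hypothetical sketch of UCB-style region selection on the partition tree
# (illustrative only, not the authors' implementation).
import math

def ucb_score(value_sum, visits, parent_visits, c=0.5):
    """Mean performance of a region plus an exploration bonus for rarely visited ones."""
    if visits == 0:
        return float("inf")              # always try an unvisited region first
    exploitation = value_sum / visits
    exploration = c * math.sqrt(2 * math.log(parent_visits) / visits)
    return exploitation + exploration

def select_leaf(root):
    """Walk down the tree, following the child with the highest UCB score,
    until reaching a leaf region from which architectures are sampled."""
    node = root
    while node["children"] is not None:
        node = max(
            node["children"],
            key=lambda child: ucb_score(child["value_sum"], child["visits"],
                                        node["visits"]),
        )
    return node
```

High exploitation terms pull sampling toward regions whose architectures have evaluated well so far, while the exploration bonus keeps sparsely visited regions from being abandoned prematurely.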

Implications and Future Directions

The implications of LaNAS extend into both theoretical and practical domains:

  • Theoretical Implications: The approach suggests a paradigm shift in NAS by emphasizing the importance of learned action spaces. It shifts away from the reliance on pre-defined spaces, proposing a dynamic adjustment to suit specific performance metrics.
  • Practical Implications: LaNAS enhances the efficiency of NAS in real-world applications, making it feasible to conduct accurate architecture searches with significantly fewer computational resources. This opens pathways for more frequent updates to existing models and more rapid experimentation cycles.

Future developments in AI could see further integration of LaNAS within broader machine learning and NAS frameworks. Enhancing the adaptability of NAS through learned, application-specific action spaces could significantly bolster the flexibility and performance of AI models in various domains.

Conclusion

This paper presents a significant advancement in optimizing NAS through LaNAS, which learns latent actions to enhance the efficiency of MCTS. By creating an ordered search paradigm linked directly to performance metrics, LaNAS demonstrates improved sample efficiency and accuracy, suggesting expansive potential for AI research and application. The proposed methodology could serve as a foundation for future innovations, expanding the capabilities and scope of neural architecture optimization.