Single Path One-Shot Neural Architecture Search with Uniform Sampling (1904.00420v4)

Published 31 Mar 2019 in cs.CV

Abstract: We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze its advantages over existing NAS approaches. The existing one-shot method, however, is hard to train and not yet effective on large-scale datasets like ImageNet. This work proposes a Single Path One-Shot model to address the challenge in training. Our central idea is to construct a simplified supernet, where all architectures are single paths, so that the weight co-adaptation problem is alleviated. Training is performed by uniform path sampling. All architectures (and their weights) are trained fully and equally. Comprehensive experiments verify that our approach is flexible and effective. It is easy to train and fast to search. It effortlessly supports complex search spaces (e.g., building blocks, channels, mixed-precision quantization) and different search constraints (e.g., FLOPs, latency). It is thus convenient to use for various needs. It achieves state-of-the-art performance on the large dataset ImageNet.

Citations (893)

Summary

  • The paper introduces a Single Path One-Shot NAS approach that reduces training complexity and the weight-coupling problem found in traditional weight-sharing NAS methods.
  • Its uniform sampling strategy trains every architecture in the search space equally, improving efficiency and making supernet-based accuracy estimates more predictive on ImageNet.
  • An evolutionary algorithm efficiently navigates the large search space, yielding architectures that outperform state-of-the-art methods at lower search cost.

Single Path One-Shot Neural Architecture Search with Uniform Sampling

The paper "Single Path One-Shot Neural Architecture Search with Uniform Sampling" revisits and enhances the one-shot Neural Architecture Search (NAS) paradigm to alleviate training complexities and inefficiencies prevalent in existing NAS approaches. The proposed method leverages a simplified supernet architecture and novel training strategies to streamline the search and evaluation process of neural network architectures, particularly for large-scale datasets like ImageNet.

Key Contributions

The paper makes several noteworthy contributions to the domain of NAS:

  1. Principled Analysis of NAS Approaches: The authors provide a comprehensive overview of existing NAS methods and their limitations, particularly focusing on issues of weight coupling in supernets and the computational inefficiencies of nested architecture searches.
  2. Single Path Supernet: To address weight coupling, the authors introduce a supernet structure where each neural network architecture corresponds to a single path. This reduces the complexity and interdependencies among weights, making the training process more straightforward and efficient.
  3. Uniform Path Sampling: The paper proposes a uniform sampling strategy that ensures all architectures in the search space are trained fully and equally. This stochastic training approach replaces the complex hyperparameter tuning used in previous weight-sharing methods (a minimal training sketch follows this list).
  4. Novel Choice Blocks: The authors extend the flexibility and richness of the search space by introducing choice blocks for channel-number search and mixed-precision quantization. These blocks allow for more granular and efficient searches within constrained architectures (see the channel-search sketch after this list).
  5. Evolutionary Algorithm for Architecture Search: The search process employs an evolutionary algorithm instead of random search, which is shown to be much more efficient in exploring large search spaces and achieving state-of-the-art results on ImageNet.
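
To make the single-path idea concrete, here is a minimal PyTorch-style sketch of a supernet trained with uniform path sampling (contributions 2 and 3). The class names, candidate operations, and hyperparameters below are illustrative assumptions, not the authors' implementation; the paper's actual supernet is built from ShuffleNetV2-style choice blocks.

```python
import random

import torch
import torch.nn as nn


class ChoiceBlock(nn.Module):
    """Holds several candidate operations; only the sampled one runs per step."""

    def __init__(self, in_ch, out_ch, num_choices=4):
        super().__init__()
        # Simple conv candidates stand in for the ShuffleNetV2-style blocks in the paper.
        self.ops = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for k in (1, 3, 5, 7)[:num_choices]
        ])

    def forward(self, x, choice):
        # Single-path property: exactly one candidate operation is executed.
        return self.ops[choice](x)


class SinglePathSupernet(nn.Module):
    def __init__(self, num_blocks=8, channels=64, num_choices=4, num_classes=1000):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.blocks = nn.ModuleList(
            [ChoiceBlock(channels, channels, num_choices) for _ in range(num_blocks)]
        )
        self.head = nn.Linear(channels, num_classes)
        self.num_choices = num_choices

    def sample_path(self):
        # Uniform sampling: every architecture is equally likely to be trained.
        return [random.randrange(self.num_choices) for _ in self.blocks]

    def forward(self, x, path):
        x = self.stem(x)
        for block, choice in zip(self.blocks, path):
            x = block(x, choice)
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)


def train_supernet(model, train_loader, epochs=1, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:
            path = model.sample_path()  # one random single path per mini-batch
            opt.zero_grad()
            loss = loss_fn(model(images, path), labels)
            loss.backward()  # only the sampled path's operations receive gradients
            opt.step()
```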
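
For the channel-number search of contribution 4, the paper preallocates weights for the maximum number of channels and slices out the sampled width at each step. The block below is an illustrative sketch of that idea rather than the paper's code; the mixed-precision quantization search follows the same single-path pattern, sampling weight and activation bit-widths per block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelChoiceBlock(nn.Module):
    """Channel-number search: preallocate weights for the maximum width and
    slice them to the sampled number of output channels (illustrative sketch)."""

    def __init__(self, in_ch, max_out_ch, kernel_size=3):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(max_out_ch, in_ch, kernel_size, kernel_size) * 0.01
        )

    def forward(self, x, out_ch):
        w = self.weight[:out_ch]  # keep only the first `out_ch` filters this step
        return F.conv2d(x, w, padding=w.shape[-1] // 2)
```

During supernet training, `out_ch` would be sampled uniformly from the candidate channel counts, exactly as the operation choices are sampled above.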

Experimental Verification and Results

The experimental section provides a thorough evaluation of the proposed method using the ImageNet dataset. Key findings are summarized as follows:

  • Flexibility and Performance: The Single Path One-Shot (SPOS) approach supports complex search spaces, including building blocks, channel numbers, and mixed-precision quantization, while adapting to different constraints such as FLOPs and latency. This flexibility is crucial for real-world applications.
  • Comparison with State-of-the-Art Methods: The method is benchmarked against leading NAS methods like ProxylessNAS and FBNet, demonstrating superior accuracy and efficiency. Notably, when applied to the same search space, SPOS outperforms these methods, substantiating the efficacy of the single path and uniform sampling strategies.
  • Search Efficiency: The time and resource cost of SPOS is significantly lower than that of other methods, owing to its efficient training and evaluation pipeline. Memory consumption during supernet training stays low because only one path is executed per step, and the search itself is streamlined by the evolutionary algorithm (see the sketch after this list).
  • Correlation Analysis: Correlation analysis on NAS-Bench-201 shows that the validation accuracy of architectures evaluated with inherited supernet weights is predictive of their final performance when trained from scratch, validating the stochastic training and sampling strategy.
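
As a rough illustration of the search phase, the sketch below scores candidate paths by their validation accuracy with weights inherited from the trained supernet, then evolves the top performers by mutation and crossover. Function names, hyperparameters, and the constraint placeholder are assumptions; in the paper, candidates violating the FLOPs or latency budget are discarded before evaluation, and batch-norm statistics are recalibrated for each candidate.

```python
import random

import torch


def evaluate_path(model, path, val_loader):
    """Validation accuracy of one architecture (path) using inherited supernet weights."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            preds = model(images, path).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / max(total, 1)


def satisfies_constraint(path):
    # Placeholder: in the paper, candidates exceeding a FLOPs or latency
    # budget are rejected here before any evaluation.
    return True


def evolutionary_search(model, val_loader, num_blocks, num_choices,
                        population_size=50, generations=20,
                        num_parents=10, mutate_prob=0.1):
    def random_path():
        return [random.randrange(num_choices) for _ in range(num_blocks)]

    def mutate(path):
        return [random.randrange(num_choices) if random.random() < mutate_prob else c
                for c in path]

    def crossover(a, b):
        return [random.choice(pair) for pair in zip(a, b)]

    # Initial population of constraint-satisfying random paths.
    population = []
    while len(population) < population_size:
        candidate = random_path()
        if satisfies_constraint(candidate):
            population.append(candidate)

    best_path, best_acc = None, -1.0
    for _ in range(generations):
        scored = sorted(
            ((evaluate_path(model, p, val_loader), p) for p in population),
            key=lambda t: t[0],
            reverse=True,
        )
        if scored[0][0] > best_acc:
            best_acc, best_path = scored[0]
        parents = [p for _, p in scored[:num_parents]]
        # Refill the population by mutating and crossing over the top candidates.
        population = []
        while len(population) < population_size:
            if random.random() < 0.5:
                child = mutate(random.choice(parents))
            else:
                child = crossover(random.choice(parents), random.choice(parents))
            if satisfies_constraint(child):
                population.append(child)
    return best_path, best_acc
```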

Practical and Theoretical Implications

The implications of this research are multifaceted:

  • Practically, the proposed method enables faster, more flexible, and resource-efficient NAS, making it feasible to deploy state-of-the-art neural architectures in real-time applications with varying computational constraints.
  • Theoretically, the approach shifts the paradigm from deeply coupled, hyperparameter-heavy methods to a more streamlined, stochastic training process that preserves the integrity of the weight-sharing strategy without incurring excessive computational costs.

Future Developments

Future work may expand on the principles introduced in this paper to explore further enhancements in supernet architectures, sampling strategies, and search algorithms. Additionally, validating the method across an even broader range of datasets and applications will help solidify its generalizability and robustness.

In summary, this paper presents a significant advancement in the field of NAS by introducing an innovative single path one-shot approach with uniform sampling, offering a more effective and efficient solution to neural architecture optimization.
