
PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search (1907.05737v4)

Published 12 Jul 2019 in cs.CV and cs.LG

Abstract: Differentiable architecture search (DARTS) provided a fast solution for finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture. In this paper, we present a novel approach, namely, Partially-Connected DARTS, by sampling a small part of the super-network to reduce the redundancy in exploring the network space, thereby performing a more efficient search without compromising the performance. In particular, we perform operation search in a subset of channels while bypassing the held-out part in a shortcut. This strategy may suffer from an undesired inconsistency in selecting the edges of the super-network caused by sampling different channels. We alleviate it using edge normalization, which adds a new set of edge-level parameters to reduce uncertainty in search. Thanks to the reduced memory cost, PC-DARTS can be trained with a larger batch size and, consequently, enjoys both faster speed and higher training stability. Experimental results demonstrate the effectiveness of the proposed method. Specifically, we achieve an error rate of 2.57% on CIFAR10 with merely 0.1 GPU-days for architecture search, and a state-of-the-art top-1 error rate of 24.2% on ImageNet (under the mobile setting) using 3.8 GPU-days for search. Our code has been made available at: https://github.com/yuhuixu1993/PC-DARTS.

Citations (569)

Summary

  • The paper introduces a novel partial channel sampling method that reduces memory use and computational cost for architecture search.
  • It employs edge normalization to stabilize edge selection under stochastic channel sampling, while the reduced memory footprint permits larger batches and faster training.
  • Experimental results show 2.57% error on CIFAR10 in 0.1 GPU-days and 24.2% top-1 error on ImageNet in 3.8 GPU-days.

Overview of PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search

PC-DARTS is a memory-efficient variant of differentiable architecture search (DARTS) that addresses the large memory and computation overheads of jointly training a full super-network while searching for an architecture. The primary motivation is to make architecture search substantially cheaper while maintaining or improving the quality of the discovered networks.

Key Contributions

The paper introduces Partially-Connected DARTS (PC-DARTS), a novel methodology that reduces the redundancy in network search without sacrificing performance. The salient contributions are:

  1. Partial Channel Sampling: Instead of feeding all channels of an edge's input through the candidate operations, a random 1/K subset of channels is sampled at each step, while the remaining channels bypass the operations through a shortcut. This reduces the memory and computational burden, allowing for larger batch sizes and hence faster, more stable training (see the first sketch after this list).
  2. Edge Normalization: Because different channel subsets are sampled at different steps, the operation-level architecture weights become noisy, which makes edge selection in the super-network inconsistent. Edge normalization counteracts this by introducing an additional set of edge-level parameters that weight each input edge of a node, reducing uncertainty when the final architecture is derived (see the second sketch below).
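
The partial channel connection can be illustrated with a short PyTorch sketch. This is a minimal, simplified rendition rather than the authors' implementation: the class name PartialChannelMixedOp, the default K = 4 split, and the assumption that every candidate operation has stride 1 and maps C/K channels to C/K channels are all illustrative choices. The channel shuffle at the end mirrors the trick of reordering channels so that different subsets are sampled across iterations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def channel_shuffle(x, groups):
    # Reorder channels so a different 1/groups subset is sampled at each step.
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

class PartialChannelMixedOp(nn.Module):
    """Softmax-weighted mix of candidate ops applied to 1/k of the input
    channels; the remaining channels bypass the ops via a shortcut."""
    def __init__(self, ops, channels, k=4):
        super().__init__()
        self.k = k
        self.dim = channels // k       # channels routed through the candidate ops
        self.ops = nn.ModuleList(ops)  # each op assumed to map C/k -> C/k, stride 1

    def forward(self, x, alpha):
        # alpha holds this edge's architecture parameters, one per candidate op.
        x_op, x_skip = x[:, :self.dim], x[:, self.dim:]
        w = F.softmax(alpha, dim=-1)
        mixed = sum(wi * op(x_op) for wi, op in zip(w, self.ops))
        out = torch.cat([mixed, x_skip], dim=1)
        return channel_shuffle(out, self.k)
```

Since only C/K channels pass through the operation mixture, activation memory on each edge drops by roughly a factor of K, which is what allows the larger batch sizes reported in the paper.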
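
Edge normalization can be sketched in the same style, continuing the imports above. EdgeNormalizedNode and its parameter initialization are again hypothetical, but the sketch shows the essential point: the edge-level weights beta are softmax-normalized independently of the operation-level weights alpha and, since beta does not depend on which channels happened to be sampled, it fluctuates far less across iterations.

```python
class EdgeNormalizedNode(nn.Module):
    """One intermediate node of a cell: sums its input edges, each a
    PartialChannelMixedOp, weighted by softmax-normalized edge weights."""
    def __init__(self, edges, num_ops):
        super().__init__()
        self.edges = nn.ModuleList(edges)
        # Operation-level weights (alpha) per edge and edge-level weights (beta).
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(edges), num_ops))
        self.beta = nn.Parameter(1e-3 * torch.randn(len(edges)))

    def forward(self, inputs):
        # inputs: one feature map per incoming edge of this node.
        edge_w = F.softmax(self.beta, dim=-1)
        return sum(w * edge(x, a)
                   for w, edge, x, a in zip(edge_w, self.edges, inputs, self.alpha))
```

At the end of search, the connectivity of the final cell is decided by the product of edge-level and operation-level weights rather than by alpha alone, which stabilizes edge selection against the channel-sampling noise.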

Numerical Results

PC-DARTS demonstrates its efficacy with strong experimental results:

  • An error rate of 2.57% on CIFAR10 with a search cost of only 0.1 GPU-days.
  • A state-of-the-art top-1 error rate of 24.2% on ImageNet under the mobile setting, with a search cost of 3.8 GPU-days.

These results highlight a marked improvement over prior DARTS methods, particularly in reducing search time.

Implications and Future Directions

The implications of PC-DARTS are substantial:

  • Practical Implications: The enhanced efficiency in architecture search makes PC-DARTS attractive for real-world applications, where computational resources are often a limiting factor.
  • Theoretical Implications: This work challenges the assumed dependency on full channel evaluations in DARTS. By demonstrating that evaluating only a sampled subset of channels can still yield robust architectures, PC-DARTS invites further exploration of stochastic methods in network architecture search.
  • Future Developments: As a pioneering strategy in channel sampling within NAS, PC-DARTS opens avenues for integrating similar strategies into other search algorithms. Future research might delve into optimal sampling strategies and how these affect the broader NAS performance landscape.

Conclusion

PC-DARTS offers a significant step forward in neural architecture search by directly tackling its computation and memory inefficiencies. Through partial channel connections and edge normalization, it establishes a new standard for efficiency and stability in NAS, paving the way for further exploration of stochastic, reduced-complexity approaches to neural architecture design.