Automated Search-Space Generation Neural Architecture Search

Published 25 May 2023 in cs.LG, cs.AI, and cs.CV (arXiv:2305.18030v3)

Abstract: To search for an optimal sub-network within a general deep neural network (DNN), existing neural architecture search (NAS) methods typically rely on handcrafting a search space beforehand. Such requirements make it challenging to extend them to general scenarios without significant human expertise and manual intervention. To overcome these limitations, we propose Automated Search-Space Generation Neural Architecture Search (ASGNAS), perhaps the first automated system to train general DNNs that cover all candidate connections and operations and produce high-performing sub-networks in a one-shot manner. Technologically, ASGNAS delivers three noticeable contributions to minimize human effort: (i) automated search space generation for general DNNs; (ii) a Hierarchical Half-Space Projected Gradient (H2SPG) that leverages the hierarchy and dependency within the generated search space to ensure network validity during optimization, and reliably produces a solution with both high performance and hierarchical group sparsity; and (iii) automated sub-network construction upon the H2SPG solution. Numerically, we demonstrate the effectiveness of ASGNAS on a variety of general DNNs, including RegNet, StackedUnets, SuperResNet, and DARTS, over benchmark datasets such as CIFAR10, Fashion-MNIST, ImageNet, STL-10, and SVHN. The sub-networks computed by ASGNAS achieve competitive or even superior performance compared to the starting full DNNs and other state-of-the-art methods. The library will be released at https://github.com/tianyic/only_train_once.


Summary

  • The paper introduces ASGNAS, a framework that automates search-space generation in NAS and constructs sub-networks in a one-shot manner.
  • It employs a graph-based algorithm together with the H2SPG optimizer to automatically identify and remove redundant network structures while keeping the remaining network valid.
  • Results show ASGNAS delivers compact sub-networks with competitive performance across diverse architectures and datasets.

The paper "Automated Search-Space Generation Neural Architecture Search" presents HSPG, a novel framework that pioneers the automation of search-space generation and optimization in Neural Architecture Search (NAS). Traditional NAS frameworks necessitate extensive human expertise to define and explore different neural architectures, which limits their scalability and adaptability across diverse tasks. HSPG, however, reduces manual intervention and enables end-to-end automatic sub-network generation within a given deep neural network (DNN).

The core contributions of this paper are threefold:

  1. Automated Search Space Generation: The authors develop a graph-based algorithm that automatically constructs a search space from any general DNN. The algorithm identifies structures whose removal does not disrupt the functionality of the remaining architecture and represents them as a segment graph, so that ASGNAS can efficiently delineate which operations and connections are amenable to removal or modification (a hedged sketch of this idea appears after this list).
  2. Hierarchical Half-Space Projected Gradient (H2SPG) Algorithm: H2SPG is introduced as an optimizer designed for hierarchical structured-sparsity problems in DNNs. It leverages the hierarchical dependencies within the generated search space to ensure that the resulting sub-network remains valid and performant, zeroing out redundant components while preserving critical ones and thus balancing performance against model compactness (see the H2SPG sketch below).
  3. Automated Sub-Network Construction: Building upon the solution obtained from H2SPG, ASGNAS automatically constructs a more compact sub-network. This step removes the structures whose parameter groups were driven to zero and reconfigures dependent modules so that the simplified network operates seamlessly (see the construction sketch below).
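
The following is a minimal, illustrative sketch of the search-space generation idea: given a DNN expressed as a directed acyclic graph of operations, an operation is a removal candidate if deleting it still leaves a path from every input to every output. The graph library, function names, and the node-level granularity are assumptions for illustration and do not reflect the released library's actual API.

```python
import networkx as nx

def find_removable_operations(dag: nx.DiGraph, inputs, outputs):
    """Hypothetical sketch: split a DNN graph into removable candidates and a skeleton.

    An operation is treated as removable if deleting it still leaves a directed
    path from every input node to every output node, i.e. an alternative branch
    keeps the network functional. Removable operations form the generated search
    space; the rest constitute the skeleton that must be kept.
    """
    removable, skeleton = [], []
    for node in dag.nodes:
        if node in inputs or node in outputs:
            skeleton.append(node)
            continue
        pruned = dag.copy()
        pruned.remove_node(node)
        still_valid = all(nx.has_path(pruned, s, t) for s in inputs for t in outputs)
        (removable if still_valid else skeleton).append(node)
    return removable, skeleton
```

In ASGNAS the removable structures are associated with groups of trainable variables, which the H2SPG optimizer can then drive to zero without invalidating the network.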
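
To make the H2SPG contribution concrete, here is a minimal sketch of a single half-space projected gradient style update on flat per-group parameter arrays. The saliency-based partitioning of groups, the exact projection threshold, and the `hierarchy_ok` callback guarding graph validity are simplifications introduced for illustration; the real optimizer is specified in the paper and implemented in the only_train_once library.

```python
import numpy as np

def h2spg_step(params, grads, candidates, hierarchy_ok, lr=0.1, lam=1e-3, eps=0.0):
    """Hypothetical half-space projected gradient step with a hierarchy check.

    params/grads: dict mapping group name -> flat weight / gradient array.
    candidates:   names of redundant candidate groups targeted for sparsity.
    hierarchy_ok: callable returning True if zeroing this group keeps the
                  network graph valid given the groups already zeroed.
    """
    new_params = {}
    for name, x in params.items():
        norm = np.linalg.norm(x) + 1e-12
        # Gradient step with a group-lasso style shrinkage term.
        trial = x - lr * (grads[name] + lam * x / norm)
        if name in candidates and hierarchy_ok(name):
            # Half-space projection: if the trial point leaves the half-space
            # {y : <y, x> >= eps * ||x||^2}, project the whole group to zero.
            if np.dot(trial, x) < eps * norm ** 2:
                trial = np.zeros_like(x)
        new_params[name] = trial
    return new_params
```

Groups that survive keep training as usual, while groups projected to zero mark the structures that the construction step removes.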
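
Finally, a brief sketch of the construction step under the same graph representation: operations whose parameter groups were entirely zeroed by H2SPG are deleted from the graph, and because only groups whose removal preserves validity are zeroed, the remaining graph still connects inputs to outputs. Reconciling the input/output dimensions of dependent modules is omitted here, and the helper names are again illustrative.

```python
def build_subnetwork(dag, zero_groups, group_of):
    """Hypothetical sketch: erase structures whose groups were driven to zero."""
    pruned = dag.copy()
    pruned.remove_nodes_from(
        node for node in list(dag.nodes) if group_of(node) in zero_groups
    )
    return pruned
```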

The numerical results presented in the paper illustrate the efficacy of ASGNAS across a variety of neural architectures, including RegNet, StackedUnets, SuperResNet, and DARTS, evaluated on benchmark datasets such as CIFAR10, Fashion-MNIST, ImageNet, STL-10, and SVHN. The sub-networks generated automatically by ASGNAS exhibit competitive, if not superior, performance relative to their larger parent networks and other state-of-the-art models. For instance, in the StackedUnets experiments, ASGNAS reduces the number of network parameters while slightly improving top-1 accuracy over the original architecture.

Implications and Future Directions

This work represents a significant step forward in the NAS domain, showing that search-space generation and optimization can be automated effectively. By minimizing the need for human intervention, ASGNAS opens up possibilities for deploying neural architecture search in applications where rapid adaptation to new or varied tasks is essential. Cybersecurity, autonomous driving, and real-time data processing are potential areas that could benefit from such automated NAS systems.

On the theoretical side, building hierarchy and dependency constraints into the optimizer provides a robust framework that could influence future algorithmic research in structured sparse optimization and neural network interpretability.

As for future developments, the scope for improving ASGNAS lies in further enhancing its computational efficiency and extending its applicability to broader classes of neural networks, including those with non-trainable operations. Moreover, since the current framework relies on a one-shot search, integrating multi-level optimization features, similar to existing approaches but automated, could extend its functionality. Addressing these areas would not only enhance the practical utility of ASGNAS but could also contribute to more general advances in neural architecture search.

In conclusion, the paper presents a thoughtful and significant advancement in the field of NAS, providing tools and paradigms that could reshape current practices in search space exploration and automated architecture design.
