
Neural Architecture Search: Two Constant Shared Weights Initialisations

Published 9 Feb 2023 in cs.LG and cs.AI (arXiv:2302.04406v3)

Abstract: In the last decade, zero-cost metrics have gained prominence in neural architecture search (NAS) due to their ability to evaluate architectures without training. These metrics are significantly faster and less computationally expensive than traditional NAS methods and provide insights into neural architectures' internal workings. This paper introduces epsinas, a novel zero-cost NAS metric that assesses architecture potential using two constant shared weight initialisations and the statistics of their outputs. We show that the dispersion of raw outputs, normalised by their average magnitude, strongly correlates with trained accuracy. This effect holds across image classification and language tasks on NAS-Bench-101, NAS-Bench-201, and NAS-Bench-NLP. Our method requires no data labels, operates on a single minibatch, and eliminates the need for gradient computation, making it independent of training hyperparameters, loss metrics, and human annotations. It evaluates a network in a fraction of a GPU second and integrates seamlessly into existing NAS frameworks. The code supporting this study can be found on GitHub at https://github.com/egracheva/epsinas.
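The idea in the abstract can be illustrated with a small sketch. This is not the code from the epsinas repository: the toy tanh network, the two constants `w1` and `w2`, the min-max rescaling step, and the exact form of the ratio are all assumptions made for illustration. It only shows the general recipe the abstract describes: set every weight to one shared constant, run a single unlabeled minibatch forward, repeat with a second constant, and compare the two sets of raw outputs relative to their average magnitude. No gradients or labels are involved.

```python
import numpy as np


def forward(x, layer_shapes, w):
    """Toy MLP (hypothetical stand-in for a NAS candidate).

    Every weight is set to the same constant w; there are no biases.
    Hidden layers use tanh, the last layer is linear.
    """
    h = x
    for i, (n_in, n_out) in enumerate(layer_shapes):
        h = h @ np.full((n_in, n_out), w)
        if i < len(layer_shapes) - 1:
            h = np.tanh(h)
    return h


def minmax(a):
    """Rescale an array to [0, 1]; return zeros if it is constant."""
    lo, hi = a.min(), a.max()
    return (a - lo) / (hi - lo) if hi > lo else np.zeros_like(a)


def dispersion_score(x, layer_shapes, w1=1e-3, w2=1.0):
    """Assumed score: mean absolute difference between the rescaled outputs
    of the two constant initialisations, divided by their mean magnitude."""
    out1 = minmax(forward(x, layer_shapes, w1))
    out2 = minmax(forward(x, layer_shapes, w2))
    spread = np.abs(out1 - out2).mean()
    magnitude = 0.5 * (np.abs(out1).mean() + np.abs(out2).mean())
    return spread / magnitude if magnitude > 0 else float("nan")


# One unlabeled minibatch: 8 samples with 16 features each.
rng = np.random.default_rng(0)
batch = rng.standard_normal((8, 16))
shapes = [(16, 32), (32, 10)]
score = dispersion_score(batch, shapes)
```

In a search loop, a score like this would be computed once per candidate architecture and used to rank candidates before any training. Whether a higher or lower value is preferable, and which constants work well, are questions the paper addresses and this sketch does not.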

Authors (1)
