
Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach (2403.11380v1)

Published 18 Mar 2024 in cs.CV

Abstract: The supernet is a core component in many recent Neural Architecture Search (NAS) methods. It not only helps embody the search space but also provides a (relative) estimate of the final performance of candidate architectures. It is therefore critical that the top architectures ranked by a supernet be consistent with those ranked by true performance, a property known as the order-preserving ability. In this work, we analyze the order-preserving ability on the whole search space (global) and on a sub-space of top architectures (local), and empirically show that the local order-preserving ability of current two-stage NAS methods still needs to be improved. To rectify this, we propose a novel concept, Supernet Shifting, a refined search strategy that combines architecture search with supernet fine-tuning. Specifically, in addition to evaluating candidates, the search also accumulates their training loss, and the supernet is updated at every iteration. Since superior architectures are sampled more frequently during evolutionary search, the supernet is encouraged to focus on top architectures, thereby improving local order-preserving ability. In addition, a pre-trained supernet is often not reusable in one-shot methods. We show that Supernet Shifting enables transferring the supernet to a new dataset: the last classifier layer is reset and trained through evolutionary search. Comprehensive experiments show that our method has better order-preserving ability and can find a dominating architecture. Moreover, the pre-trained supernet can be easily transferred to a new dataset with no loss of performance.
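The abstract describes Supernet Shifting operationally: during evolutionary search, each sampled candidate is not only evaluated but also used for a training step, so the supernet weights are updated every iteration and drift toward the architectures that selection samples most often. The sketch below illustrates one way such a loop could look; the supernet interface (forward(x, arch)), the helper callables sample_arch and mutate, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of Supernet Shifting inside an evolutionary search loop,
# assuming a PyTorch supernet that exposes forward(x, arch) for a sampled
# sub-network. Population handling, hyperparameters, and helper names are
# illustrative assumptions, not the authors' released code.
import random
import torch
import torch.nn.functional as F


def evaluate(supernet, arch, val_loader, device="cpu"):
    """Validation accuracy of one candidate under the shared (shifted) weights."""
    supernet.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in val_loader:
            x, y = x.to(device), y.to(device)
            pred = supernet(x, arch).argmax(dim=1)
            correct += (pred == y).sum().item()
            total += y.numel()
    return correct / max(total, 1)


def supernet_shifting_search(supernet, train_loader, val_loader, sample_arch,
                             mutate, population_size=50, generations=20,
                             lr=1e-3, device="cpu"):
    """Evolutionary search that also fine-tunes ("shifts") the supernet.

    Each sampled candidate contributes one training step before it is scored,
    so architectures that survive selection (and are therefore sampled more
    often) dominate the fine-tuning signal, which is the mechanism the
    abstract credits for improved local order-preserving ability.
    """
    optimizer = torch.optim.SGD(supernet.parameters(), lr=lr, momentum=0.9)
    train_iter = iter(train_loader)
    population = [sample_arch() for _ in range(population_size)]

    for _ in range(generations):
        scored = []
        for arch in population:
            # 1) Shift: one gradient step through the sampled sub-network.
            try:
                x, y = next(train_iter)
            except StopIteration:
                train_iter = iter(train_loader)
                x, y = next(train_iter)
            supernet.train()
            optimizer.zero_grad()
            loss = F.cross_entropy(supernet(x.to(device), arch), y.to(device))
            loss.backward()
            optimizer.step()

            # 2) Evaluate the candidate with the freshly shifted shared weights.
            scored.append((evaluate(supernet, arch, val_loader, device), arch))

        # 3) Standard evolutionary step: keep the top half, mutate to refill.
        scored.sort(key=lambda item: item[0], reverse=True)
        parents = [arch for _, arch in scored[: population_size // 2]]
        population = parents + [
            mutate(random.choice(parents))
            for _ in range(population_size - len(parents))
        ]

    # Best architecture from the final generation.
    return max(scored, key=lambda item: item[0])[1]
```

For the dataset-transfer setting the abstract mentions, a natural reading is to reinitialize the supernet's final classifier for the new label set (for example, replacing it with a fresh torch.nn.Linear) before running this loop, so that the new head is trained entirely through the shifting updates; the exact reset point and all hyperparameters above are assumptions rather than reported settings.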

