
Robustifying and Boosting Training-Free Neural Architecture Search (2403.07591v1)

Published 12 Mar 2024 in cs.LG

Abstract: Neural architecture search (NAS) has become a key component of AutoML and a standard tool for automating the design of deep neural networks. Recently, training-free NAS has emerged as a paradigm that reduces the search cost of standard training-based NAS by estimating true architecture performance with training-free metrics alone. However, the estimation ability of these metrics typically varies across tasks, making it challenging to achieve robust and consistently good search performance on diverse tasks with only a single training-free metric. Meanwhile, the estimation gap between training-free metrics and true architecture performance prevents training-free NAS from achieving superior performance. To address these challenges, we propose the robustifying and boosting training-free NAS (RoBoT) algorithm, which (a) uses Bayesian optimization to explore an optimized combination of existing training-free metrics, yielding a robust metric that performs consistently better across diverse tasks, and (b) applies greedy search, i.e., exploitation, to this newly developed metric to bridge the aforementioned gap and thereby further boost the search performance of standard training-free NAS. Remarkably, the expected performance of RoBoT can be theoretically guaranteed to improve over existing training-free NAS under mild conditions, with additional interesting insights. Our extensive experiments on various NAS benchmark tasks provide substantial empirical evidence supporting our theoretical results.
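To make the two-stage idea concrete, here is a minimal, hypothetical sketch of how a weighted combination of training-free metrics could be tuned and then greedily exploited. The metric scores, accuracies, and the random-search stand-in for Bayesian optimization are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the RoBoT idea (illustrative, not the paper's code):
# 1) score every candidate architecture with several training-free metrics,
# 2) search for a weighted combination of those metrics that ranks well
#    (the paper uses Bayesian optimization; plain random search stands in here),
# 3) greedily evaluate the top-ranked architectures under the remaining budget.

import numpy as np

rng = np.random.default_rng(0)

n_archs, n_metrics = 200, 3
# Hypothetical precomputed training-free scores, one column per metric
# (e.g. grad_norm, snip, synflow); random placeholders for illustration.
metric_scores = rng.normal(size=(n_archs, n_metrics))
# Hypothetical "true" validation accuracies, unknown to the search in practice.
true_acc = rng.uniform(0.6, 0.95, size=n_archs)

def top_k_best_acc(weights, k):
    """Rank architectures by the weighted metric and return the best true
    accuracy among the top-k (in a real run, this consumes search budget)."""
    combined = metric_scores @ weights
    top_k = np.argsort(-combined)[:k]
    return true_acc[top_k].max()

# Step 2: optimize the metric weights (random search as a BO stand-in).
budget_for_weights, budget_for_greedy = 20, 10
best_w, best_obj = None, -np.inf
for _ in range(budget_for_weights):
    w = rng.uniform(-1.0, 1.0, size=n_metrics)
    obj = top_k_best_acc(w, k=budget_for_greedy)
    if obj > best_obj:
        best_w, best_obj = w, obj

# Step 3: greedy exploitation of the optimized combined metric.
top_candidates = np.argsort(-(metric_scores @ best_w))[:budget_for_greedy]
print("selected architecture:", top_candidates[true_acc[top_candidates].argmax()])
```

A real implementation would replace the random-search loop with a Bayesian optimizer over the weight space and would charge every architecture evaluation against a single shared search budget.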

Authors (4)
  1. Zhenfeng He (4 papers)
  2. Yao Shu (29 papers)
  3. Zhongxiang Dai (39 papers)
  4. Bryan Kian Hsiang Low (77 papers)
Citations (1)
