
SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

Published 7 Mar 2024 in cs.LG, cs.CV, and cs.NE | (2403.04161v5)

Abstract: Training-free metrics (a.k.a. zero-cost proxies) are widely used to avoid resource-intensive neural network training, especially in Neural Architecture Search (NAS). Recent studies show that existing training-free metrics have several limitations, such as limited correlation and poor generalisation across different search spaces and tasks. Hence, we propose Sample-Wise Activation Patterns and its derivative, SWAP-Score, a novel high-performance training-free metric. It measures the expressivity of networks over a batch of input samples. The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks, outperforming 15 existing training-free metrics on NAS-Bench-101/201/301 and TransNAS-Bench-101. The SWAP-Score can be further enhanced by regularisation, which leads to even higher correlations in cell-based search space and enables model size control during the search. For example, Spearman's rank correlation coefficient between regularised SWAP-Score and CIFAR-100 validation accuracies on NAS-Bench-201 networks is 0.90, significantly higher than 0.80 from the second-best metric, NWOT. When integrated with an evolutionary algorithm for NAS, our SWAP-NAS achieves competitive performance on CIFAR-10 and ImageNet in approximately 6 minutes and 9 minutes of GPU time respectively.
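The abstract reports metric quality as Spearman's rank correlation between a training-free score and validation accuracy. As a minimal sketch of how such a correlation could be computed, the Python below ranks both lists (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the five score/accuracy pairs are hypothetical illustrations, not values or code from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up proxy scores and accuracies for five hypothetical architectures.
proxy_scores = [3.1, 1.2, 4.8, 2.0, 5.5]
accuracies = [73.1, 61.0, 69.5, 65.3, 70.2]

if __name__ == "__main__":
    print(spearman_rho(proxy_scores, accuracies))  # prints 0.7
```

A value near 1 means the score orders architectures almost exactly as their trained accuracies do, which is the property a training-free metric needs when it replaces training inside a search loop.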
