GENNAPE: Towards Generalized Neural Architecture Performance Estimators (2211.17226v2)

Published 30 Nov 2022 in cs.LG and cs.CV

Abstract: Predicting neural architecture performance is a challenging task and is crucial to neural architecture design and search. Existing approaches either rely on neural performance predictors, which are limited to modeling architectures in a predefined design space involving specific sets of operators and connection rules and cannot generalize to unseen architectures, or resort to zero-cost proxies, which are not always accurate. In this paper, we propose GENNAPE, a Generalized Neural Architecture Performance Estimator, which is pretrained on open neural architecture benchmarks and aims to generalize to completely unseen architectures through combined innovations in network representation, contrastive pretraining, and a fuzzy clustering-based predictor ensemble. Specifically, GENNAPE represents a given neural network as a Computation Graph (CG) of atomic operations, which can model an arbitrary architecture. It first learns a graph encoder via Contrastive Learning to encourage network separation by topological features, and then trains multiple predictor heads, which are soft-aggregated according to the fuzzy membership of a neural network. Experiments show that GENNAPE pretrained on NAS-Bench-101 achieves superior transferability to 5 different public neural network benchmarks, including NAS-Bench-201, NAS-Bench-301, and the MobileNet and ResNet families, with no or minimal fine-tuning. We further introduce 3 challenging newly labelled neural network benchmarks, HiAML, Inception and Two-Path, whose accuracies concentrate in narrow ranges. Extensive experiments show that GENNAPE can correctly discern high-performance architectures in these families. Finally, when paired with a search algorithm, GENNAPE can find architectures that improve accuracy while reducing FLOPs on three families.
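The fuzzy clustering-based ensemble described in the abstract can be pictured concretely: each architecture embedding receives soft memberships over K clusters (e.g., via fuzzy c-means), and the final accuracy estimate is the membership-weighted sum of per-cluster predictor heads. Below is a minimal NumPy sketch of that aggregation step; the function names, toy linear heads, and dimensions are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def fuzzy_memberships(z, centroids, m=2.0):
    """Fuzzy c-means memberships of an embedding z (shape (D,))
    against K cluster centroids (shape (K, D)); m > 1 is the fuzzifier."""
    d = np.linalg.norm(centroids - z, axis=1)   # distance to each centroid
    d = np.maximum(d, 1e-12)                    # guard against division by zero
    # u_k = 1 / sum_j (d_k / d_j)^(2/(m-1)); memberships sum to 1
    ratio = (d[:, None] / d[None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=1)

def soft_ensemble_estimate(z, centroids, heads, m=2.0):
    """Membership-weighted combination of per-cluster predictor heads.
    `heads` is a list of K callables mapping an embedding to an accuracy."""
    u = fuzzy_memberships(z, centroids, m)
    return float(sum(u_k * h(z) for u_k, h in zip(u, heads)))

# Toy usage: 3 clusters in an 8-dimensional embedding space, dummy linear heads.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(3, 8))
heads = [lambda z, w=w: 0.9 + z @ w for w in rng.normal(size=(3, 8)) * 0.01]
z = rng.normal(size=8)  # stand-in for a pretrained graph-encoder embedding
print(soft_ensemble_estimate(z, centroids, heads))
```

A soft combination of this kind lets each head specialize to one architecture family while the overall prediction varies smoothly across family boundaries, rather than switching heads with a hard cluster assignment.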

Authors (9)
  1. Keith G. Mills (14 papers)
  2. Fred X. Han (10 papers)
  3. Jialin Zhang (87 papers)
  4. Fabian Chudak (7 papers)
  5. Ali Safari Mamaghani (1 paper)
  6. Mohammad Salameh (20 papers)
  7. Wei Lu (325 papers)
  8. Shangling Jui (36 papers)
  9. Di Niu (67 papers)
Citations (4)
