DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning (2302.13020v2)

Published 25 Feb 2023 in cs.LG

Abstract: Neural predictors have shown great potential in the evaluation process of neural architecture search (NAS). However, current predictor-based approaches overlook the fact that training a predictor requires a considerable number of trained neural networks as the labeled training set, which is costly to obtain. The critical issue in utilizing predictors for NAS is therefore to train a high-performance predictor using as few trained neural networks as possible. Although some methods attempt to address this problem through unsupervised learning, they often result in inaccurate predictions. We argue that the unsupervised tasks intended for common graph data are too challenging for neural networks, making unsupervised training susceptible to performance crashes in NAS. To address this issue, we propose a Curriculum-guided Contrastive Learning framework for neural Predictor (DCLP). Our method simplifies the contrastive task by designing a novel curriculum that enhances the stability of the unlabeled training data distribution during contrastive training. Specifically, we propose a scheduler that ranks the training data by the contrastive difficulty of each sample and then feeds them to the contrastive learner in that order. This approach concentrates the training data distribution and makes contrastive training more efficient. With our method, the contrastive learner incrementally learns feature representations from unsupervised data along a smooth learning curve, avoiding the performance crashes that can occur with excessively variable training data distributions. We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors, and shows promising potential to discover superior architectures in various search spaces when combined with search strategies. Our code is available at: https://github.com/Zhengsh123/DCLP.
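The mechanism the abstract describes — contrastively pretraining a predictor's encoder on unlabeled architectures that a scheduler feeds in easy-to-hard order — can be sketched roughly as below. This is a minimal illustration under assumptions, not the authors' implementation (the linked repository contains that): `difficulty_score`, `encoder`, and `augment` are hypothetical stand-ins for the paper's components, the loss is a generic InfoNCE, and architectures are assumed to be pre-encoded as tensors rather than the graphs a GNN encoder would consume.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z_a, z_b, temperature=0.1):
    # Generic InfoNCE contrastive loss: row i of z_a pairs with row i
    # of z_b as the positive; all other rows in the batch act as negatives.
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

def curriculum_pretrain(encoder, augment, architectures, difficulty_score,
                        epochs=10, batch_size=64, lr=1e-3):
    # Curriculum scheduler (assumed form): rank unlabeled architectures
    # easy-to-hard by a per-sample contrastive-difficulty measure, then
    # train in that order so the data distribution shifts smoothly.
    ordered = sorted(architectures, key=difficulty_score)
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(epochs):
        for start in range(0, len(ordered), batch_size):
            batch = ordered[start:start + batch_size]
            # Two stochastic augmentations of each sample give the
            # positive pairs for contrastive learning.
            view_a = torch.stack([augment(x) for x in batch])
            view_b = torch.stack([augment(x) for x in batch])
            loss = info_nce_loss(encoder(view_a), encoder(view_b))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return encoder
```

In this reading, the pretrained encoder would then be paired with a small regression head and fine-tuned on the few labeled (i.e., actually trained) networks, which is what lets the predictor work with limited supervision.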
