Multi-conditioned Graph Diffusion for Neural Architecture Search
Abstract: Neural architecture search (NAS) automates the design of neural network architectures, usually by exploring a large and therefore complex search space. To advance the architecture search, we present a graph diffusion-based NAS approach that uses discrete conditional graph diffusion processes to generate high-performing neural network architectures. We further propose a multi-conditioned classifier-free guidance approach for graph diffusion networks that jointly imposes constraints such as high accuracy and low hardware latency. Unlike related work, our method is fully differentiable and requires only a single model training. In our evaluations, we show promising results on six standard benchmarks, generating novel and unique architectures at high speed, i.e. in less than 0.2 seconds per architecture. Furthermore, we demonstrate the generalisability and efficiency of our method through experiments on the ImageNet dataset.
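The multi-conditioned classifier-free guidance mentioned in the abstract can be illustrated with a minimal sketch. The idea is to combine the model's unconditional prediction with several condition-specific predictions (e.g. one for high accuracy, one for low latency), each scaled by its own guidance weight. The `toy_model` below and its condition labels are hypothetical stand-ins, not the paper's actual denoising network:

```python
import numpy as np

def multi_cond_cfg(model, x_t, t, conds, weights):
    """Multi-conditioned classifier-free guidance (sketch).

    Combines the unconditional prediction with several conditional
    predictions, each scaled by its own guidance weight w_i:

        pred = f(x_t, t, None) + sum_i w_i * (f(x_t, t, c_i) - f(x_t, t, None))
    """
    uncond = model(x_t, t, None)          # prediction without conditioning
    guided = uncond.copy()
    for c, w in zip(conds, weights):
        # Push the prediction toward each condition by its weight.
        guided += w * (model(x_t, t, c) - uncond)
    return guided

# Hypothetical stand-in model: returns a constant bias per condition.
def toy_model(x_t, t, cond):
    if cond is None:
        return np.zeros_like(x_t)
    return np.full_like(x_t, {"acc": 1.0, "lat": -0.5}[cond])

out = multi_cond_cfg(toy_model, np.zeros(3), 0, ["acc", "lat"], [2.0, 1.0])
```

With these toy values, the guided prediction is 0 + 2·(1 − 0) + 1·(−0.5 − 0) = 1.5 per element; in the discrete graph-diffusion setting the same combination would be applied to the predicted node/edge logits at each denoising step.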
- Diffusionnag: Task-guided neural architecture generation with diffusion models. arXiv preprint arXiv:2305.16943, 2023.
- Structured denoising diffusion models in discrete state-spaces. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (eds.), Advances in Neural Information Processing Systems, volume 34, pp. 17981–17993. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/paper_files/paper/2021/file/958c530554f78bcd8e97125b70e6973d-Paper.pdf.
- Designing neural network architectures using reinforcement learning. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=S1c2cvqee.
- Understanding and simplifying one-shot architecture search. In Jennifer Dy and Andreas Krause (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 550–559. PMLR, 10–15 Jul 2018. URL https://proceedings.mlr.press/v80/bender18a.html.
- SMASH: One-shot model architecture search through hypernetworks. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rydeCEhs-.
- XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794, 2016.
- Wuyang Chen. Darts evaluation: Train from scratch for architectures from darts space, 2022. URL https://github.com/chenwydj/DARTS_evaluation.
- Neural architecture search on imagenet in four gpu hours: A theoretically inspired perspective. In International Conference on Learning Representations, 2021a.
- Drnas: Dirichlet neural architecture search. In International Conference on Learning Representations, 2021b. URL https://openreview.net/forum?id=9FWas6YbmB3.
- Multi-objective reinforced evolution in mobile neural architecture search. In European Conference on Computer Vision, pp. 99–113. Springer, 2020.
- Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2009. doi: 10.1109/CVPR.2009.5206848.
- Diffusion models beat gans on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
- Xuanyi Dong and Yi Yang. Nas-bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations (ICLR), 2020.
- A generalization of transformer networks to graphs. AAAI Workshop on Deep Learning on Graphs: Methods and Applications, 2021.
- Neural architecture search: A survey. Journal of Machine Learning Research, 20(55):1–21, 2019. URL http://jmlr.org/papers/v20/18-598.html.
- Sample-efficient automated deep reinforcement learning. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=hSjxQ3B7GWq.
- Conditioning diffusion models via attributes and semantic masks for face generation. arXiv preprint arXiv:2306.00914, 2023.
- Classifier-free diffusion guidance. In NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications, 2021.
- Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
- Equivariant diffusion for molecule generation in 3d. In International conference on machine learning, pp. 8867–8887. PMLR, 2022.
- Searching for mobilenetv3. In Proceedings of the IEEE/CVF international conference on computer vision, pp. 1314–1324, 2019.
- Searching by generating: Flexible and efficient one-shot nas with architecture generator. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 983–992, 2021.
- Auto-keras: An efficient neural architecture search system. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 1946–1956, 2019.
- Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
- Nas-bench-nlp: Neural architecture search benchmark for natural language processing. IEEE Access, PP:1–1, 2020.
- Learning multiple layers of features from tiny images. 2009.
- Rapid neural architecture search by learning to generate graphs from datasets. In International Conference on Learning Representations, 2021.
- HW-NAS-Bench: Hardware-aware neural architecture search benchmark. In International Conference on Learning Representations, 2021.
- Random search and reproducibility for neural architecture search. In Uncertainty in artificial intelligence, pp. 367–377. PMLR, 2020.
- DARTS: Differentiable architecture search. In International Conference on Learning Representations, 2019.
- Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
- Smooth variational graph embeddings for efficient neural architecture search. In 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, 2021.
- Learning where to look–generative nas is surprisingly efficient. In European Conference on Computer Vision, pp. 257–273. Springer, 2022.
- Neural architecture optimization. Advances in neural information processing systems, 31, 2018.
- Surgenas: a comprehensive surgery on hardware-aware differentiable neural architecture search. IEEE Transactions on Computers, 72(4):1081–1094, 2022.
- Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330, 1993. URL https://aclanthology.org/J93-2004.
- Efficient neural architecture search via parameters sharing. In International conference on machine learning, pp. 4095–4104. PMLR, 2018.
- Large-scale evolution of image classifiers. In Doina Precup and Yee Whye Teh (eds.), Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pp. 2902–2911. PMLR, 06–11 Aug 2017.
- Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pp. 4780–4789, 2019.
- Generative adversarial neural architecture search. arXiv preprint arXiv:2105.09356, 2021.
- High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 10684–10695, 2022.
- Photorealistic text-to-image diffusion models with deep language understanding. Advances in Neural Information Processing Systems, 35:36479–36494, 2022.
- Transfer NAS with meta-learned bayesian surrogates. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=paGvsrl4Ntr.
- NAS-Bench-301 and the case for surrogate benchmarks for neural architecture search, 2021.
- Scalable bayesian optimization using deep neural networks. In International conference on machine learning, pp. 2171–2180. PMLR, 2015.
- Deep unsupervised learning using nonequilibrium thermodynamics. In International conference on machine learning, pp. 2256–2265. PMLR, 2015.
- Enhancing differentiable architecture search: A study on small number of cell blocks in the search stage, and important branches-based cells selection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, pp. 1253–1261, October 2023.
- Off-policy reinforcement learning for efficient and effective gan architecture search. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part VII 16, pp. 175–192. Springer, 2020.
- Attention is all you need. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (eds.), Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017a. URL https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
- Digress: Discrete denoising diffusion for graph generation. In The Eleventh International Conference on Learning Representations, 2023.
- An invertible graph diffusion neural network for source localization. In Proceedings of the ACM Web Conference 2022, WWW ’22, pp. 1058–1069, New York, NY, USA, 2022. Association for Computing Machinery. ISBN 9781450390965. doi: 10.1145/3485447.3512155. URL https://doi.org/10.1145/3485447.3512155.
- Bananas: Bayesian optimization with neural architectures for neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp. 10293–10301, 2021a.
- Exploring the loss landscape in neural architecture search. In Uncertainty in Artificial Intelligence, pp. 654–664. PMLR, 2021b.
- Fbnet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 10734–10742, 2019.
- Stronger nas with weaker predictors. Advances in Neural Information Processing Systems, 34:28904–28918, 2021.
- Does unsupervised architecture representation learning help neural architecture search? Advances in neural information processing systems, 33:12486–12498, 2020.
- Nas-bench-x11 and the power of learning curves. Advances in Neural Information Processing Systems, 34:22534–22549, 2021a.
- Sweet gradient matters: Designing consistent and efficient estimator for zero-shot architecture search. Neural Networks, 168:237–255, 2023. ISSN 0893-6080. doi: 10.1016/j.neunet.2023.09.012.
- Ista-nas: Efficient and consistent neural architecture search by sparse coding. Advances in Neural Information Processing Systems, 33:10503–10513, 2020.
- β-DARTS: Beta-decay regularization for differentiable architecture search. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10864–10873. IEEE, 2022.
- NAS-bench-101: Towards reproducible neural architecture search. In Kamalika Chaudhuri and Ruslan Salakhutdinov (eds.), Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pp. 7105–7114, Long Beach, California, USA, 09–15 Jun 2019. PMLR.
- D-vae: A variational autoencoder for directed acyclic graphs. Advances in Neural Information Processing Systems, 32, 2019.
- Neural architecture search with reinforcement learning. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=r1Ue8Hcxg.
- Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8697–8710, 2018.