Building Optimal Neural Architectures using Interpretable Knowledge (2403.13293v1)

Published 20 Mar 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Neural Architecture Search is a costly practice. Because a search space can span a vast number of design choices and each architecture evaluation incurs nontrivial overhead, it is hard for an algorithm to sufficiently explore candidate networks. In this paper, we propose AutoBuild, a scheme which learns to align the latent embeddings of operations and architecture modules with the ground-truth performance of the architectures they appear in. By doing so, AutoBuild can assign interpretable importance scores to architecture modules, ranging from individual operation features to larger macro operation sequences, so that high-performance neural networks can be constructed without any need for search. Through experiments on state-of-the-art image classification, segmentation, and Stable Diffusion models, we show that by mining a relatively small set of evaluated architectures, AutoBuild can learn to build high-quality architectures directly or help reduce the search space to focus on relevant areas, finding better architectures that outperform both the original labeled ones and those found by search baselines. Code available at https://github.com/Ascend-Research/AutoBuild
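To make the abstract's core mechanism concrete, the snippet below is a minimal, hypothetical sketch of the idea of aligning module embeddings with ground-truth architecture performance through a pairwise ranking objective, so that an embedding-derived "importance score" can rank modules for greedy construction. The encoder, feature dimensions, loss, and synthetic data are illustrative assumptions in plain PyTorch, not the authors' AutoBuild implementation, which operates on graph representations of architectures.

```python
# Hypothetical sketch: align module embeddings with architecture performance
# via a pairwise ranking loss; the embedding norm acts as an importance score.
import torch
import torch.nn as nn


class ModuleScorer(nn.Module):
    """Maps a fixed-size module feature vector to an embedding whose
    L2 norm is used as an interpretable importance score (assumption)."""

    def __init__(self, in_dim: int = 16, emb_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)                 # [num_modules, emb_dim]

    def score(self, x: torch.Tensor) -> torch.Tensor:
        return self.forward(x).norm(dim=-1)    # one score per module


def pairwise_ranking_loss(scores: torch.Tensor, perf: torch.Tensor) -> torch.Tensor:
    """Hinge-style pairwise loss: modules drawn from better-performing
    architectures should receive higher importance scores."""
    diff_s = scores.unsqueeze(0) - scores.unsqueeze(1)  # score[j] - score[i]
    diff_p = perf.unsqueeze(0) - perf.unsqueeze(1)      # perf[j]  - perf[i]
    sign = torch.sign(diff_p)
    return torch.clamp(1.0 - sign * diff_s, min=0.0).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    scorer = ModuleScorer()
    opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

    # Toy data: 128 synthetic module feature vectors and the accuracy of the
    # architecture each module appeared in (both random placeholders here).
    feats = torch.randn(128, 16)
    perf = torch.rand(128)

    for step in range(200):
        opt.zero_grad()
        loss = pairwise_ranking_loss(scorer.score(feats), perf)
        loss.backward()
        opt.step()

    # After training, scorer.score(feats) ranks modules by estimated quality;
    # greedily composing top-scoring modules mimics "construction without search".
```

In this toy setup the ranking loss only enforces relative ordering, which mirrors the abstract's emphasis on aligning embeddings with ground-truth performance rather than regressing exact accuracy values.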

Authors (8)
  1. Keith G. Mills (14 papers)
  2. Fred X. Han (10 papers)
  3. Mohammad Salameh (20 papers)
  4. Shengyao Lu (6 papers)
  5. Chunhua Zhou (4 papers)
  6. Jiao He (32 papers)
  7. Fengyu Sun (15 papers)
  8. Di Niu (67 papers)
