LeMo-NADe: Multi-Parameter Neural Architecture Discovery with LLMs (2402.18443v1)
Abstract: Building efficient neural network architectures can be time-consuming and requires extensive expert knowledge. The task is particularly challenging for edge devices, where one must also consider parameters such as power consumption during inference, model size, inference speed, and CO2 emissions. In this article, we introduce a novel framework that automatically discovers new neural network architectures based on user-defined parameters, an expert system, and an LLM trained on a large amount of open-domain knowledge. The framework (LeMo-NADe) is tailored to non-AI experts, does not require a predetermined neural architecture search space, and considers a large set of edge-device-specific parameters. We implement and validate the proposed neural architecture discovery framework on the CIFAR-10, CIFAR-100, and ImageNet16-120 datasets, using GPT-4 Turbo and Gemini as the LLM component. We observe that the framework can rapidly (within hours) discover intricate neural network models that perform extremely well across a diverse set of user-defined application settings.
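To make the interplay between the user-defined parameters, the expert system, and the LLM more concrete, the following is a minimal sketch of one way such a discovery loop could be organized. It is not the LeMo-NADe implementation: the names `Metrics`, `Targets`, `expert_rules`, and `discover`, the four chosen constraints, and the rule wording are all assumptions made for illustration, and the LLM and the training/measurement step are replaced by toy stand-ins so that the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass
class Metrics:
    """Measured properties of one candidate model (hypothetical fields)."""
    accuracy: float    # fraction in [0, 1]
    size_mb: float     # model size
    latency_ms: float  # inference time per sample
    energy_j: float    # energy per inference

@dataclass
class Targets:
    """User-defined limits (hypothetical; the paper's parameter set may differ)."""
    min_accuracy: float
    max_size_mb: float
    max_latency_ms: float
    max_energy_j: float

def expert_rules(m: Metrics, t: Targets) -> List[str]:
    """Rule-based stand-in for an expert system: turn violations into hints."""
    hints = []
    if m.accuracy < t.min_accuracy:
        hints.append("increase capacity: accuracy below target")
    if m.size_mb > t.max_size_mb:
        hints.append("reduce parameters: model too large")
    if m.latency_ms > t.max_latency_ms:
        hints.append("reduce depth or width: inference too slow")
    if m.energy_j > t.max_energy_j:
        hints.append("reduce cost: energy above budget")
    return hints

Arch = Dict[str, int]

def discover(propose: Callable[[List[str]], Arch],
             evaluate: Callable[[Arch], Metrics],
             targets: Targets,
             max_iters: int = 10) -> Tuple[Optional[Arch], List[Tuple[Arch, Metrics]]]:
    """Propose, measure, and feed hints back until all targets are met."""
    hints: List[str] = []
    history: List[Tuple[Arch, Metrics]] = []
    for _ in range(max_iters):
        arch = propose(hints)        # in a real system: an LLM call
        metrics = evaluate(arch)     # in a real system: train and profile
        history.append((arch, metrics))
        hints = expert_rules(metrics, targets)
        if not hints:                # every user-defined target satisfied
            return arch, history
    return None, history

def make_toy_proposer(start_width: int = 256) -> Callable[[List[str]], Arch]:
    """Toy stand-in for the LLM: halve or double a single width parameter."""
    state = {"width": start_width}
    def propose(hints: List[str]) -> Arch:
        if any(h.startswith("reduce") for h in hints):
            state["width"] //= 2
        elif any(h.startswith("increase") for h in hints):
            state["width"] *= 2
        return {"width": state["width"]}
    return propose

def toy_evaluate(arch: Arch) -> Metrics:
    """Toy stand-in for training and measurement; the formulas are made up."""
    w = arch["width"]
    return Metrics(accuracy=min(0.95, 0.60 + w / 640),
                   size_mb=w / 16, latency_ms=w / 8, energy_j=w / 64)

targets = Targets(min_accuracy=0.65, max_size_mb=6, max_latency_ms=12, max_energy_j=2)
best, history = discover(make_toy_proposer(), toy_evaluate, targets)
print(best, "found after", len(history), "iterations")
```

With these toy stand-ins the loop tries widths 256 and 128, rejects both for exceeding the size and latency limits, and accepts width 64 on the third iteration. In a realistic setting the proposer would prompt an LLM with the accumulated hints, and the evaluator would train the generated model and measure it on the target device.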