
EvoPrompting: Language Models for Code-Level Neural Architecture Search (2302.14838v3)

Published 28 Feb 2023 in cs.NE, cs.AI, cs.CL, and cs.LG

Abstract: Given the recent impressive accomplishments of language models (LMs) for code generation, we explore the use of LMs as adaptive mutation and crossover operators for an evolutionary neural architecture search (NAS) algorithm. While NAS still proves too difficult a task for LMs to succeed at solely through prompting, we find that the combination of evolutionary prompt engineering with soft prompt-tuning, a method we term EvoPrompting, consistently finds diverse and high performing models. We first demonstrate that EvoPrompting is effective on the computationally efficient MNIST-1D dataset, where EvoPrompting produces convolutional architecture variants that outperform both those designed by human experts and naive few-shot prompting in terms of accuracy and model size. We then apply our method to searching for graph neural networks on the CLRS Algorithmic Reasoning Benchmark, where EvoPrompting is able to design novel architectures that outperform current state-of-the-art models on 21 out of 30 algorithmic reasoning tasks while maintaining similar model size. EvoPrompting is successful at designing accurate and efficient neural network architectures across a variety of machine learning tasks, while also being general enough for easy adaptation to other tasks beyond neural network design.

Citations (62)

Summary

  • The paper introduces EvoPrompting, integrating evolutionary prompt engineering and soft prompt-tuning to optimize neural architecture search.
  • The method achieves superior results on MNIST-1D and CLRS benchmarks by producing architectures with lower test errors and reduced model size.
  • The approach minimizes reliance on predefined search spaces, showcasing language models as adaptive operators in autonomous architecture discovery.

Overview of EvoPrompting: Language Models for Code-Level Neural Architecture Search

The paper "EvoPrompting: LLMs for Code-Level Neural Architecture Search" explores the utilization of LLMs (LMs) in neural architecture search (NAS), focusing on an innovative method known as EvoPrompting. This method synergizes the adaptive capabilities of LMs with evolutionary algorithms to address the complexity of NAS more efficiently than traditional methods.

Methodological Insights

EvoPrompting integrates two key techniques: evolutionary prompt engineering and soft prompt-tuning. Together they treat the LM not as a passive code generator but as an active operator in the evolutionary process. The method proceeds in iterative rounds of crossover and mutation: well-performing architectures from earlier rounds are assembled into a few-shot prompt, the LM completes it with candidate child architectures, and the children are then trained, scored, and filtered, as sketched below.
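A minimal sketch of one such generation, assuming hypothetical helpers lm_generate (prompt-to-code sampling) and evaluate (train-and-score); the paper's actual prompt format, parent-selection scheme, and hyperparameters differ:

```python
import random

def evoprompt_generation(population, lm_generate, evaluate,
                         n_parents=2, n_prompts=10, n_children=4):
    """One round of LM-driven crossover/mutation followed by selection.

    population: list of (code, fitness) pairs from earlier rounds.
    lm_generate(prompt, n): returns n code completions for the prompt.
    evaluate(code): trains/tests the candidate, returns a scalar fitness.
    """
    children = []
    for _ in range(n_prompts):
        # Crossover: a few strong parents serve as in-context examples.
        parents = random.sample(population, n_parents)
        prompt = "\n\n".join(code for code, _ in parents)
        # Mutation: the LM completes the prompt with new architecture code.
        for child in lm_generate(prompt, n_children):
            children.append((child, evaluate(child)))
    # Selection: keep the fittest individuals for the next generation.
    ranked = sorted(population + children, key=lambda p: p[1], reverse=True)
    return ranked[:len(population)]
```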

The evolutionary component of EvoPrompting eschews the meticulously hand-designed search spaces common in other NAS techniques. Instead, the search space is anything the LM can express as code, which broadens the set of reachable architectures and reduces human bias. This added flexibility lets the LM act as an adaptive mutation and crossover operator that is progressively refined via soft prompt-tuning on its own best outputs, sketched below.
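A hedged sketch of that prompt-tuning step: only a small matrix of learned prompt embeddings is updated on the top-scoring children while the LM's weights stay frozen. It assumes a Hugging Face-style causal LM interface (get_input_embeddings, inputs_embeds, labels) and illustrative shapes; the paper's exact setup may differ.

```python
import random
import torch

def tune_soft_prompt(soft_prompt, lm, tokenizer, top_children,
                     steps=100, lr=1e-3):
    """Update only soft_prompt (a trainable n_virtual_tokens x d_model
    tensor with requires_grad=True) so the frozen LM assigns higher
    likelihood to the best-scoring child programs. Assumes the LM's own
    parameters already have requires_grad=False."""
    optimizer = torch.optim.Adam([soft_prompt], lr=lr)
    for _ in range(steps):
        code = random.choice(top_children)
        ids = tokenizer(code, return_tensors="pt").input_ids
        token_emb = lm.get_input_embeddings()(ids)           # (1, T, d)
        inputs = torch.cat([soft_prompt.unsqueeze(0), token_emb], dim=1)
        # Soft-prompt positions carry no labels (-100 is ignored by the loss).
        pad = torch.full((1, soft_prompt.shape[0]), -100, dtype=torch.long)
        labels = torch.cat([pad, ids], dim=1)
        loss = lm(inputs_embeds=inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```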

Experimental Results

The efficacy of EvoPrompting was validated through extensive testing on both MNIST-1D and the CLRS Algorithmic Reasoning Benchmark. On MNIST-1D, EvoPrompting designed convolutional architecture variants that outperform both human-expert designs and naive few-shot prompting, yielding models with lower test error and smaller parameter counts than the comparative baselines.
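Because selection must balance accuracy against parameter count, the fitness needs to fold both objectives into one scalar. One plausible form, with illustrative size_budget and size_weight values (the paper's exact fitness definition may differ):

```python
def fitness(test_error, n_params, size_budget=25_000, size_weight=0.5):
    """Reward low test error; penalize models only above the size budget."""
    size_penalty = max(0.0, n_params / size_budget - 1.0)
    return -(test_error + size_weight * size_penalty)  # higher is better
```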

Furthermore, when applied to graph neural networks (GNNs) on the CLRS benchmark, EvoPrompting generated novel architectures that surpassed existing state-of-the-art models on 21 of 30 tasks. These improvements came without an increase in model size, highlighting EvoPrompting's ability to discover architectures that excel in both accuracy and computational efficiency.

Implications and Speculations for AI Development

This paper illustrates significant advancements in leveraging LMs beyond conventional language tasks, extending their applicability to sophisticated model design and optimization domains like NAS. The demonstrated ability of LMs to efficiently search and optimize over large and less manually constrained architecture spaces suggests potential applications in broader AI development and deployment scenarios.

From a practical standpoint, EvoPrompting could reduce the computational and cognitive burdens typically associated with NAS, fostering quicker deployment of optimized AI solutions across various industries. Theoretically, the findings may encourage further exploration of LMs as proactive agents in complex procedural tasks, potentially evolving into more autonomous entities in research and development processes.

Future work might scale EvoPrompting to larger datasets and more intricate tasks, further assessing the scalability and robustness of the discovered models. Additionally, integrating more sophisticated feedback loops for adaptive learning and fine-tuning could strengthen the framework, capitalizing on LMs' evolving capabilities and improving their utility in AI advancement.
