
Can GPT-4 Perform Neural Architecture Search? (2304.10970v4)

Published 21 Apr 2023 in cs.LG

Abstract: We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS) -- the task of designing effective neural architectures. Our proposed approach, GPT-4 Enhanced Neural archItectUre Search (GENIUS), leverages the generative capabilities of GPT-4 as a black-box optimiser to quickly navigate the architecture search space, pinpoint promising candidates, and iteratively refine these candidates to improve performance. We assess GENIUS across several benchmarks, comparing it with existing state-of-the-art NAS techniques to illustrate its effectiveness. Rather than targeting state-of-the-art performance, our objective is to highlight GPT-4's potential to assist research on a challenging technical problem through a simple prompting scheme that requires relatively limited domain expertise (code available at https://github.com/mingkai-zheng/GENIUS). More broadly, we believe our preliminary results point to future research that harnesses general purpose LLMs for diverse optimisation tasks. We also highlight important limitations to our study, and note implications for AI safety.

Citations (48)

Summary

  • The paper introduces the GENIUS framework, leveraging GPT-4 as a black-box optimizer to iteratively propose and evaluate neural network architectures.
  • It reports strong results, with discovered architectures ranking as high as the top 0.12% on NAS benchmarks and reaching 77.8%–78.2% Top-1 accuracy on ImageNet under FLOP constraints.
  • The study highlights the potential to democratize complex NAS tasks with minimal domain expertise while raising important questions about AI transparency and safety.

The paper "Can GPT-4 Perform Neural Architecture Search?" explores the feasibility of using the GPT-4 LLM as a tool to assist in Neural Architecture Search (NAS), a task traditionally reliant on high levels of domain-specific expertise. Moreover, this research proposes the GPT-4 Enhanced Neural archItectUre Search (GENIUS) framework as a method leveraging GPT-4's generative capabilities to navigate architectural search spaces, identify promising designs, and iteratively refine them.

Summary of Approach

GENIUS uses GPT-4 as a black-box optimizer: the architecture search problem is encoded into a textual format that GPT-4 can interpret, the model proposes a network architecture, and the proposal is evaluated against a performance metric such as accuracy on a benchmark dataset. The measured performance is then fed back to GPT-4, prompting it to generate improved configurations in subsequent iterations.
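This propose-evaluate-refine loop can be sketched in a few lines. Below, `propose` is a stub standing in for the GPT-4 call and `evaluate` is a toy scoring function replacing the benchmark accuracy lookup; both are hypothetical placeholders, not the paper's implementation:

```python
import random

def evaluate(arch):
    # Toy stand-in for a benchmark accuracy lookup: rewards choosing
    # operator index 2 in each slot (purely illustrative scoring).
    return sum(1 for op in arch if op == 2) / len(arch)

def propose(history):
    # Placeholder for the GPT-4 call. In GENIUS the prompt encodes the
    # search space plus the (architecture, accuracy) history and asks
    # for a refined candidate; here we simply mutate the best seen.
    if not history:
        return [random.choice([0, 1, 2]) for _ in range(8)]
    best, _ = max(history, key=lambda h: h[1])
    child = best[:]
    child[random.randrange(len(child))] = random.choice([0, 1, 2])
    return child

def genius_search(iterations=50, seed=0):
    random.seed(seed)
    history = []
    for _ in range(iterations):
        arch = propose(history)
        history.append((arch, evaluate(arch)))
    return max(history, key=lambda h: h[1])

best_arch, best_acc = genius_search()
print(best_arch, best_acc)
```

In the real system, `propose` would send the serialized history to GPT-4 and parse the returned architecture, and `evaluate` would train the candidate or query a tabular NAS benchmark.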

The authors tested this method across several well-established NAS benchmarks: NAS-Bench-Macro, Channel-Bench-Macro, and NAS-Bench-201. In these experiments, GENIUS identified high-performing architectures, with several candidates ranked within the top 1% of their search spaces. For larger-scale experiments, GENIUS was tested on the ImageNet dataset within the MobileNetV2 search space, achieving competitive results in both accuracy and overall search efficiency.
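These tabular benchmarks are compact enough for the search space and feedback history to be serialized directly into a prompt. A sketch of what such an encoding might look like for an eight-layer, three-operator space in the spirit of NAS-Bench-Macro; the operator names and exact prompt wording here are assumptions for illustration, not the paper's actual template:

```python
def build_prompt(num_layers, ops, history):
    # Serialize the search space and (architecture, accuracy) history
    # into text a language model can act on. Hypothetical wording.
    lines = [
        f"Design a CNN with {num_layers} searchable layers.",
        "For each layer choose one operator index: "
        + ", ".join(f"{i}={op}" for i, op in enumerate(ops)),
        "Previously evaluated architectures (operator list -> top-1 accuracy):",
    ]
    for arch, acc in history:
        lines.append(f"  {arch} -> {acc:.2%}")
    lines.append("Propose a new architecture as a list of operator indices.")
    return "\n".join(lines)

prompt = build_prompt(
    num_layers=8,
    ops=["identity", "mbconv_e3_k3", "mbconv_e6_k5"],
    history=[([1, 2, 0, 1, 2, 2, 1, 0], 0.912)],
)
print(prompt)
```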

Numerical Results and Findings

In experiments with both fixed and randomized temperature parameters, GENIUS consistently discovered architectures near the top of the search space. For instance, on NAS-Bench-Macro with the temperature set to zero, the framework found architectures ranked as high as the top 0.12%.
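Temperature controls how deterministic the model's sampling is: at temperature zero, decoding is effectively greedy, so repeated queries with the same prompt yield the same proposal. The effect can be illustrated with a softmax over hypothetical logits:

```python
import math

def sample_weights(logits, temperature):
    # Softmax with temperature; as T -> 0 the distribution collapses
    # onto the argmax, which is why temperature 0 gives deterministic
    # proposals from the model.
    if temperature == 0:
        top = max(range(len(logits)), key=lambda j: logits[j])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    exps = [math.exp(l / temperature) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

print(sample_weights([2.0, 1.0, 0.5], 1.0))  # soft distribution
print(sample_weights([2.0, 1.0, 0.5], 0))    # one-hot on the argmax
```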

More ambitiously, GENIUS achieved 77.8% Top-1 accuracy on ImageNet under a resource constraint of 329M FLOPs, an improvement over the best previously reported architecture under similar constraints. In settings allowing a slightly higher computational budget, GENIUS produced architectures reaching 78.2% Top-1 accuracy.

Implications and Future Directions

The paper offers insight into how general-purpose LLMs like GPT-4 can serve as research tools that require less domain-specific intervention, potentially democratizing access to complex tasks like NAS. In particular, this reduces reliance on hand-crafted heuristics that traditionally demanded deep domain knowledge. The results point toward automated scientific-discovery tools that may require minimal human expertise to produce valuable outcomes.

However, the work also raises questions about AI safety and ethics. If AI systems autonomously propose novel, high-performance architectures, researchers may come to rely less on understanding and more on trusting a black-box system, creating risks of dependency and reduced transparency. The authors urge caution in integrating such advanced AI systems into design and engineering workflows.

Limitations and Considerations

Despite the promising results, limitations stem from the opacity of GPT-4's response generation and from uncertainty over whether the model's training data included the benchmark results (i.e., data contamination). Moreover, while the iterative feedback loop is effective, the absence of explicit control over GPT-4's decision-making may yield unpredictable results in some circumstances. Finally, whether the proposed architectures are genuinely novel, rather than recombinations of designs seen during training, remains an open question worthy of further exploration.

In conclusion, while demonstrating that general-purpose models like GPT-4 can contribute to NAS tasks, the paper also calls for further research into training-data transparency, result reproducibility, and broader AI safety implications. The work sits at the intersection of frontier AI capabilities and safe, responsible deployment, encouraging discussion of how automated systems should be developed and integrated into scientific research workflows.
