AutoML-GPT: Innovations in Automated Machine Learning Utilizing GPT
AutoML-GPT presents an approach to automating the training of diverse AI models using large language models (LLMs) such as GPT. The paper addresses the significant human effort typically spent on manually tuning hyperparameters and selecting model architectures and optimization algorithms across AI tasks. Recent advances in LLMs, exemplified by ChatGPT, have demonstrated strong capabilities in language reasoning, comprehension, and interaction, and these capabilities motivate systems like AutoML-GPT that exploit them.
The proposed AutoML-GPT system uses an LLM as a bridge between diverse AI models, integrating task-oriented prompts to drive automated model-training pipelines. Given a user request conveyed as a prompt paragraph, the system automatically processes the input data, designs a model architecture, tunes hyperparameters, and generates a predicted training log. The prompt paragraphs are composed from model cards and data cards, forming a language-centric interface that connects diverse models and supports intricate AI tasks across domains such as computer vision and natural language processing.
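To make the card-to-prompt step concrete, here is a minimal sketch of how a prompt paragraph might be composed from a model card and a data card. The card fields and template wording are illustrative assumptions, not the exact schema used in the paper.

```python
# Sketch: rendering model and data cards into a prompt paragraph.
# Field names and the template text are hypothetical.

from dataclasses import dataclass

@dataclass
class DataCard:
    name: str
    input_type: str      # e.g. "image", "text"
    label_space: str     # e.g. "10 object classes"
    metric: str          # e.g. "top-1 accuracy"

@dataclass
class ModelCard:
    name: str
    architecture: str    # e.g. "ResNet-18"
    description: str

def compose_prompt(data: DataCard, model: ModelCard) -> str:
    """Render the two cards into a single prompt paragraph for the LLM."""
    return (
        f"Task: train {model.name} ({model.architecture}) on {data.name}. "
        f"Inputs are {data.input_type}; labels cover {data.label_space}. "
        f"Model notes: {model.description}. "
        f"Report the expected {data.metric} and the chosen hyperparameters."
    )

prompt = compose_prompt(
    DataCard("CIFAR-10", "image", "10 object classes", "top-1 accuracy"),
    ModelCard("resnet18", "ResNet-18", "18-layer residual network"),
)
print(prompt)
```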
In the proposed framework, AutoML-GPT operates through four primary stages (a sketch of the full loop follows this list):
1. Data processing: preparing raw data according to the nature of the problem, for example normalization and augmentation in computer vision tasks.
2. Model architecture design: dynamically assigning a suitable model to each task and adapting its settings based on the user-provided descriptions in the model cards, which keeps the system adaptable and open to new models.
3. Hyperparameter tuning: generating predicted training logs for candidate configurations, so that promising hyperparameters can be selected without consuming physical machine resources on real training runs.
4. Human feedback: iteratively refining hyperparameters based on the predicted training logs, optimizing model performance against user-specified constraints and metrics.
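A minimal sketch of this four-stage loop, assuming a generic chat-completion client; the function and parameter names (query_llm, automl_gpt, max_rounds) are hypothetical stand-ins, not the paper's implementation.

```python
# Sketch of the four-stage AutoML-GPT loop. `query_llm` is a placeholder
# for any chat-completion API and returns the LLM's text response.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def automl_gpt(prompt: str, max_rounds: int = 3) -> str:
    # Stage 1: data-processing plan (e.g. normalization, augmentation).
    plan = query_llm(f"{prompt}\nPropose data-processing steps.")
    # Stage 2: model architecture choice based on the cards in the prompt.
    arch = query_llm(f"{prompt}\nSelect a model architecture. Plan: {plan}")
    # Stages 3-4: tune hyperparameters against *predicted* training logs,
    # refining each round with feedback instead of running real training.
    config = query_llm(f"{prompt}\nPropose initial hyperparameters for {arch}.")
    for _ in range(max_rounds):
        log = query_llm(f"Predict a training log for {arch} with {config}.")
        config = query_llm(
            f"Given this predicted log:\n{log}\n"
            f"Suggest improved hyperparameters, or repeat {config} if optimal."
        )
    return config
```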
The framework also details a method for optimizing models on unseen datasets: hyperparameters are transferred across datasets using text-encoded similarity scores derived from dataset metadata and prior training logs. Configurations tuned on the most similar known benchmarks serve as starting points for the new dataset, and the authors report high prediction accuracy when tuning hyperparameters this way, illustrating the value of leveraging existing benchmark results for unseen data.
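A minimal sketch of similarity-based hyperparameter transfer, assuming a toy bag-of-words embedding in place of a real text encoder; the registry entries and hyperparameter values are invented for illustration and do not come from the paper.

```python
# Sketch: pick hyperparameters for a new dataset by finding the most
# textually similar known dataset and copying its tuned configuration.

import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real system would use a text encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Known datasets: metadata text -> previously tuned hyperparameters.
REGISTRY = {
    "32x32 color images, 10 object classes": {"lr": 0.1, "batch_size": 128},
    "news articles, 4 topic classes": {"lr": 3e-5, "batch_size": 32},
}

def transfer_hyperparameters(new_metadata: str) -> dict:
    """Copy hyperparameters from the most similar known dataset."""
    query = embed(new_metadata)
    best = max(REGISTRY, key=lambda meta: cosine(query, embed(meta)))
    return REGISTRY[best]

print(transfer_hyperparameters("64x64 color images, 200 object classes"))
# -> {'lr': 0.1, 'batch_size': 128}  (closest to the image dataset)
```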
Experimental evaluations show that AutoML-GPT flexibly manages tasks across AI domains, demonstrating efficacy on computer vision, natural language processing, and classification tasks. Its ability to conduct experiments automatically via generated training logs further evidences its potential to reduce the human workload associated with manual hyperparameter tuning and model selection.
This research advances the use of LLMs in AI system design, suggesting broader implications such as improved efficiency in hyperparameter tuning and increased model accuracy across multiple benchmarks. Future developments may involve automating model and data card generation for established datasets and exploring the systematic extraction of task-relevant sub-networks from larger models, further enhancing AutoML-GPT's scalability and application breadth.