AutoML-GPT: Automatic Machine Learning with GPT (2305.02499v1)

Published 4 May 2023 in cs.CL, cs.AI, cs.CV, cs.LG, and stat.ML

Abstract: AI tasks encompass a wide range of domains and fields. While numerous AI models have been designed for specific tasks and applications, they often require considerable human efforts in finding the right model architecture, optimization algorithm, and hyperparameters. Recent advances in LLMs like ChatGPT show remarkable capabilities in various aspects of reasoning, comprehension, and interaction. Consequently, we propose developing task-oriented prompts and automatically utilizing LLMs to automate the training pipeline. To implement this concept, we present the AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters. AutoML-GPT dynamically takes user requests from the model and data cards and composes the corresponding prompt paragraph. Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments from data processing to model architecture, hyperparameter tuning, and predicted training log. By leveraging AutoML-GPT's robust language capabilities and the available AI models, AutoML-GPT can tackle numerous intricate AI tasks across various tasks and datasets. This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas. Extensive experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many AI tasks.

AutoML-GPT: Innovations in Automated Machine Learning Utilizing GPT

AutoML-GPT presents an approach to automating the training of diverse AI models using LLMs such as GPT. The paper addresses the significant human effort typically required to find the right model architecture, optimization algorithm, and hyperparameters for a given AI task. Recent advances in LLMs, exemplified by ChatGPT, have demonstrated strong capabilities in language reasoning, comprehension, and interaction, and AutoML-GPT is designed to exploit exactly these properties.

The proposed AutoML-GPT system leverages LLMs as a dynamic conduit, integrating task-oriented prompts for automated model training pipelines. This integration allows the system to automatically process data inputs, design model architectures, tune hyperparameters, and generate predicted training logs based on user requests conveyed through prompt paragraphs. These prompt paragraphs are composed using model and data cards, effectively forming a language-centric interface that connects diverse models and facilitates handling intricate AI tasks across different domains such as computer vision and natural language processing.
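As a concrete illustration, the sketch below shows how a prompt paragraph might be assembled from a model card, a data card, and a user request. The card fields used here (name, architecture, task, num_samples, input_type, metric) are assumptions for illustration, not the paper's exact card schema.

```python
# Minimal sketch of prompt-paragraph composition from model and data cards.
# All field names are illustrative assumptions, not the paper's schema.

def compose_prompt(model_card: dict, data_card: dict, user_request: str) -> str:
    """Merge card metadata and the user request into one prompt paragraph."""
    model_desc = (
        f"Model: {model_card['name']}. "
        f"Architecture: {model_card['architecture']}. "
        f"Intended task: {model_card['task']}."
    )
    data_desc = (
        f"Dataset: {data_card['name']} with {data_card['num_samples']} samples. "
        f"Input type: {data_card['input_type']}. "
        f"Evaluation metric: {data_card['metric']}."
    )
    return (
        f"{user_request}\n{model_desc}\n{data_desc}\n"
        "Plan data processing, select hyperparameters, and report a "
        "predicted training log."
    )

prompt = compose_prompt(
    {"name": "ResNet-50", "architecture": "CNN", "task": "image classification"},
    {"name": "CIFAR-10", "num_samples": 60_000, "input_type": "32x32 RGB images",
     "metric": "top-1 accuracy"},
    "Train an image classifier on CIFAR-10.",
)
print(prompt)
```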

In the proposed framework, AutoML-GPT operates through four primary stages (see the sketch after this list):

  1. Data processing: preparing raw data for the problem at hand, using techniques such as normalization and augmentation for computer vision tasks.
  2. Model architecture design: dynamically assigning suitable models to specific tasks and adapting settings based on user-provided descriptions from model cards, keeping the system adaptable and open to new models.
  3. Hyperparameter tuning: generating predicted training logs to identify strong configurations without consuming physical compute resources.
  4. Human feedback: iteratively refining hyperparameters based on the predicted training logs, optimizing model performance against user-specified constraints and metrics.
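The loop below is a minimal sketch of this four-stage flow under stated assumptions: `query_llm` and `collect_human_feedback` are hypothetical stand-ins for the paper's GPT queries and user interaction, not its actual interface.

```python
# Minimal sketch of the four-stage AutoML-GPT loop. query_llm and
# collect_human_feedback are hypothetical stand-ins, not the paper's API.

def query_llm(prompt: str) -> str:
    """Stand-in for a GPT call; a real system would query an LLM API here."""
    return f"[LLM response to: {prompt[:50]}...]"

def collect_human_feedback(training_log: str):
    """Stand-in for interactive feedback; None means the user accepts."""
    return None

def automl_gpt(prompt_paragraph: str, max_rounds: int = 3) -> dict:
    plan = query_llm(f"Propose data processing steps for: {prompt_paragraph}")  # stage 1
    model = query_llm(f"Select a model architecture given: {plan}")             # stage 2
    hparams = query_llm(f"Suggest hyperparameters for: {model}")                # stage 3
    for _ in range(max_rounds):                                                 # stage 4
        log = query_llm(f"Predict a training log for {model} with {hparams}")
        feedback = collect_human_feedback(log)
        if feedback is None:  # user accepts the current configuration
            break
        hparams = query_llm(f"Revise {hparams} given feedback: {feedback}")
    return {"plan": plan, "model": model, "hyperparameters": hparams}

print(automl_gpt("Train an image classifier on CIFAR-10."))
```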

The framework also details an innovative method for optimizing models on unseen datasets, employing cross-dataset hyperparameter transfer facilitated by text-encoded similarity scores derived from metadata and prior training logs. This method demonstrates high prediction accuracy when tuning hyperparameters for new datasets, showing the value of leveraging prior benchmark results when optimizing on unseen data.
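A hedged sketch of this transfer idea follows: the new dataset's metadata text is scored against known datasets, and the closest match's hyperparameters are reused. The bag-of-words cosine similarity here is a simple stand-in for the paper's text-encoded similarity scores, and all card values are invented for illustration.

```python
# Sketch of cross-dataset hyperparameter transfer: score a new dataset's
# metadata against known datasets and reuse the best match's hyperparameters.
# Bag-of-words cosine similarity stands in for the paper's text-encoded
# similarity; all metadata and hyperparameter values are invented.
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

known = {
    "CIFAR-10": {"metadata": "32x32 RGB natural images 10 classes",
                 "hparams": {"lr": 0.1, "batch_size": 128}},
    "IMDB":     {"metadata": "movie review text binary sentiment",
                 "hparams": {"lr": 2e-5, "batch_size": 32}},
}

def transfer_hparams(new_metadata: str) -> dict:
    """Pick hyperparameters from the most similar known dataset."""
    best = max(known.values(),
               key=lambda d: cosine_similarity(new_metadata, d["metadata"]))
    return best["hparams"]

print(transfer_hparams("64x64 RGB animal images 100 classes"))  # -> CIFAR-10's hparams
```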

Experimental evaluations show that AutoML-GPT flexibly manages tasks across AI domains, handling computer vision, natural language processing, and classification benchmarks. Its ability to conduct experiments automatically via predicted training logs further demonstrates its potential to reduce the human workload of manual hyperparameter tuning and model selection.

This research advances the use of LLMs in AI system design, suggesting broader implications such as improved efficiency in hyperparameter tuning and increased model accuracy across multiple benchmarks. Future developments may involve automating model and data card generation for established datasets and exploring the systematic extraction of task-relevant sub-networks from larger models, further enhancing AutoML-GPT's scalability and application breadth.

Authors (5)
  1. Shujian Zhang (28 papers)
  2. Chengyue Gong (30 papers)
  3. Lemeng Wu (29 papers)
  4. Xingchao Liu (28 papers)
  5. Mingyuan Zhou (161 papers)
Citations (50)