ModelGPT: Unleashing LLM's Capabilities for Tailored Model Generation (2402.12408v1)
Abstract: The rapid advancement of LLMs has revolutionized various sectors by automating routine tasks, marking a step toward the realization of AGI. However, LLMs still struggle to accommodate the diverse, specific needs of individual users and to make AI models easier for the average user to apply. In response, we propose ModelGPT, a novel framework that leverages the capabilities of LLMs to determine and generate AI models tailored to the data or task descriptions provided by the user. Given user requirements, ModelGPT can deliver tailored models up to 270x faster than previous paradigms (e.g., all-parameter or LoRA finetuning). Comprehensive experiments on NLP, CV, and tabular datasets attest to the effectiveness of our framework in making AI models more accessible and user-friendly. Our code is available at https://github.com/IshiKura-a/ModelGPT.
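The core idea — producing a tailored model's parameters directly from a requirement description, rather than finetuning — can be sketched with a hypernetwork-style generator. The sketch below is a minimal illustration under assumed names (`hypernetwork`, `build_target`), not the authors' actual implementation: a fixed random projection stands in for a trained generator, and a random vector stands in for the LLM's encoding of the user's task description.

```python
import numpy as np

rng = np.random.default_rng(0)

def hypernetwork(embedding, in_dim, out_dim, hidden=64):
    """Map a requirement embedding to a flat parameter vector for a target model.
    The random projections here stand in for a trained generator network."""
    W1 = rng.standard_normal((hidden, embedding.size)) / np.sqrt(embedding.size)
    W2 = rng.standard_normal((in_dim * out_dim + out_dim, hidden)) / np.sqrt(hidden)
    h = np.maximum(W1 @ embedding, 0.0)  # ReLU hidden layer
    return W2 @ h                        # flat (weights + bias) vector

def build_target(flat, in_dim, out_dim):
    """Unflatten generated parameters into a linear classifier (no gradient steps)."""
    W = flat[: in_dim * out_dim].reshape(out_dim, in_dim)
    b = flat[in_dim * out_dim:]
    return lambda x: x @ W.T + b

# A 768-d embedding (random here, standing in for an LLM's encoding of the
# user's task description) yields a 4-class classifier over 16 features
# in a single forward pass -- no finetuning loop.
embedding = rng.standard_normal(768)
target = build_target(hypernetwork(embedding, 16, 4), 16, 4)
logits = target(rng.standard_normal((2, 16)))
print(logits.shape)  # (2, 4)
```

Because the target model's parameters come from one forward pass of the generator rather than an optimization loop, this style of generation is what makes the reported speedups over all-parameter or LoRA finetuning plausible.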
Authors: Zihao Tang, Zheqi Lv, Shengyu Zhang, Fei Wu, Kun Kuang