Black-Box Tuning for Language-Model-as-a-Service (2201.03514v4)

Published 10 Jan 2022 in cs.CL and cs.AI

Abstract: Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service. It allows users to design task-specific prompts to query the PTMs through some black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable. Can we optimize the task prompts by only accessing the model inference APIs? This paper proposes the black-box tuning framework to optimize the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace due to the low intrinsic dimensionality of large PTMs. The experimental results show that the black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompt and GPT-3's in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.

Authors (5)
  1. Tianxiang Sun (35 papers)
  2. Yunfan Shao (19 papers)
  3. Hong Qian (90 papers)
  4. Xuanjing Huang (287 papers)
  5. Xipeng Qiu (257 papers)
Citations (230)

Summary

Black-Box Tuning for Language-Model-as-a-Service

In the paper "Black-Box Tuning for Language-Model-as-a-Service," the authors propose an approach to optimizing continuous prompts for large pre-trained language models (PTMs) offered as a service, without requiring access to model gradients. This paradigm, termed Language-Model-as-a-Service (LMaaS), calls for efficient and effective methods to tailor PTMs such as GPT-3, ERNIE, and Yuan 1.0 to diverse downstream tasks via black-box APIs.

The primary innovation of the work is the Black-Box Tuning (BBT) framework, which optimizes continuous prompts with derivative-free optimization (DFO) rather than with gradients, which are unavailable in the black-box setting typical of LMaaS. Exploiting the low intrinsic dimensionality of PTMs, the authors search a randomly generated low-dimensional subspace and map each candidate into the high-dimensional prompt space through a fixed random linear projection. This reparameterization is what makes the tuning tractable, since standard DFO methods scale poorly to high-dimensional search spaces.
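The reparameterization can be sketched in a few lines. The snippet below is a minimal illustration rather than the authors' code: the prompt length (50 tokens), embedding width (1024, as in RoBERTa-large), and subspace size d = 500 loosely follow the paper's setup, while the distribution and scale used to sample the projection matrix here are assumptions chosen for concreteness.

```python
import numpy as np

# Dimensions roughly follow the paper's RoBERTa-large setup; treat them as placeholders.
PROMPT_TOKENS, EMB_DIM = 50, 1024
D = PROMPT_TOKENS * EMB_DIM   # original (intractably large) prompt space
d = 500                       # low-dimensional subspace searched by DFO

# Fixed random projection A: R^d -> R^D, sampled once and never updated.
# The sampling distribution and scale here are illustrative assumptions.
rng = np.random.default_rng(0)
A = rng.normal(scale=1e-2, size=(D, d))

def to_prompt(z, p0=None):
    """Map a low-dimensional candidate z (shape (d,)) to a continuous prompt
    of shape (PROMPT_TOKENS, EMB_DIM).

    p0 is an optional initial prompt embedding (e.g., drawn from the PTM's
    word embeddings); the projected offset A @ z is added to it.
    """
    p = A @ z if p0 is None else A @ z + p0.ravel()
    return p.reshape(PROMPT_TOKENS, EMB_DIM)
```

Only the d-dimensional vector z is ever optimized; A and the optional initial prompt stay fixed, which is what keeps the derivative-free search feasible.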

The proposed methodology is evaluated across multiple NLP datasets. Deployed with RoBERTa-large in few-shot settings, BBT consistently outperforms manual prompting and GPT-3-style in-context learning, and also surpasses gradient-based counterparts such as prompt tuning and full model tuning. The margins over manual prompts and in-context learning are substantial on tasks including sentiment analysis, topic classification, and natural language inference, and BBT matches or exceeds conventional full model tuning in many instances.

The implications of this research are both practical and theoretical. Practically, BBT lets users who lack the computational resources for full model training still adapt existing large-scale PTMs to their tasks: the service provider performs the forward passes, while the user runs the gradient-free search locally, so task-specific prompts can be optimized on resource-limited devices without a GPU. Theoretically, the work supports the notion of low intrinsic dimensionality in PTMs and provides a foundation for further exploration of efficient, low-dimensional adaptation techniques.
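To make that division of labor concrete, here is a hedged sketch of the client-side loop, building on the to_prompt helper above. It uses CMA-ES via the pycma package, the DFO algorithm the paper adopts; query_lmaas is a hypothetical placeholder for the provider's inference API (it would send the candidate prompt plus a small labeled batch and return a scalar loss), and the population size, step size, and call budget are illustrative values rather than the paper's exact settings.

```python
import numpy as np
import cma  # pip install cma

def query_lmaas(prompt_embedding, batch):
    """Hypothetical stand-in for the provider's black-box inference API.

    In an LMaaS deployment this call would upload the continuous prompt and a
    small batch of labeled examples, and the server would return a scalar
    loss (e.g., cross-entropy over the verbalizer tokens).
    """
    raise NotImplementedError

def black_box_tune(train_batch, d=500, sigma0=1.0, budget=8000):
    # CMA-ES searches the d-dimensional subspace; no gradients are needed,
    # so this loop runs comfortably on a CPU-only client.
    es = cma.CMAEvolutionStrategy(d * [0.0], sigma0, {"popsize": 20, "seed": 42})
    api_calls = 0
    while not es.stop() and api_calls < budget:
        candidates = es.ask()                      # sample a population of z vectors
        losses = [query_lmaas(to_prompt(np.asarray(z)), train_batch)
                  for z in candidates]             # one API call per candidate
        api_calls += len(candidates)
        es.tell(candidates, losses)                # update the search distribution
    return to_prompt(np.asarray(es.result.xbest))  # best prompt found
```

The only state kept on the client is the CMA-ES search distribution over z; everything model-related stays behind the API.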

Additionally, this research opens avenues for future work, such as combining BBT with prompt engineering strategies and developing better ways to construct the random projection matrix. Applying BBT to generative large-scale PTMs such as GPT or T5 is another promising direction that could broaden its applicability to more complex NLP tasks.

In conclusion, "Black-Box Tuning for Language-Model-as-a-Service" presents a significant advancement in optimizing PTMs in a derivative-free manner within a black-box setting. The detailed experimental analysis and superior performance across diverse benchmarks highlight the potential of BBT to facilitate the widespread application of powerful PTMs in a cost-effective and accessible manner.