Soft Prompt Transfer: Advancements in Frozen Model Adaptation
The paper "SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer" introduces a sophisticated method for enhancing the efficacy of prompt-based learning in LLMs without full parameter tuning, titled Soft Prompt Transfer (SPoT). By leveraging a novel transfer learning strategy within the context of prompt tuning, this work achieves significant strides in closed-vocabulary NLP tasks.
Key Contributions
The authors present SPoT as an advancement over standard prompt tuning: a soft prompt is first trained on one or more source tasks, and the learned prompt is then used to initialize the prompt for a new target task. This transfer notably improves the performance of frozen language models across a variety of tasks.
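A minimal sketch of the transfer step, assuming a PyTorch-style setup in which only the prompt embeddings are trainable and the backbone model stays frozen (the `SoftPrompt` module, dimensions, and training settings here are illustrative, not the paper's exact implementation):

```python
import torch
import torch.nn as nn

# Illustrative sizes: a 100-token soft prompt for a model with 768-dim embeddings.
PROMPT_LEN, D_MODEL = 100, 768

class SoftPrompt(nn.Module):
    """Trainable prompt vectors prepended to the frozen model's input embeddings."""
    def __init__(self, prompt_len=PROMPT_LEN, d_model=D_MODEL):
        super().__init__()
        self.embeddings = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, d_model) produced by the frozen backbone.
        batch = input_embeds.size(0)
        prompt = self.embeddings.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# 1) Train a prompt on one or more source tasks; only prompt parameters get gradients.
source_prompt = SoftPrompt()
# ... optimize source_prompt on the source task(s) with the backbone frozen ...

# 2) Transfer: initialize the target task's prompt from the trained source prompt,
#    then continue tuning it on the target task, again with the backbone frozen.
target_prompt = SoftPrompt()
target_prompt.embeddings.data.copy_(source_prompt.embeddings.data)
optimizer = torch.optim.Adam(target_prompt.parameters(), lr=0.3)  # illustrative settings
```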
- Efficiency and Performance: SPoT is not only competitive with but in many cases superior to full model tuning, despite training far fewer parameters. On the SuperGLUE benchmark, SPoT matches or outperforms full model tuning across model sizes, with up to a 27,000× reduction in task-specific parameters (a rough back-of-the-envelope calculation appears after this list).
- Study of Task Transferability: A comprehensive study of task transferability across 26 NLP tasks shows that many tasks can provide substantial benefit to one another via prompt transfer. This large-scale analysis, covering 160 source–target task combinations, highlights that even seemingly unrelated tasks can benefit from transfer.
- Prediction of Transferability: The paper introduces an efficient retrieval method that interprets task prompts as task embeddings. By constructing a semantic space of tasks, this approach predicts which source tasks are most likely to benefit a novel target task, streamlining source-task selection in transfer settings (see the sketch after this list).
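To see where a figure like 27,000× can come from, here is a rough back-of-the-envelope calculation using assumed sizes (a 100-token prompt and a T5 XXL-scale backbone of roughly 11B parameters with a 4096-dimensional embedding space); the paper's exact accounting may differ:

```python
# Back-of-the-envelope parameter accounting (assumed, illustrative sizes).
prompt_tokens = 100
embedding_dim = 4096                     # T5 XXL-scale model dimension
backbone_params = 11_000_000_000         # ~11B parameters, all kept frozen

prompt_params = prompt_tokens * embedding_dim   # 409,600 task-specific values
reduction = backbone_params / prompt_params     # ~26,855, i.e. roughly 27,000x
print(f"{prompt_params:,} prompt parameters, ~{reduction:,.0f}x fewer than full model tuning")
```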
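The retrieval idea can be pictured as below, assuming each task embedding is obtained by averaging that task's trained prompt vectors and candidate source tasks are ranked by cosine similarity (a simplification of the paper's construction; the function names and task names are hypothetical):

```python
import torch
import torch.nn.functional as F

def task_embedding(prompt: torch.Tensor) -> torch.Tensor:
    """Collapse a (prompt_len, d_model) prompt into one task-embedding vector
    by averaging over its prompt tokens (a simplified construction)."""
    return prompt.mean(dim=0)

def rank_source_tasks(target_prompt: torch.Tensor,
                      source_prompts: dict[str, torch.Tensor]) -> list[tuple[str, float]]:
    """Rank candidate source tasks by cosine similarity between their task
    embeddings and the target task's embedding; higher scores suggest more
    promising sources for prompt transfer."""
    target_vec = task_embedding(target_prompt)
    scores = {
        name: F.cosine_similarity(target_vec, task_embedding(p), dim=0).item()
        for name, p in source_prompts.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage with random stand-in prompts; real usage would load trained prompts.
library = {name: torch.randn(100, 768) for name in ["mnli", "squad", "record"]}
ranking = rank_source_tasks(torch.randn(100, 768), library)
```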
Implications and Future Directions
The findings of this paper have significant implications, both theoretical and practical. Theoretically, the work expands the horizons of prompt-based learning by showing that parameter-efficient adaptation of large pre-trained models can rival full fine-tuning, loosening the conventional assumption that strong task performance requires updating every weight. Practically, the method offers a considerable resource and computational advantage when deploying large models, making their application more feasible.
One anticipated future direction is applying similar transfer strategies to other parameter-efficient adaptation methods, such as adapter tuning or prefix tuning, to further generalize and optimize model deployment. Additionally, refining task embeddings to better capture domain- and task-specific nuances may yield even deeper transferability insights. This work also sets the stage for exploring combinations of soft prompt parameters with sparse task-specific adaptations for more effective task generalization.
Conclusion
Through Soft Prompt Transfer, this paper not only narrows the gap between parameter-efficient tuning and full model tuning but also establishes a robust framework for predicting task transferability. The results point to promising pathways for deploying large, frozen language models effectively across diverse NLP tasks, broadening the reach of deep learning innovations in resource-efficient computing environments.