Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion (2405.11464v3)

Published 19 May 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Prompt tuning is a promising method for fine-tuning a pre-trained LLM without retraining its large-scale parameters. Instead, it attaches a soft prompt to the input text, so that downstream tasks can be adapted by merely learning the embeddings of the prompt tokens. Nevertheless, existing methods still suffer from two challenges: (i) they struggle to balance accuracy and efficiency, since a longer (shorter) soft prompt generally yields better (worse) accuracy at the cost of more (less) training time; and (ii) performance may not be consistent when adapting to different downstream tasks, which we attribute to a single embedding space being responsible for the differing requirements of those tasks. To address these issues, we propose an Efficient Prompt Tuning method (EPT) based on multi-space projection and prompt fusion. Specifically, it decomposes a given soft prompt into a shorter prompt and two low-rank matrices, significantly reducing training time. Accuracy is also enhanced by leveraging the low-rank matrices and the short prompt as additional knowledge sources to enrich the semantics of the original short prompt. In addition, we project the soft prompt into multiple subspaces to improve performance consistency, and then adaptively learn the combination weights of the different spaces through a gating network. Experiments on 13 natural language processing downstream tasks show that our method significantly and consistently outperforms 11 comparison methods, with relative improvements of up to 12.9% and a 14% reduction in training time.
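
The decomposition and gated multi-space fusion described in the abstract can be pictured with a short, self-contained sketch. This is not the authors' implementation: the module name, tensor shapes, and the exact way the short prompt and low-rank factors are fused are illustrative assumptions based only on the abstract.

```python
import torch
import torch.nn as nn


class EPTPromptSketch(nn.Module):
    """Illustrative sketch (not the paper's code): a short soft prompt plus two
    low-rank matrices, projected into several subspaces whose outputs are fused
    by a learned gating network."""

    def __init__(self, full_len=100, short_len=20, rank=8,
                 embed_dim=768, num_spaces=4):
        super().__init__()
        # Short soft prompt: far fewer trainable tokens than a full-length prompt.
        self.short_prompt = nn.Parameter(0.02 * torch.randn(short_len, embed_dim))
        # Two low-rank matrices whose product stands in for the capacity of a
        # full-length (full_len x embed_dim) soft prompt.
        self.low_rank_a = nn.Parameter(0.02 * torch.randn(full_len, rank))
        self.low_rank_b = nn.Parameter(0.02 * torch.randn(rank, embed_dim))
        # One linear projection per subspace (names and shapes are assumptions).
        self.space_proj = nn.ModuleList(
            [nn.Linear(embed_dim, embed_dim) for _ in range(num_spaces)]
        )
        # Gating network producing per-subspace combination weights.
        self.gate = nn.Sequential(nn.Linear(embed_dim, num_spaces),
                                  nn.Softmax(dim=-1))

    def forward(self) -> torch.Tensor:
        short_len = self.short_prompt.size(0)
        # Low-rank reconstruction of a full-length prompt.
        full_prompt = self.low_rank_a @ self.low_rank_b        # (full_len, embed_dim)
        # Fusion: enrich the leading positions with the short prompt.
        fused_head = full_prompt[:short_len] + self.short_prompt
        full_prompt = torch.cat([fused_head, full_prompt[short_len:]], dim=0)
        # Project into multiple subspaces and combine with gated weights.
        spaces = torch.stack([proj(full_prompt) for proj in self.space_proj])
        weights = self.gate(full_prompt.mean(dim=0))            # (num_spaces,)
        return (weights.view(-1, 1, 1) * spaces).sum(dim=0)     # (full_len, embed_dim)


if __name__ == "__main__":
    prompt = EPTPromptSketch()()   # trainable prompt embeddings
    print(prompt.shape)            # torch.Size([100, 768])
    # In prompt tuning, these embeddings would be prepended to the input token
    # embeddings of a frozen language model; only this module is trained.
```

In the sketch, only the short prompt, the low-rank factors, the subspace projections, and the gate carry gradients while the backbone LLM stays frozen, which reflects the parameter- and time-efficiency the abstract claims; the specific fusion rule shown here is an assumption, not the paper's exact formulation.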

Authors (6)
  1. Pengxiang Lan (3 papers)
  2. Enneng Yang (24 papers)
  3. Yuting Liu (62 papers)
  4. Guibing Guo (35 papers)
  5. Jianzhe Zhao (14 papers)
  6. Xingwei Wang (35 papers)