
Skeleton: A New Framework for Accelerating Language Models via Task Neuron Localized Prompt Tuning (2404.11916v2)

Published 18 Apr 2024 in cs.CL and cs.AI

Abstract: Prompt tuning methods, as parameter-efficient fine-tuning (PEFT) methods, have shown performance comparable to general training methods on various natural language understanding tasks. However, existing prompt tuning methods still use the entire model architecture even when solving a specific task, which prevents them from accelerating inference at application time. In this paper, we propose Skeleton, a novel prompt tuning framework that uses an LLM efficiently in terms of memory and time complexity across tasks by retaining only task-relevant neurons, identified with an explainability method. With our framework, various tasks can be solved efficiently by using only task-relevant neurons and prepending appropriate task-specific prompt tokens to a single LLM. Experiments show that our method significantly improves inference efficiency (up to a 1.73x speedup) on widely used benchmarks while matching the performance of standard prompt tuning. Moreover, the method is applicable across various transformer-based architectures, confirming its practicality and scalability.

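The abstract describes two ingredients: locating task-relevant neurons with an explainability (attribution) method, and then serving the task with only those neurons plus prepended task-specific prompt tokens, all on a single frozen LLM. The sketch below is not the paper's implementation; it is a minimal PyTorch illustration under assumed simplifications: a single toy feed-forward block stands in for a full transformer layer, mean |activation * gradient| stands in for the paper's explainability score, and all names (`TinyFFN`, `attribution_scores`, `prune_to_task_neurons`, `soft_prompt`, `keep`) are hypothetical.

```python
# Minimal sketch (not the authors' code) of the two ideas in the abstract:
#   1) score hidden neurons of a feed-forward block for a task
#      (mean |activation * gradient| is used here as a stand-in attribution),
#   2) keep only the top-scoring "task neurons" and prepend trainable
#      soft-prompt embeddings, with the pruned backbone kept frozen.

import torch
import torch.nn as nn


class TinyFFN(nn.Module):
    """A single transformer-style feed-forward block: d_model -> d_hidden -> d_model."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.act = nn.GELU()
        self.down = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(self.act(self.up(x)))


def attribution_scores(ffn: TinyFFN, x: torch.Tensor, loss_fn) -> torch.Tensor:
    """Score each hidden neuron by mean |activation * gradient| over a task batch."""
    hidden = ffn.act(ffn.up(x))                 # (batch, seq, d_hidden)
    hidden.retain_grad()                        # keep the gradient of this intermediate
    loss_fn(ffn.down(hidden)).backward()
    return (hidden.detach() * hidden.grad).abs().mean(dim=(0, 1))   # (d_hidden,)


def prune_to_task_neurons(ffn: TinyFFN, scores: torch.Tensor, keep: int) -> TinyFFN:
    """Return a smaller FFN that keeps only the top-`keep` scoring hidden neurons."""
    idx = torch.topk(scores, keep).indices
    small = TinyFFN(ffn.up.in_features, keep)
    with torch.no_grad():
        small.up.weight.copy_(ffn.up.weight[idx])
        small.up.bias.copy_(ffn.up.bias[idx])
        small.down.weight.copy_(ffn.down.weight[:, idx])
        small.down.bias.copy_(ffn.down.bias)
    return small


if __name__ == "__main__":
    torch.manual_seed(0)
    d_model, d_hidden, seq, batch = 16, 64, 8, 4
    ffn = TinyFFN(d_model, d_hidden)

    # A toy "task batch" and loss stand in for a real downstream dataset/objective.
    x = torch.randn(batch, seq, d_model)
    loss_fn = lambda out: out.pow(2).mean()

    scores = attribution_scores(ffn, x, loss_fn)
    pruned = prune_to_task_neurons(ffn, scores, keep=16)   # keep 25% of hidden neurons

    # Prompt tuning: freeze the pruned block, train only soft-prompt embeddings
    # that are prepended to the input sequence.
    for p in pruned.parameters():
        p.requires_grad_(False)
    n_prompt = 4
    soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)

    prompts = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
    out = pruned(torch.cat([prompts, x], dim=1))    # (batch, n_prompt + seq, d_model)
    print(out.shape)
```

The point this toy setup tries to mirror is that pruning is done per task while the backbone weights stay frozen and shared, so one LLM plus small per-task artifacts (the kept-neuron indices and the soft prompt) can serve many tasks with faster inference.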