Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion (2312.14327v1)

Published 21 Dec 2023 in cs.CL

Abstract: Abbreviation expansion is a strategy used to speed up communication by limiting the amount of typing and using a language model to suggest expansions. Here we look at personalizing a Large Language Model's (LLM) suggestions based on prior conversations to enhance the relevance of predictions, particularly when the user data is small (~1000 samples). Specifically, we compare fine-tuning, prompt-tuning, and retrieval-augmented generation of expanded text suggestions for abbreviated inputs. Our case study with a deployed 8B-parameter LLM on a real user living with ALS, together with experiments on movie character personalization, indicates that (1) customization may be necessary in some scenarios, and prompt-tuning generalizes well to those; (2) fine-tuning on in-domain data (with as few as 600 samples) still shows some gains; (3) however, retrieval-augmented few-shot selection also outperforms fine-tuning; and (4) parameter-efficient tuning allows for efficient and scalable personalization. For prompt-tuning, we also find that initializing the learned "soft prompts" to user-relevant concept tokens leads to higher accuracy than random initialization.
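
The paper's personalized 8B model and user data are not public, so the sketch below only illustrates the general idea behind one technique named in the abstract: prompt-tuning in which the learned soft prompt is initialized from embeddings of user-relevant concept tokens rather than random vectors. It uses Hugging Face Transformers with a small stand-in model; the model choice, concept words, and abbreviated input are hypothetical placeholders, not the paper's actual setup.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM

# Small stand-in model purely for illustration; the paper uses a deployed
# 8B-parameter LLM that is not publicly available.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Freeze the base model: in prompt-tuning only the soft prompt is trained.
for param in model.parameters():
    param.requires_grad = False

# Initialize the soft prompt from embeddings of user-relevant concept tokens
# (hypothetical examples) instead of random vectors.
concept_words = "family caregiving music wheelchair"
concept_ids = tokenizer(concept_words, return_tensors="pt").input_ids[0]
with torch.no_grad():
    concept_embeds = model.get_input_embeddings()(concept_ids)
soft_prompt = nn.Parameter(concept_embeds.clone())  # shape: (prompt_len, hidden)

def forward_with_soft_prompt(input_ids):
    """Prepend the trainable soft prompt to the input token embeddings."""
    token_embeds = model.get_input_embeddings()(input_ids)  # (B, T, H)
    prompt = soft_prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
    return model(inputs_embeds=torch.cat([prompt, token_embeds], dim=1))

# A training step would compute a loss over the expanded-text targets and
# update only `soft_prompt`; here we just run a forward pass on a
# hypothetical abbreviated input.
abbrev_ids = tokenizer("hru tdy", return_tensors="pt").input_ids
logits = forward_with_soft_prompt(abbrev_ids).logits
print(logits.shape)  # (1, prompt_len + input_len, vocab_size)
```

Under a setup like this, only the soft prompt (a few thousand parameters) needs to be stored per user while the base model is shared, which is what makes parameter-efficient tuning attractive for scalable personalization.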

Authors (3)
  1. Katrin Tomanek (16 papers)
  2. Shanqing Cai (8 papers)
  3. Subhashini Venugopalan (35 papers)
Citations (1)
