Personalized LoRA for Human-Centered Text Understanding (2403.06208v1)
Abstract: Effectively and efficiently adapting a pre-trained language model (PLM) for human-centered text understanding (HCTU) is challenging, since user tokens number in the millions in most personalized applications and carry no concrete explicit semantics. A standard parameter-efficient approach (e.g., LoRA) would require memorizing a separate suite of adapters for each user. In this work, we introduce personalized LoRA (PLoRA) with a plug-and-play (PnP) framework for the HCTU task. PLoRA is effective, parameter-efficient, and dynamically deployable in PLMs. Moreover, personalized dropout and mutual-information-maximization strategies are adopted, so PLoRA adapts well to few/zero-shot learning scenarios and mitigates the cold-start issue. Experiments on four benchmark datasets show that the proposed method outperforms existing methods in full/few/zero-shot learning scenarios for the HCTU task, despite having fewer trainable parameters. For reproducibility, the code for this paper is available at: https://github.com/yoyo-yun/PLoRA.
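To make the parameter-efficiency claim concrete, below is a minimal PyTorch sketch of what a personalized LoRA layer could look like. The abstract does not spell out the mechanism, so this is a sketch under stated assumptions, not the paper's implementation: the `PLoRALinear` module, the per-user gating vector over the low-rank dimensions, and the drop-to-ones form of "personalized dropout" are all illustrative choices. What it does demonstrate is the central idea of sharing one low-rank adapter across users while storing only a tiny personalization vector per user.

```python
# Hypothetical sketch of a personalized LoRA ("PLoRA") layer. Assumed design:
# a shared LoRA pair (A, B) plus a rank-sized embedding per user that gates
# the low-rank path. Names and shapes are illustrative, not from the paper.
import torch
import torch.nn as nn


class PLoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int,
                 num_users: int, rank: int = 8, alpha: float = 16.0,
                 p_drop: float = 0.1):
        super().__init__()
        # Frozen pre-trained weight, as in standard LoRA.
        self.weight = nn.Parameter(torch.empty(out_features, in_features),
                                   requires_grad=False)
        nn.init.normal_(self.weight, std=0.02)
        # Shared trainable low-rank adapters: B is zero-initialized so the
        # layer starts out identical to the frozen linear map.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank
        # One rank-sized personalization vector per user, instead of a full
        # adapter suite per user; initialized to ones (= plain shared LoRA).
        self.user_emb = nn.Embedding(num_users, rank)
        nn.init.ones_(self.user_emb.weight)
        # Assumed "personalized dropout": randomly drop the user signal
        # during training so a user-agnostic (zero-shot) path is also learned.
        self.p_drop = p_drop

    def forward(self, x: torch.Tensor, user_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, in_features); user_ids: (batch,)
        base = x @ self.weight.T
        gate = self.user_emb(user_ids)                      # (batch, rank)
        if self.training and self.p_drop > 0:
            keep = (torch.rand(gate.size(0), 1, device=gate.device)
                    > self.p_drop).float()
            gate = gate * keep + (1.0 - keep)               # dropped -> all-ones gate
        low_rank = x @ self.lora_A.T                        # (batch, seq, rank)
        low_rank = low_rank * gate.unsqueeze(1)             # per-user gating
        return base + (low_rank @ self.lora_B.T) * self.scaling


# Usage: hidden size 768, rank 8 -> each extra user costs only 8 parameters
# here, versus ~12k for a separate rank-8 (A, B) adapter pair per user.
layer = PLoRALinear(768, 768, num_users=1000)
out = layer(torch.randn(4, 16, 768), torch.tensor([0, 1, 2, 3]))
```

Initializing the user gates to ones is the design choice that makes the cold-start story plausible in this sketch: an unseen or dropped user falls back to the shared adapter, which is exactly the zero-shot behavior the abstract describes.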