MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning (2402.17263v2)

Published 27 Feb 2024 in cs.CL

Abstract: Parameter-efficient fine-tuning (PEFT) is a popular method for tailoring pre-trained LLMs, especially as models grow in scale and the diversity of tasks increases. Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional, i.e., significant model changes can be represented with relatively few parameters. However, decreasing the rank can lead to larger generalization errors on specific tasks compared to full-parameter fine-tuning. We present MELoRA, a mini-ensemble of low-rank adapters that uses fewer trainable parameters while maintaining a higher rank, thereby offering improved performance potential. The core idea is to freeze the original pretrained weights and train a group of mini LoRAs, each with only a small number of parameters. This captures a significant degree of diversity among the mini LoRAs, thus promoting better generalization. We conduct a theoretical analysis and empirical studies on various NLP tasks. Our experimental results show that, compared to LoRA, MELoRA achieves better performance with 8 times fewer trainable parameters on natural language understanding tasks and 36 times fewer trainable parameters on instruction-following tasks, which demonstrates the effectiveness of MELoRA.
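To make the mini-ensemble idea concrete, the sketch below gives one possible PyTorch reading of it. This is not the authors' implementation: the class name MiniEnsembleLoRALinear, its constructor arguments, and the initialization scheme are assumptions made for illustration. A frozen pretrained linear layer is augmented with n independent rank-r mini LoRAs, each acting on its own slice of the feature dimensions; arranging them block-diagonally lets the combined update reach rank n*r while keeping roughly the trainable-parameter budget of a single rank-r LoRA.

```python
# Illustrative sketch of the mini-ensemble LoRA idea described in the abstract.
# Not the authors' code: class and argument names are assumptions for illustration.
import torch
import torch.nn as nn


class MiniEnsembleLoRALinear(nn.Module):
    """A frozen linear layer plus n mini LoRA adapters applied block-diagonally."""

    def __init__(self, base: nn.Linear, n_minis: int = 4, r: int = 2, alpha: float = 16.0):
        super().__init__()
        assert base.in_features % n_minis == 0 and base.out_features % n_minis == 0
        self.base = base
        for p in self.base.parameters():              # freeze the pretrained weights
            p.requires_grad_(False)
        self.n = n_minis
        d_in = base.in_features // n_minis            # input slice handled by each mini LoRA
        d_out = base.out_features // n_minis          # output slice produced by each mini LoRA
        self.scaling = alpha / r
        # Each mini LoRA keeps its own down-projection A_i and up-projection B_i.
        self.A = nn.ParameterList(
            [nn.Parameter(torch.randn(r, d_in) * 0.01) for _ in range(n_minis)]
        )
        self.B = nn.ParameterList(
            [nn.Parameter(torch.zeros(d_out, r)) for _ in range(n_minis)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)                            # frozen pretrained path
        chunks = x.chunk(self.n, dim=-1)              # split input features into n groups
        updates = [
            (chunks[i] @ self.A[i].T) @ self.B[i].T   # rank-r update on slice i
            for i in range(self.n)
        ]
        # Concatenating the per-slice updates realizes a block-diagonal low-rank
        # update whose overall rank can reach n * r.
        return out + self.scaling * torch.cat(updates, dim=-1)


# Usage: wrap a pretrained projection and train only the mini LoRA parameters.
layer = MiniEnsembleLoRALinear(nn.Linear(768, 768), n_minis=4, r=2)
y = layer(torch.randn(8, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 3072 = 2 * 768 * 2: same budget as a single rank-2 LoRA on a 768-dim layer
```

Under these assumptions, the block-diagonal arrangement spends the same parameter budget as a single rank-r LoRA but yields a potentially higher-rank update, which is the trade-off the abstract describes.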

Authors (8)
  1. Pengjie Ren (95 papers)
  2. Chengshun Shi (1 paper)
  3. Shiguang Wu (15 papers)
  4. Mengqi Zhang (48 papers)
  5. Zhaochun Ren (117 papers)
  6. Maarten de Rijke (261 papers)
  7. Zhumin Chen (78 papers)
  8. Jiahuan Pei (16 papers)
Citations (7)