MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning (2402.17263v2)
Abstract: Parameter-efficient fine-tuning (PEFT) is a popular method for tailoring pre-trained LLMs, especially as model scale and task diversity increase. Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional, i.e., significant model changes can be represented with relatively few parameters. However, decreasing the rank can lead to larger generalization errors on specific tasks compared to full-parameter fine-tuning. We present MELoRA, a mini-ensemble of low-rank adapters that uses fewer trainable parameters while maintaining a higher rank, thereby offering improved performance potential. The core idea is to freeze the original pretrained weights and train a group of mini LoRAs, each with only a small number of parameters. This captures a significant degree of diversity among the mini LoRAs, thus promoting better generalization. We conduct a theoretical analysis and empirical studies on a range of NLP tasks. Our experimental results show that, compared to LoRA, MELoRA achieves better performance with 8 times fewer trainable parameters on natural language understanding tasks and 36 times fewer trainable parameters on instruction-following tasks, which demonstrates its effectiveness.
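The following is a minimal, illustrative PyTorch sketch of the idea described above: a frozen linear layer plus a group of mini LoRAs whose combined update is arranged block-diagonally, so the effective rank grows with the number of mini LoRAs while the trainable parameter count stays that of a single small-rank LoRA. The class and argument names (`MiniEnsembleLoRALinear`, `n_mini`, `mini_rank`) are my own and not from the paper's released code; it also assumes the layer's input and output dimensions are divisible by the number of mini LoRAs.

```python
# Illustrative sketch of the mini-ensemble LoRA idea; not the authors' implementation.
import torch
import torch.nn as nn


class MiniEnsembleLoRALinear(nn.Module):
    """Frozen linear layer plus n_mini small LoRA pairs applied block-diagonally.

    Each mini LoRA adapts one (out_features/n_mini) x (in_features/n_mini) block,
    so the combined update has rank up to n_mini * mini_rank while the trainable
    parameter count equals that of a single rank-`mini_rank` LoRA on the full matrix.
    """

    def __init__(self, base: nn.Linear, n_mini: int = 8, mini_rank: int = 1, scaling: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weights
            p.requires_grad_(False)
        d_in, d_out = base.in_features, base.out_features
        assert d_in % n_mini == 0 and d_out % n_mini == 0
        self.n_mini, self.scaling = n_mini, scaling
        # Down-projections A_i: (d_in/n_mini) -> mini_rank, up-projections B_i: mini_rank -> (d_out/n_mini)
        self.A = nn.Parameter(torch.randn(n_mini, d_in // n_mini, mini_rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_mini, mini_rank, d_out // n_mini))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., d_in). Split the features into n_mini groups and apply each mini LoRA
        # to its own slice, which is equivalent to adding a block-diagonal delta-W.
        *batch, d_in = x.shape
        xs = x.reshape(*batch, self.n_mini, d_in // self.n_mini)
        delta = torch.einsum("...ni,nir,nro->...no", xs, self.A, self.B)
        delta = delta.reshape(*batch, -1) * self.scaling
        return self.base(x) + delta


# Usage: wrap an existing projection, then train only the mini-LoRA parameters.
layer = MiniEnsembleLoRALinear(nn.Linear(768, 768), n_mini=8, mini_rank=1)
out = layer(torch.randn(2, 16, 768))              # shape: (2, 16, 768)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
```

Under these assumptions, a single LoRA of rank n_mini * mini_rank would need n_mini times more trainable parameters than the sketch above for the same maximum update rank, which is the trade-off the abstract refers to.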
Authors: Pengjie Ren, Chengshun Shi, Shiguang Wu, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei