PEFTT: Parameter-Efficient Fine-Tuning for low-resource Tibetan pre-trained language models (2309.12109v1)
Abstract: In the era of LLMs, training models from scratch has become increasingly impractical for ordinary users and institutions. Efficient fine-tuning of these models for high-resource languages is therefore a clear and growing trend. However, there has been very little corresponding exploration for low-resource languages such as Tibetan, and research in Tibetan NLP remains scarce and limited. Although no Tibetan LLM exists yet because of the language's low-resource nature, one will undoubtedly appear, so research on efficient fine-tuning for low-resource languages like Tibetan is highly necessary; our work can serve as a reference to help fill this gap. Efficient fine-tuning strategies for Tibetan pre-trained language models (PLMs) have seen minimal exploration. We conducted three types of efficient fine-tuning experiments on the publicly available TNCC-title dataset: prompt-tuning, Adapter lightweight fine-tuning, and prompt-tuning combined with Adapter fine-tuning. The experimental results demonstrate significant improvements from these methods and provide valuable insights for advancing Tibetan language applications built on pre-trained models.
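The abstract names three parameter-efficient strategies (prompt-tuning, Adapter fine-tuning, and their combination) but gives no implementation details. Below is a minimal illustrative sketch of the prompt-tuning setting, assuming the Hugging Face `transformers` and `peft` libraries; the backbone model name, label count, and prompt length are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch of prompt-tuning for news-title classification.
# NOT the authors' code: the backbone model, label count, and settings
# below are illustrative assumptions.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PromptTuningConfig, TaskType, get_peft_model

base_model = "bert-base-multilingual-cased"  # placeholder; a Tibetan PLM would be used in practice
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(
    base_model,
    num_labels=12,  # assumed number of TNCC-title news categories
)

# Prompt-tuning: prepend trainable "virtual token" embeddings to the input
# while keeping the backbone PLM frozen, so only a tiny parameter set is updated.
peft_config = PromptTuningConfig(
    task_type=TaskType.SEQ_CLS,
    num_virtual_tokens=20,  # illustrative prompt length
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # reports the small trainable fraction

# The wrapped model can then be trained with any standard loop
# (e.g. transformers.Trainer) on tokenized TNCC-title examples.
```

Adapter fine-tuning would follow the same pattern with a bottleneck-adapter library in place of the prompt-tuning configuration, and the combined setting would train both the prompt embeddings and the adapter modules while keeping the backbone frozen.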
Authors: Zhou Mingjun, Daiqing Zhuoma, Qun Nuo, Nyima Tashi