JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning (2310.10083v2)
Abstract: In the ongoing wave of impact driven by LLMs like ChatGPT, the adaptation of LLMs to the medical domain has emerged as a crucial research frontier. Since mainstream LLMs tend to be designed for general-purpose applications, constructing a medical LLM through domain adaptation is a major challenge. While instruction-tuning is used to fine-tune some LLMs, its precise role in domain adaptation remains unknown. Here we show the contribution of LoRA-based instruction-tuning to performance on Japanese medical question-answering tasks. In doing so, we employ a multifaceted evaluation for multiple-choice questions, including scoring based on "Exact match" and "Gestalt distance" in addition to the conventional accuracy. Our findings suggest that LoRA-based instruction-tuning can partially incorporate domain-specific knowledge into LLMs, with larger models demonstrating more pronounced effects. Furthermore, our results underscore the potential of adapting English-centric models for Japanese applications in domain adaptation, while also highlighting the persistent limitations of Japanese-centric models. This initiative represents a pioneering effort toward enabling medical institutions to fine-tune and operate models without relying on external services.
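The "Exact match" and "Gestalt distance" scores mentioned above can be illustrated with a minimal Python sketch. This assumes the generated answer is compared to the gold answer string using difflib's Ratcliff/Obershelp (gestalt pattern matching) ratio, which is one common way to compute such a similarity; it is not necessarily the paper's exact scoring pipeline, and the example strings are invented for illustration.

```python
from difflib import SequenceMatcher

def exact_match(prediction: str, reference: str) -> float:
    """1.0 only if the generated answer equals the gold answer after trimming whitespace."""
    return 1.0 if prediction.strip() == reference.strip() else 0.0

def gestalt_similarity(prediction: str, reference: str) -> float:
    """Similarity in [0, 1] from Ratcliff/Obershelp gestalt pattern matching (difflib's ratio).
    A corresponding "Gestalt distance" could be taken as 1 - ratio."""
    return SequenceMatcher(None, prediction.strip(), reference.strip()).ratio()

# Illustrative (invented) example: the model outputs the correct answer text plus extra tokens.
reference = "アスピリン"        # gold answer text
prediction = "アスピリン投与"    # model output

print(exact_match(prediction, reference))        # 0.0   -> no credit under exact match
print(gestalt_similarity(prediction, reference)) # ~0.83 -> partial credit under gestalt similarity
```

Under this kind of scoring, an output that contains the gold answer plus extra tokens receives no exact-match credit but a high gestalt similarity, which is the sort of partial knowledge a multifaceted evaluation is meant to surface alongside conventional accuracy.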
- QLoRA: Efficient finetuning of quantized LLMs. arXiv preprint arXiv:2305.14314, 2023.
- LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022.
- Evaluating GPT-4 and ChatGPT on Japanese medical licensing examinations. arXiv preprint arXiv:2303.18027, 2023.
- JGLUE: Japanese general language understanding evaluation. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2957–2966, 2022.
- PEFT: State-of-the-art parameter-efficient fine-tuning methods. https://github.com/huggingface/peft, 2022.
- Large language models sensitivity to the order of options in multiple-choice questions. arXiv preprint arXiv:2308.11483, 2023.
- Large language models encode clinical knowledge. Nature, pages 1–9, 2023.
- Towards expert-level medical question answering with large language models. arXiv preprint arXiv:2305.09617, 2023.
- JMedRoBERTa: a Japanese pre-trained language model on academic articles in medical sciences (in Japanese). In Proceedings of the 29th Annual Meeting of the Association for Natural Language Processing, 2023.
- From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models. arXiv preprint arXiv:2309.03412, 2023.
- Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.
- Towards generalist biomedical AI. arXiv preprint arXiv:2307.14334, 2023.
- Finetuned language models are zero-shot learners. In International Conference on Learning Representations, 2022.
- On large language models’ selection bias in multi-choice questions. arXiv preprint arXiv:2309.03882, 2023.
- LIMA: Less is more for alignment. arXiv preprint arXiv:2305.11206, 2023.
- Issey Sukeda
- Masahiro Suzuki
- Hiroki Sakaji
- Satoshi Kodera