
JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning (2310.10083v2)

Published 16 Oct 2023 in cs.CL

Abstract: In the ongoing wave of impact driven by LLMs like ChatGPT, the adaptation of LLMs to the medical domain has emerged as a crucial research frontier. Since mainstream LLMs tend to be designed for general-purpose applications, constructing a medical LLM through domain adaptation is a huge challenge. While instruction-tuning is used to fine-tune some LLMs, its precise role in domain adaptation remains unknown. Here we show the contribution of LoRA-based instruction-tuning to performance in Japanese medical question-answering tasks. In doing so, we employ a multifaceted evaluation for multiple-choice questions, including scoring based on "Exact match" and "Gestalt distance" in addition to the conventional accuracy. Our findings suggest that LoRA-based instruction-tuning can partially incorporate domain-specific knowledge into LLMs, with larger models demonstrating more pronounced effects. Furthermore, our results underscore the potential of adapting English-centric models for Japanese applications in domain adaptation, while also highlighting the persisting limitations of Japanese-centric models. This initiative represents a pioneering effort in enabling medical institutions to fine-tune and operate models without relying on external services.
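The evaluation described in the abstract scores model outputs in three ways: conventional multiple-choice accuracy, exact string match against the gold answer, and a "Gestalt distance" that gives partial credit to near-miss answers. Below is a minimal sketch of the latter two metrics, assuming the Gestalt measure corresponds to the Ratcliff/Obershelp gestalt pattern matching implemented in Python's difflib (reported here as a similarity ratio); the example question item and answer strings are hypothetical and not drawn from the paper's benchmark.

    import difflib

    def exact_match(prediction: str, reference: str) -> float:
        # 1.0 only when the generated answer matches the gold answer string exactly
        return float(prediction.strip() == reference.strip())

    def gestalt_similarity(prediction: str, reference: str) -> float:
        # Ratcliff/Obershelp (gestalt) similarity in [0, 1]; rewards partially correct answers
        return difflib.SequenceMatcher(None, prediction.strip(), reference.strip()).ratio()

    # Hypothetical medical multiple-choice item (illustrative only)
    gold = "Aortic dissection"
    generated = "Acute aortic dissection"

    print(exact_match(generated, gold))         # 0.0 -- strict scoring
    print(gestalt_similarity(generated, gold))  # ~0.85 -- partial credit for a near-miss

Scoring generations this way complements plain accuracy, which only checks whether the selected choice is correct, by distinguishing answers that are close to the gold string from ones that are entirely off.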

Authors (4)
  1. Issey Sukeda (9 papers)
  2. Masahiro Suzuki (55 papers)
  3. Hiroki Sakaji (21 papers)
  4. Satoshi Kodera (7 papers)
Citations (5)