
EpilepsyLLM: Domain-Specific Large Language Model Fine-tuned with Epilepsy Medical Knowledge (2401.05908v1)

Published 11 Jan 2024 in cs.CL and cs.LG

Abstract: With large training datasets and massive computing resources, LLMs achieve remarkable comprehension and generation abilities. Fine-tuning such powerful LLMs on domain-specific datasets yields models with more specialized knowledge that are more useful in practice, such as medical LLMs. However, existing fine-tuned medical LLMs are limited to general medical knowledge in English. For disease-specific questions, their responses are inaccurate and sometimes completely irrelevant, especially when a language other than English is used. In this work, we focus on the specific disease of epilepsy in the Japanese language and introduce a customized LLM termed EpilepsyLLM. Our model is obtained by fine-tuning a pre-trained LLM on datasets from the epilepsy domain. The datasets contain basic information about the disease, common treatment methods and drugs, and important notes for daily life and work. The experimental results demonstrate that EpilepsyLLM provides more reliable and specialized medical responses.
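
The abstract describes the workflow only at a high level: take a pre-trained LLM and fine-tune it on epilepsy-domain instruction/response data. The sketch below illustrates that kind of domain-specific fine-tuning with the Hugging Face transformers and datasets libraries. It is a minimal illustration under stated assumptions, not the authors' implementation: the base checkpoint name, dataset file, Japanese prompt template, and hyperparameters are all hypothetical placeholders.

```python
# Minimal sketch of domain-specific instruction fine-tuning with Hugging Face
# transformers/datasets. Base checkpoint, data file, prompt template, and
# hyperparameters are illustrative assumptions, not the paper's exact setup.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "meta-llama/Llama-2-7b-hf"          # assumed base LLM
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token        # LLaMA-style tokenizers lack a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Hypothetical JSON-lines file: {"instruction": "...", "response": "..."}
# covering basic disease facts, treatments/drugs, and daily-life guidance.
data = load_dataset("json", data_files="epilepsy_qa_ja.jsonl", split="train")

def to_features(example):
    # Assumed Japanese prompt template: question followed by answer.
    text = f"### 質問:\n{example['instruction']}\n### 回答:\n{example['response']}"
    return tokenizer(text, truncation=True, max_length=512)

tokenized = data.map(to_features, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="epilepsy_llm",
        per_device_train_batch_size=4,
        num_train_epochs=3,
        learning_rate=2e-5,
        fp16=True,
    ),
    train_dataset=tokenized,
    # Causal-LM collator (mlm=False) pads batches and derives labels from inputs.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("epilepsy_llm")
```

In practice, parameter-efficient variants (e.g., LoRA adapters) are often substituted for full fine-tuning when compute is limited; the abstract does not specify which approach the authors used.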

Authors (3)
  1. Xuyang Zhao (13 papers)
  2. Qibin Zhao (66 papers)
  3. Toshihisa Tanaka (19 papers)
Citations (1)