BiMediX: Bilingual Medical Mixture of Experts LLM (2402.13253v2)

Published 20 Feb 2024 in cs.CL

Abstract: In this paper, we introduce BiMediX, the first bilingual medical mixture of experts LLM designed for seamless interaction in both English and Arabic. Our model facilitates a wide range of medical interactions in English and Arabic, including multi-turn chats to inquire about additional details such as patient symptoms and medical history, multiple-choice question answering, and open-ended question answering. We propose a semi-automated English-to-Arabic translation pipeline with human refinement to ensure high-quality translations. We also introduce a comprehensive evaluation benchmark for Arabic medical LLMs. Furthermore, we introduce BiMed1.3M, an extensive Arabic-English bilingual instruction set covering 1.3 Million diverse medical interactions, resulting in over 632 million healthcare specialized tokens for instruction tuning. Our BiMed1.3M dataset includes 250k synthesized multi-turn doctor-patient chats and maintains a 1:2 Arabic-to-English ratio. Our model outperforms state-of-the-art Med42 and Meditron by average absolute gains of 2.5% and 4.1%, respectively, computed across multiple medical evaluation benchmarks in English, while operating at 8-times faster inference. Moreover, our BiMediX outperforms the generic Arabic-English bilingual LLM, Jais-30B, by average absolute gains of 10% on our Arabic medical benchmark and 15% on bilingual evaluations across multiple datasets. Our project page with source code and trained model is available at https://github.com/mbzuai-oryx/BiMediX .

Authors (7)
  1. Sara Pieri (5 papers)
  2. Sahal Shaji Mullappilly (9 papers)
  3. Fahad Shahbaz Khan (225 papers)
  4. Rao Muhammad Anwer (67 papers)
  5. Salman Khan (244 papers)
  6. Timothy Baldwin (125 papers)
  7. Hisham Cholakkal (78 papers)

Summary

Overview of "BiMediX: Bilingual Medical Mixture of Experts LLM"

The paper introduces BiMediX, a bilingual medical LLM designed to handle medical inquiries seamlessly in both English and Arabic. The central challenge it addresses is the absence of an LLM capable of effectively managing medical conversations across these two languages, particularly the multi-turn interactions that are crucial in medical consultations. BiMediX supports a diverse range of medical interactions, including multi-turn chats, multiple-choice question answering (MCQA), and open-ended question answering (QA).

The research introduces a semi-automated English-to-Arabic translation pipeline that combines machine translation with human refinement, ensuring the translations maintain high fidelity to the original text. Additionally, the paper presents BiMed1.3M, a comprehensive bilingual instruction-tuning dataset comprising over 1.3 million diverse medical interactions. This resource helps bridge the gap in Arabic medical language processing, which has traditionally been constrained by a lack of resources; a minimal sketch of such a semi-automated translation flow is given below.

Key Contributions and Numerical Results

BiMediX shows substantial advances over leading medical LLMs such as Med42 and Meditron, with average absolute improvements of 2.5% and 4.1%, respectively, across multiple English medical evaluation benchmarks. Remarkably, it achieves these gains while offering 8-times faster inference, demonstrating both efficiency and superior performance. Furthermore, BiMediX outperforms the generic bilingual Arabic-English LLM Jais-30B by an average absolute gain of 10% on the paper's Arabic medical benchmark and by 15% on bilingual evaluations across multiple datasets.

The novel BiMed1.3M dataset is crucial to this performance leap. It comprises over 632 million healthcare-specialized tokens, includes roughly 250,000 synthesized multi-turn doctor-patient chats, and maintains a 1:2 Arabic-to-English ratio. The dataset, combined with parameter-efficient tuning of a mixture-of-experts architecture, underpins BiMediX's bilingual capabilities and substantial performance gains.

Theoretical and Practical Implications

A bilingual medical LLM capable of seamless interaction in Arabic and English has substantial practical implications. It improves the accessibility and accuracy of medical diagnosis support and consultation for Arabic-speaking populations, addressing an important gap given the linguistic and resource constraints faced by previous models. The mixture-of-experts architecture also brings practical efficiency, delivering strong performance with lower computational overhead at inference time, which is critical for real-time use in medical settings.

Theoretically, the paper advances LLM capabilities by exploring the utility of bilingual datasets and domain-specific translation pipelines. It reflects on methodologies for overcoming constraints in language-specific resources and provides a benchmark for evaluating bilingual medical LLMs, which could guide future efforts to add support for additional languages.

Future Directions

The developments and contributions discussed suggest several avenues for future research. Extending multilingual capability by incorporating additional languages and examining the scalability of BiMediX's architecture across a broader set of language pairs are promising directions. Further exploration of domain-specific fine-tuning methodologies could enhance the model's practical applications in diverse real-world scenarios, including specialized medical fields beyond the current dataset's scope.

The paper by Pieri et al. offers a substantial leap in bilingual LLM development, bridging critical gaps in medical AI applications for non-English speaking populations and setting the groundwork for further advancements in multilingual medical AI technologies.
