BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains (2402.10373v3)

Published 15 Feb 2024 in cs.CL, cs.AI, and cs.LG

Abstract: LLMs have demonstrated remarkable versatility in recent years, offering potential applications across specialized domains such as healthcare and medicine. Despite the availability of various open-source LLMs tailored for health contexts, adapting general-purpose LLMs to the medical domain presents significant challenges. In this paper, we introduce BioMistral, an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model and further pre-trained on PubMed Central. We conduct a comprehensive evaluation of BioMistral on a benchmark comprising 10 established medical question-answering (QA) tasks in English. We also explore lightweight models obtained through quantization and model merging approaches. Our results demonstrate BioMistral's superior performance compared to existing open-source medical models and its competitive edge against proprietary counterparts. Finally, to address the limited availability of data beyond English and to assess the multilingual generalization of medical LLMs, we automatically translated and evaluated this benchmark into 7 other languages. This marks the first large-scale multilingual evaluation of LLMs in the medical domain. Datasets, multilingual evaluation benchmarks, scripts, and all the models obtained during our experiments are freely released.

Enhancing Medical Domain Understanding with BioMistral: Open-Source Pretrained LLMs

Introduction to BioMistral

The paper presents BioMistral, a collection of open-source pretrained LLMs optimized for applications within the medical domain. Built on the Mistral foundation model and further pre-trained on PubMed Central, BioMistral represents a significant step towards making robust, domain-specific NLP capabilities more accessible to researchers and practitioners in healthcare and medicine.

Distinctive Features of BioMistral

BioMistral introduces several innovations and improvements over existing medical LLMs:

  • Tailored Domain Optimization: Further pre-training Mistral on a curated subset of PubMed Central yields superior performance across a wide array of medical QA tasks.
  • Multilingual Evaluation: The benchmark of 10 medical QA tasks is translated into seven additional languages, enabling an assessment of the multilingual efficacy of medical LLMs at a scale previously unexplored.
  • Efficiency through Quantization: Through quantization and model merging techniques, the BioMistral models retain strong performance while remaining light enough to deploy on consumer-grade hardware (a loading sketch follows this list).
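
As an illustration of the deployment point above, below is a minimal sketch of loading a quantized 7B checkpoint on consumer-grade hardware through the Hugging Face transformers API. The 4-bit bitsandbytes configuration is a stand-in assumption on my part; the paper itself evaluates AWQ among other quantization schemes, and the repo id is assumed for illustration.

```python
# Illustrative only: loading a BioMistral checkpoint in 4-bit so it fits on a
# consumer-grade GPU. The repo id and quantization settings are assumptions,
# not taken from the paper (which studies AWQ among other schemes).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "BioMistral/BioMistral-7B"  # assumed Hugging Face Hub repo id

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit NF4 weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

prompt = "Question: What is the first-line treatment for type 2 diabetes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```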

Comprehensive Evaluation

BioMistral underwent a rigorous evaluation on a benchmark comprising 10 medical QA tasks. It demonstrated statistically significant improvements over other open-source medical models and remains competitive with proprietary models. In multilingual settings, performance drops noticeably compared to English, yet the BioMistral models still outperform existing open-source alternatives, underscoring their robustness across linguistic boundaries.

The Mechanics of Model Adaptation

The adaptation method further pre-trains Mistral on a corpus drawn from the PMC Open Access Subset to embed biomedical specificity into BioMistral. This process, aimed at deepening the model's understanding of complex medical contexts, uses the AdamW optimizer for training while retaining Mistral's architectural features such as Grouped-Query Attention.
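
As a rough illustration of the setup described above (not the authors' actual training code), the following sketch continues pre-training a Mistral checkpoint on biomedical text with AdamW; Grouped-Query Attention requires no extra code here because it is part of Mistral's architecture. The base checkpoint, hyperparameters, and data iterator are illustrative assumptions.

```python
# A minimal sketch (not the authors' training code) of further pre-training a
# Mistral checkpoint on biomedical text with AdamW. The base checkpoint,
# learning rate, and data iterator are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-Instruct-v0.1"   # assumed base checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16).to(device)
model.train()

def pmc_batches():
    """Stand-in for an iterator over tokenized PMC Open Access passages,
    already chunked to the model's context length."""
    text = "Placeholder biomedical passage from the PMC Open Access Subset."
    enc = tokenizer(text, return_tensors="pt").to(device)
    yield {"input_ids": enc["input_ids"], "labels": enc["input_ids"].clone()}

# AdamW (decoupled weight decay), as is standard for causal LM pre-training.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

for batch in pmc_batches():
    loss = model(**batch).loss        # next-token prediction loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```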

Model Merging and Quantization Strategies

Model merging experiments using techniques such as SLERP and TIES indicated that combining specialized and general-domain models can improve performance and generalization. Furthermore, experiments with activation-aware weight quantization (AWQ) and other strategies underscore the potential for deploying BioMistral on devices with limited computational resources without significant loss in performance.
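
To make the merging idea concrete, here is a toy sketch of SLERP applied parameter by parameter to two checkpoints with identical architectures. It illustrates the spherical interpolation itself rather than the exact merging pipeline used in the paper; the fallback for nearly parallel weights and the tensor flattening are assumptions of this sketch.

```python
# A minimal sketch of SLERP model merging on matched parameter tensors, assuming
# both checkpoints share the same architecture and state-dict keys. This is an
# illustration of the interpolation, not the paper's exact merging tooling.
import torch

def slerp(p: torch.Tensor, q: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    p_flat, q_flat = p.flatten().float(), q.flatten().float()
    p_unit = p_flat / (p_flat.norm() + eps)
    q_unit = q_flat / (q_flat.norm() + eps)
    cos_theta = torch.clamp(torch.dot(p_unit, q_unit), -1.0, 1.0)
    theta = torch.acos(cos_theta)
    if theta.abs() < 1e-4:                       # nearly parallel: fall back to LERP
        merged = (1 - t) * p_flat + t * q_flat
    else:
        merged = (torch.sin((1 - t) * theta) * p_flat
                  + torch.sin(t * theta) * q_flat) / torch.sin(theta)
    return merged.reshape(p.shape).to(p.dtype)

def merge_state_dicts(general_sd: dict, medical_sd: dict, t: float = 0.5) -> dict:
    """Interpolate every shared parameter between a general and a domain model."""
    return {name: slerp(general_sd[name], medical_sd[name], t) for name in general_sd}
```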

Practical Implications and Future Prospects

BioMistral holds promise for a variety of applications in healthcare and medicine, from enhancing medical literature search capabilities to facilitating patient care through improved understanding of medical queries. Its open-source nature invites further experimentation and adaptation by the global research community. The work paves the way for future developments, particularly in advancing model calibration, reliability, and multilingual capabilities, as well as exploring domain-specific adaptations beyond the sphere of medicine.

Key Contributions

  • Domain-Specific Pretraining: Leveraging PubMed Central to train Mistral model variants tailored for the biomedical domain.
  • Multilingual Benchmark Creation: Extending the evaluation of medical LLMs to additional languages.
  • Advanced Model Quantization: Applying quantization techniques that reduce memory and compute requirements without significant loss in accuracy.

Conclusion

BioMistral represents a significant advancement in the development of domain-specific LLMs for the biomedical field, showing marked improvements over existing models across a range of metrics. By combining the foundational strengths of Mistral with advanced pre-training and model optimization techniques, BioMistral emerges as a powerful tool for researchers and practitioners working at the intersection of AI and healthcare. The open-source release of datasets, benchmarks, and models underlines the authors' commitment to transparency and collaboration in advancing the state of the art in medical NLP.

Authors
  1. Yanis Labrak
  2. Adrien Bazoge
  3. Emmanuel Morin
  4. Pierre-Antoine Gourraud
  5. Mickael Rouvier
  6. Richard Dufour