A Survey of Spanish Clinical Language Models (2308.02199v1)
Published 4 Aug 2023 in cs.CL, cs.AI, and cs.LG
Abstract: This survey focuses on encoder LLMs for solving tasks in the clinical domain in the Spanish language. We review the contributions of 17 corpora focused mainly on clinical tasks, then list the most relevant Spanish LLMs and Spanish Clinical LLMs. We perform a thorough comparison of these models by benchmarking them over a curated subset of the available corpora, in order to find the best-performing ones; in total, more than 3000 models were fine-tuned for this study. All the tested corpora and the best models are made publicly available in an accessible way, so that the results can be reproduced by independent teams or challenged in the future when new Spanish Clinical LLMs are created.