DAEDRA: A language model for predicting outcomes in passive pharmacovigilance reporting (2402.10951v1)

Published 10 Feb 2024 in cs.CL and cs.LG

Abstract: Over recent years, the emergence of LLMs has given rise to a proliferation of domain-specific models intended to reflect the particularities of linguistic context and content as a correlate of the originating domain. This paper details the conception, design, training and evaluation of DAEDRA, an LLM designed to detect regulatory-relevant outcomes (mortality, ER attendance and hospitalisation) in adverse event reports elicited through passive reporting (PR). While PR is a highly cost-efficient way of eliciting information from a wide and diverse audience -- typically including not only physicians and healthcare providers but also patients, family members and other lay stakeholders -- this very diversity makes PR corpora difficult to analyse. Generic LLMs may not capture the complex clinical dimensions, while specific clinical or biomedical models may not perform well on lay reports. To evaluate the utility of a subdomain-specific LLM, an adaptive training approach was adopted, wherein base LLM candidates were evaluated on a subset of the corpus and the best performer was trained on the entire corpus. This yielded a small but significant improvement in $F_1$ (+1%), precision (+2.5%) and recall (+3.8%), at a relatively low training cost and a single-day training time. Subdomain-specific LLMs remain viable options for obtaining better results when analysing highly specialised corpora.
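
The adaptive approach the abstract describes (screen several base models on a pilot subset, then fine-tune only the winner on the full corpus) can be summarised in a short sketch. The following is a minimal illustration assuming a Hugging Face transformers setup and a binary "serious outcome" label; the candidate model names, dataset fields, split sizes and hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch of an adaptive training loop: evaluate candidate base models on a
# pilot subset of the corpus, then fine-tune only the best performer on the
# full corpus. Model names, dataset fields and hyperparameters are assumptions.
import numpy as np
from datasets import Dataset
from sklearn.metrics import precision_recall_fscore_support
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

CANDIDATES = [
    "bert-base-uncased",               # generic baseline (illustrative)
    "emilyalsentzer/Bio_ClinicalBERT", # clinical-domain candidate (illustrative)
]

def compute_metrics(eval_pred):
    # Precision, recall and F1 for a binary "serious outcome" label.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    p, r, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary", zero_division=0)
    return {"precision": p, "recall": r, "f1": f1}

def fine_tune(model_name, train_ds, eval_ds, epochs=1):
    # Fine-tune one candidate and return its F1 on the held-out set.
    tok = AutoTokenizer.from_pretrained(model_name)
    def tokenize(batch):
        return tok(batch["text"], truncation=True,
                   padding="max_length", max_length=256)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=2)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=f"out/{model_name.replace('/', '_')}",
            num_train_epochs=epochs,
            per_device_train_batch_size=16),
        train_dataset=train_ds.map(tokenize, batched=True),
        eval_dataset=eval_ds.map(tokenize, batched=True),
        compute_metrics=compute_metrics,
    )
    trainer.train()
    return trainer.evaluate()["eval_f1"]

# `corpus` is assumed to be a datasets.Dataset with a "text" column (report
# narrative) and a "label" column (1 = death, ER attendance or hospitalisation).
def adaptive_train(corpus: Dataset, subset_frac=0.1):
    split = corpus.train_test_split(test_size=0.2, seed=42)
    train, test = split["train"], split["test"]
    pilot = train.shuffle(seed=42).select(range(int(subset_frac * len(train))))

    # Stage 1: cheap screening of every candidate on the pilot subset only.
    scores = {m: fine_tune(m, pilot, test) for m in CANDIDATES}
    best = max(scores, key=scores.get)

    # Stage 2: the expensive full-corpus fine-tuning is paid only once,
    # for the candidate that won the pilot evaluation.
    final_f1 = fine_tune(best, train, test, epochs=3)
    return best, final_f1
```

The design point of the two-stage scheme is cost control: the full corpus is only ever used to train the single candidate that wins the inexpensive pilot round, which is consistent with the abstract's claim of a relatively low training cost and a single-day training time.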

Authors (1)
  1. Chris von Csefalvay (6 papers)
Citations (1)
