Extraction of Pharmacokinetic Evidence of Drug-drug Interactions from the Literature (1412.0744v2)

Published 2 Dec 2014 in stat.ML, cs.IR, and q-bio.QM

Abstract: Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmaco-epidemiology investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1~=0.93, MCC~=0.74, iAUC~=0.99) and sentences (F1~=0.76, MCC~=0.65, iAUC~=0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification. ...

Authors (5)

Artemy Kolchinsky
Anália Lourenço
Heng-Yi Wu
Lang Li
Luis M. Rocha

Citations (166)

Summary

The paper presents a text mining method using linear classifiers with unigram and bigram features to automatically extract pharmacokinetic DDI evidence from biomedical literature.
The methodology achieved F1 scores up to 0.93 for abstract classification and 0.76 for sentence extraction, showing the effectiveness of linear classifiers and specific textual features like bigrams.
This research significantly advances the automated extraction of DDI evidence, potentially reducing adverse drug reactions and informing clinical and pharmaceutical decision-making.

The paper by Kolchinsky et al. presents a comprehensive study of the automatic extraction of pharmacokinetic (PK) evidence for drug-drug interactions (DDIs) from biomedical literature using literature mining techniques. Because DDIs are significant contributors to adverse drug reactions, automating the detection of DDI evidence through text mining has important implications for public health and biomedical research. The authors approach this challenge by implementing a text mining pipeline that performs two tasks: identifying PubMed abstracts that contain pharmacokinetic evidence of DDIs, and extracting the sentences within those abstracts that contain such evidence.

Core Methodology and Findings

The paper outlines a methodology built on manually curated corpora of PubMed abstracts, a range of linear classifiers, and various feature transforms. The authors created distinct corpora annotated for pharmacokinetic DDI evidence and then framed two text classification tasks: identifying relevant abstracts and extracting sentences containing such evidence. By comparing the performance of several classifiers (including Logistic Regression, SVM, LDA, Naive Bayes, and others), the authors demonstrated that linear classifiers can achieve high performance on both tasks. Notably, classifiers incorporating both unigram and bigram features performed better than those using unigrams alone, with the LDA classifier achieving the highest classification performance.
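To make the idea of unigram and bigram features feeding a linear classifier concrete, here is a minimal standard-library sketch. It is not the authors' pipeline: the tokenizer, the example sentences, the function names, and the hand-set weights are all assumptions chosen for illustration. In the paper, the weights are learned from annotated corpora.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_features(text):
    """Return counts of unigrams and adjacent-word bigrams for one text."""
    tokens = tokenize(text)
    features = Counter(tokens)
    # Bigrams are stored as space-joined strings, e.g. "increased the".
    features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


def linear_score(features, weights, bias=0.0):
    """Weighted sum of feature counts; a positive score means 'relevant'."""
    return bias + sum(weights.get(name, 0.0) * count
                      for name, count in features.items())


# Hypothetical weights for illustration only; a real classifier learns these.
WEIGHTS = {"clearance": 1.0, "auc": 0.8, "increased the": 0.5, "patients": -0.2}

# Made-up example sentence, not taken from the paper's corpora.
sentence = "Ketoconazole increased the AUC of midazolam."
feats = ngram_features(sentence)
score = linear_score(feats, WEIGHTS, bias=-0.5)
print("relevant" if score > 0 else "irrelevant")
```

The bigram "increased the" contributes to the score independently of its component words, which is the kind of local word-order information that unigram features alone cannot capture.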

Key numerical results include F1 scores of approximately 0.93 for abstract classification and 0.76 for sentence extraction, with MCC of approximately 0.74 and 0.65 and iAUC of approximately 0.99 and 0.83, respectively. The paper discusses the contribution of textual features, MeSH terms from PubMed metadata, and named entity recognition (NER) tools to classification performance. Bigram features were important for achieving the best performance, features derived from MeSH terms significantly improved abstract classification, and selected NER features, notably BICEPP, also improved classifier effectiveness. However, the paper acknowledges a potential ceiling on performance with linear classifiers, suggesting room for further improvement, possibly with non-linear approaches or enhanced feature annotation.
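For readers less familiar with the reported metrics, the sketch below shows the standard definitions of the F1 score and the Matthews correlation coefficient (MCC) computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions, not figures from the paper, and iAUC is not covered here.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts, chosen only to exercise the formulas.
example_f1 = f1_score(tp=8, fp=2, fn=2)
example_mcc = mcc(tp=8, tn=8, fp=2, fn=2)
print(round(example_f1, 3), round(example_mcc, 3))
```

Unlike F1, MCC also takes true negatives into account, which is one reason the two measures can diverge on imbalanced data.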

Implications and Future Directions

This research contributes to the field by improving methods for efficient, automated extraction of DDI evidence, focusing specifically on PK evidence, which is pivotal for understanding causal mechanisms and supporting further pharmaco-epidemiological studies. The implications are substantial: better DDI identification could reduce the incidence of adverse drug reactions (ADRs) and support informed clinical and pharmaceutical decision-making.

For future work, the authors intend to extend the current methodology to incorporate other types of DDI evidence, such as clinical data, which, combined with PK evidence, could offer a more comprehensive understanding of DDIs. Additionally, expanding the scope of their literature mining to larger corpora could address existing knowledge gaps in the DDI domain, aiding drug development and the optimization of therapeutic practices.

Conclusion

Kolchinsky et al. present a well-substantiated study of the capabilities of current literature mining techniques for identifying PK evidence of DDIs. By demonstrating high classification performance, the work lays the foundation for integrating automated text mining into DDI discovery pipelines and contributes to the broader goal of mitigating adverse drug interactions. The paper sets the stage for future advances in automatic evidence extraction, which could enrich pharmacological research and improve patient safety through better-informed drug administration protocols.