CU-UD: text-mining drug and chemical-protein interactions with ensembles of BERT-based models (2112.03004v1)

Published 11 Nov 2021 in cs.CL and cs.AI

Abstract: Identifying the relations between chemicals and proteins is an important text mining task. The BioCreative VII Track 1 DrugProt task aims to promote the development and evaluation of systems that can automatically detect relations between chemical compounds/drugs and genes/proteins in PubMed abstracts. In this paper, we describe our submission, an ensemble system that includes multiple BERT-based language models. We combine the outputs of the individual models using majority voting and a multilayer perceptron. Our system obtained a precision of 0.7708 and a recall of 0.7770, for an F1 score of 0.7739, demonstrating the effectiveness of ensembles of BERT-based language models for automatically detecting relations between chemicals and proteins. Our code is available at https://github.com/bionlplab/drugprot_bcvii.
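As a rough illustration of the majority-voting step mentioned in the abstract, the sketch below combines per-model relation labels for each candidate chemical-protein pair. It is not the authors' implementation: the function names, the example labels, and the tie-breaking rule (falling back to a "NONE" label) are all assumptions made for illustration. A small helper also checks that the reported F1 score is the harmonic mean of the reported precision and recall.

```python
from collections import Counter


def majority_vote(predictions, tie_label="NONE"):
    """Combine label predictions from several models by majority vote.

    predictions: list of per-model label lists, one label per candidate pair.
    tie_label:   hypothetical fallback used when the top vote count is shared
                 (an assumption; the abstract does not describe tie handling).
    """
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must predict the same number of examples")

    voted = []
    for labels in zip(*predictions):  # labels for one example across models
        counts = Counter(labels).most_common()
        top_label, top_count = counts[0]
        # A tie exists if the runner-up has the same number of votes.
        if len(counts) > 1 and counts[1][1] == top_count:
            voted.append(tie_label)
        else:
            voted.append(top_label)
    return voted


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative labels from three hypothetical models for two candidate pairs.
    model_outputs = [
        ["INHIBITOR", "NONE"],
        ["INHIBITOR", "ACTIVATOR"],
        ["NONE", "SUBSTRATE"],
    ]
    print(majority_vote(model_outputs))
    print(round(f1_score(0.7708, 0.7770), 4))
```

The abstract also mentions a multilayer perceptron as a second way of combining model outputs; that variant is not sketched here because the abstract gives no details about its inputs or architecture.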

Authors (3)
  1. Mehmet Efruz Karabulut (2 papers)
  2. K. Vijay-Shanker (10 papers)
  3. Yifan Peng (147 papers)