
Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge (1911.09487v2)

Published 21 Nov 2019 in cs.CL

Abstract:

Motivation: The biomedical literature contains a wealth of chemical-protein interactions (CPIs). Automatically extracting CPIs described in biomedical literature is essential for drug discovery, precision medicine, as well as basic biomedical research. Most existing methods focus only on the sentence sequence to identify these CPIs. However, the local structure of sentences and external biomedical knowledge also contain valuable information. Effective use of such information may improve the performance of CPI extraction.

Results: In this paper, we propose a novel neural network-based approach to improve CPI extraction. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence, and knowledge sequence. Then, the Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, the attention mechanism is applied to fuse the title information and biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate our proposed model on the CHEMPROT corpus. Our proposed model is superior in performance as compared with other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other. Integrating them can effectively improve the CPI extraction performance. Furthermore, the Gaussian probability distribution can effectively improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks.

Availability: Data and code are available at https://github.com/CongSun-dlut/CPI_extraction.

Contact: [email protected], [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
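The pipeline outlined in the abstract (position-aware Gaussian weighting, attention-based fusion, concatenation, softmax) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional vectors standing in for BERT outputs, the fixed `sigma`, the use of the chemical-centered view as the attention query, and the hand-picked classifier weights are all assumptions made purely for illustration. The released code at the GitHub link above is the authoritative reference.

```python
import math

def gaussian_weights(length, center, sigma=2.0):
    """Normalized Gaussian weights over token positions, peaked at `center` (assumed form)."""
    raw = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(length)]
    total = sum(raw)
    return [w / total for w in raw]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(vectors, weights):
    """Weighted sum of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def attention_pool(query, vectors):
    """Dot-product attention: score each vector against `query`, then pool."""
    scores = [sum(q * x for q, x in zip(query, v)) for v in vectors]
    return weighted_sum(vectors, softmax(scores))

def classify(features, class_weights):
    """Linear scoring followed by softmax over relation classes."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in class_weights]
    return softmax(scores)

# Toy stand-ins for contextual token vectors (hypothetical values, not BERT outputs).
tokens = [[0.1, 0.3], [0.9, 0.2], [0.4, 0.4], [0.2, 0.8], [0.7, 0.6]]
knowledge = [[0.5, 0.1], [0.3, 0.9]]
chem_pos, prot_pos = 1, 4  # assumed positions of the chemical and protein mentions

# Local-structure views: tokens near each entity receive larger weights.
chem_view = weighted_sum(tokens, gaussian_weights(len(tokens), chem_pos))
prot_view = weighted_sum(tokens, gaussian_weights(len(tokens), prot_pos))

# Fuse external knowledge by attending with the chemical-centered view as the query.
know_view = attention_pool(chem_view, knowledge)

# Concatenate the views and score three hypothetical relation classes.
features = chem_view + prot_view + know_view
class_weights = [[0.2] * 6, [-0.1] * 6, [0.05] * 6]
probs = classify(features, class_weights)
print(probs)
```

In this sketch a smaller `sigma` concentrates weight on tokens adjacent to the entity mention, while a larger one approaches uniform pooling over the whole instance.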

Authors (7)
  1. Cong Sun (25 papers)
  2. Zhihao Yang (10 papers)
  3. Leilei Su (2 papers)
  4. Lei Wang (975 papers)
  5. Yin Zhang (98 papers)
  6. Hongfei Lin (34 papers)
  7. Jian Wang (966 papers)
Citations (24)