Enriching Medical Terminology Knowledge Bases via Pre-trained Language Model and Graph Convolutional Network (1909.00615v1)

Published 2 Sep 2019 in cs.CL

Abstract: Enriching existing medical terminology knowledge bases (KBs) is an important and never-ending task for clinical research, because new terminology aliases may be continually added and standard terminologies may be renamed. In this paper, we propose a novel automatic terminology enriching approach to supplement KBs with a set of new terminologies. Specifically, terminology and entity characters are first fed into a pre-trained language model to obtain semantic embeddings. The pre-trained model is used again to initialize the terminology and entity representations, which are then further embedded through a graph convolutional network to obtain structure embeddings. Afterwards, both semantic and structure embeddings are combined to measure the relevancy between the terminology and the entity. Finally, the optimal alignment is achieved based on the order of relevancy between the terminology and all the entities in the KB. Experimental results on a clinical indicator terminology KB, collected from 38 top-class hospitals of the Shanghai Hospital Development Center, show that our proposed approach outperforms baseline methods and can effectively enrich the KB.
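
To make the pipeline concrete, below is a minimal PyTorch sketch of the matching step described above: semantic embeddings from a pre-trained language model are combined with structure embeddings from a graph convolutional network, and KB entities are ranked by relevancy to a new terminology. The class names (`GCNLayer`, `TerminologyMatcher`), the single graph-convolution layer, and the summed cosine-similarity score are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of semantic + structure matching for terminology enrichment.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbor features through a
    normalized adjacency matrix, then apply a linear projection."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, node_feats: torch.Tensor, norm_adj: torch.Tensor) -> torch.Tensor:
        return F.relu(self.linear(norm_adj @ node_feats))


class TerminologyMatcher(nn.Module):
    """Rank KB entities for a new terminology by combining semantic and
    structure relevancy (assumed here to be a sum of cosine similarities)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.gcn = GCNLayer(hidden_dim, hidden_dim)

    def forward(self, term_sem, entity_sem, norm_adj):
        # term_sem:   [1, H] semantic embedding of the new terminology
        # entity_sem: [N, H] semantic embeddings of the N KB entities
        #             (both produced offline by a pre-trained language model)
        # norm_adj:   [N, N] normalized adjacency matrix of the KB graph
        entity_struct = self.gcn(entity_sem, norm_adj)            # structure embeddings
        sem_score = F.cosine_similarity(term_sem, entity_sem, dim=-1)
        struct_score = F.cosine_similarity(term_sem, entity_struct, dim=-1)
        scores = sem_score + struct_score                         # combined relevancy
        return scores.argsort(descending=True)                    # entity indices, best first
```

Sorting entities by the combined score yields the relevancy ordering from which the optimal alignment is selected, as described in the abstract.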

Authors (6)
  1. Jiaying Zhang (2 papers)
  2. Zhixing Zhang (14 papers)
  3. Huanhuan Zhang (9 papers)
  4. Zhiyuan Ma (70 papers)
  5. Yangming Zhou (27 papers)
  6. Ping He (58 papers)
