
SECNLP: A Survey of Embeddings in Clinical Natural Language Processing (1903.01039v4)

Published 4 Mar 2019 in cs.CL

Abstract: Traditional representations like bag of words are high-dimensional and sparse, and they ignore word order as well as syntactic and semantic information. Distributed vector representations, or embeddings, map variable-length text to dense fixed-length vectors and capture prior knowledge that can be transferred to downstream tasks. Even though embeddings have become the de facto standard for representations in deep-learning-based NLP tasks in both general and clinical domains, there is no survey paper that presents a detailed review of embeddings in Clinical Natural Language Processing. In this survey paper, we discuss various medical corpora and their characteristics and medical codes, and we present a brief overview and comparison of popular embedding models. We classify clinical embeddings into nine types and discuss each embedding type in detail. We discuss various evaluation methods, followed by possible solutions to various challenges in clinical embeddings. Finally, we conclude with some future directions that will advance research in clinical embeddings.

Authors (2)
  1. Kalyan KS (1 paper)
  2. S Sangeetha (2 papers)
Citations (77)