
SINA-BERT: A pre-trained Language Model for Analysis of Medical Texts in Persian

Published 15 Apr 2021 in cs.CL (arXiv:2104.07613v1)

Abstract: We have released SINA-BERT, a language model based on BERT (Devlin et al., 2018), to address the lack of a high-quality Persian language model in the medical domain. SINA-BERT is pre-trained on a large-scale corpus of medical content, including formal and informal texts collected from a variety of online resources, in order to improve performance on healthcare-related tasks. We employ SINA-BERT on the following representative tasks: categorization of medical questions, medical sentiment analysis, and medical question retrieval. For each task, we have developed Persian annotated datasets for training and evaluation, and learned a task-specific representation of the data, especially for complex and long medical questions. With the same architecture used across tasks, SINA-BERT outperforms BERT-based models previously made available for the Persian language.
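As a rough illustration of the fine-tuning setup the abstract describes, the sketch below applies a BERT-style Persian model to one downstream task, medical question categorization, via the Hugging Face transformers API. The checkpoint name, label count, and sample input are assumptions for illustration; the paper does not specify a published model identifier.

```python
# Hypothetical sketch: classifying a Persian medical question with a
# fine-tuned BERT-style model, as in the paper's categorization task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "sina-bert"  # placeholder; substitute the actual released checkpoint
NUM_CLASSES = 5           # assumed number of medical question categories

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_CLASSES
)

# Long medical questions are truncated/padded to the encoder's maximum length.
question = "..."  # a Persian medical question (placeholder)
inputs = tokenizer(
    question,
    truncation=True,
    padding="max_length",
    max_length=512,
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits
predicted_category = logits.argmax(dim=-1).item()
```

The same encoder could be reused with different task heads for sentiment analysis and question retrieval, consistent with the single shared architecture the abstract reports.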
