
BERT-based knowledge extraction method of unstructured domain text (2103.00728v1)

Published 1 Mar 2021 in cs.CL and cs.LG

Abstract: As knowledge graphs mature and see wider business adoption, demand is growing for extracting the entities and relations of knowledge graphs from unstructured domain documents, which makes automatic knowledge extraction from domain text valuable. This paper proposes a BERT-based knowledge extraction method that automatically extracts knowledge points from unstructured, domain-specific text (such as insurance clauses in the insurance industry), reducing the manual effort of knowledge graph construction. Unlike commonly used methods based on rules, templates, or entity extraction models, this paper converts domain knowledge points into question-answer pairs and uses the text surrounding each answer in the document as the context. The method adopts a BERT-based model set up like BERT's SQuAD reading comprehension task. The model is fine-tuned and then used to extract knowledge points directly from further insurance clauses. According to the test results, the model performs well.
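The data-preparation step described in the abstract (turning a knowledge point into a question-answer pair whose context is the text around the answer) can be illustrated with a minimal sketch. The function name `build_qa_example`, the `window` parameter, the character-based context window, and the sample clause are all assumptions made for illustration; the abstract does not specify how the context is cut or how examples are stored. The output layout follows the common SQuAD-style convention of a question, a context, and a character offset for the answer.

```python
def build_qa_example(document: str, question: str, answer: str, window: int = 100) -> dict:
    """Build one SQuAD-style example from a document and a knowledge point.

    Hypothetical helper: the context is the answer plus up to `window`
    characters on each side, which is one simple reading of "the text
    around the answer". The paper may cut the context differently.
    """
    start = document.find(answer)  # first occurrence of the answer span
    if start == -1:
        raise ValueError("answer text does not occur in the document")
    end = start + len(answer)

    # Clip the window to the document boundaries.
    ctx_start = max(0, start - window)
    ctx_end = min(len(document), end + window)
    context = document[ctx_start:ctx_end]

    return {
        "question": question,
        "context": context,
        # answer_start is measured relative to the clipped context.
        "answers": {"text": [answer], "answer_start": [start - ctx_start]},
    }


# Illustrative insurance clause (invented for this sketch, not from the paper).
clause = "This policy has a waiting period of 90 days from the effective date of the contract."
example = build_qa_example(clause, "How long is the waiting period?", "90 days", window=10)
```

Examples in this shape can be fed to a standard extractive question-answering fine-tuning pipeline, where the model learns to predict the start and end of the answer span inside the context.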

