Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

BERT-CNN: a Hierarchical Patent Classifier Based on a Pre-Trained Language Model (1911.06241v1)

Published 3 Nov 2019 in cs.CL and cs.LG

Abstract: The automatic classification is a process of automatically assigning text documents to predefined categories. An accurate automatic patent classifier is crucial to patent inventors and patent examiners in terms of intellectual property protection, patent management, and patent information retrieval. We present BERT-CNN, a hierarchical patent classifier based on pre-trained LLM by training the national patent application documents collected from the State Information Center, China. The experimental results show that BERT-CNN achieves 84.3% accuracy, which is far better than the two compared baseline methods, Convolutional Neural Networks and Recurrent Neural Networks. We didn't apply our model to the third and fourth hierarchical level of the International Patent Classification - "subclass" and "group".The visualization of the Attention Mechanism shows that BERT-CNN obtains new state-of-the-art results in representing vocabularies and semantics. This article demonstrates the practicality and effectiveness of BERT-CNN in the field of automatic patent classification.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Xiaolei Lu (13 papers)
  2. Bin Ni (5 papers)
Citations (8)

Summary

We haven't generated a summary for this paper yet.