
Portuguese Named Entity Recognition using BERT-CRF (1909.10649v2)

Published 23 Sep 2019 in cs.CL, cs.IR, and cs.LG

Abstract: Recent advances in language representation using neural networks have made it viable to transfer the learned internal states of a trained model to downstream natural language processing tasks, such as named entity recognition (NER) and question answering. It has been shown that the leverage of pre-trained language models improves the overall performance on many tasks and is highly beneficial when labeled data is scarce. In this work, we train Portuguese BERT models and employ a BERT-CRF architecture to the NER task on the Portuguese language, combining the transfer capabilities of BERT with the structured predictions of CRF. We explore feature-based and fine-tuning training strategies for the BERT model. Our fine-tuning approach obtains new state-of-the-art results on the HAREM I dataset, improving the F1-score by 1 point on the selective scenario (5 NE classes) and by 4 points on the total scenario (10 NE classes).

Portuguese Named Entity Recognition using BERT-CRF: An Expert Overview

The paper "Portuguese Named Entity Recognition using BERT-CRF" explores the application of Bidirectional Encoder Representations from Transformers (BERT) combined with Conditional Random Fields (CRF) for the task of Named Entity Recognition (NER) in the Portuguese language. The authors aim to leverage the strengths of BERT's bidirectional contextual embeddings and CRF's structured prediction capabilities to advance the performance of NER tasks in contexts where Portuguese language resources are limited.

Background and Methodology

Named Entity Recognition is a critical task in NLP, involving the identification and classification of named entities in text into predefined categories such as person, organization, and location. Traditional models have addressed this through a variety of approaches, evolving from rule-based systems to machine learning models, and more recently, to deep learning architectures like BiLSTM-CRF and models employing contextual embeddings such as ELMo and Flair.

The approach in this paper involves a BERT-CRF architecture specifically trained on Portuguese data. The BERT model generates contextual embeddings for input sentences, while the CRF layer leverages these embeddings to predict a sequence of entity tags. The model uses two different strategies: feature-based and fine-tuning. The feature-based approach utilizes BERT as a feature extractor without updating its weights during NER training. In contrast, the fine-tuning strategy involves updating all model parameters jointly, including those of BERT, for improved performance.
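The role of the CRF layer can be illustrated with a minimal linear-chain Viterbi decode. The emission scores below are hypothetical numbers standing in for BERT's per-token outputs, and the tag set is a toy IOB2 example, not the paper's actual label set; the sketch only shows how transition scores let the CRF rule out invalid sequences (e.g. an `I-PER` tag with no preceding person tag).

```python
# Minimal linear-chain CRF Viterbi decode: given per-token emission
# scores (hypothetical stand-ins for BERT outputs) and tag-to-tag
# transition scores, find the best-scoring tag sequence.

def viterbi_decode(emissions, transitions, tags):
    """emissions: list of {tag: score} per token.
    transitions: {(prev_tag, cur_tag): score}."""
    # Initialize with the first token's emission scores.
    scores = {t: emissions[0][t] for t in tags}
    backpointers = []
    for emit in emissions[1:]:
        new_scores, bp = {}, {}
        for cur in tags:
            # Best previous tag under transition + accumulated score.
            best_prev = max(tags, key=lambda p: scores[p] + transitions[(p, cur)])
            new_scores[cur] = scores[best_prev] + transitions[(best_prev, cur)] + emit[cur]
            bp[cur] = best_prev
        backpointers.append(bp)
        scores = new_scores
    # Trace back from the best final tag.
    best = max(scores, key=scores.get)
    path = [best]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    return list(reversed(path))

tags = ["O", "B-PER", "I-PER"]
# All transitions neutral except the invalid O -> I-PER, which is
# penalized heavily so the decoder avoids it.
transitions = {(p, c): 0.0 for p in tags for c in tags}
transitions[("O", "I-PER")] = -100.0

# Hypothetical emission scores for a three-token sentence.
emissions = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.3},
    {"O": 0.2, "B-PER": 0.4, "I-PER": 1.8},
    {"O": 2.5, "B-PER": 0.1, "I-PER": 0.2},
]
print(viterbi_decode(emissions, transitions, tags))  # ['B-PER', 'I-PER', 'O']
```

In the feature-based strategy, only the layers above the frozen BERT encoder (and the CRF transition scores) would be trained; fine-tuning updates the encoder weights as well.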

Experimental Setup and Results

The researchers conducted experiments using the First HAREM and MiniHAREM datasets from the HAREM Golden Collections, which are standard Portuguese NER benchmarks. These datasets contain 10 named entity classes and support evaluation under two scenarios: the total scenario, which uses all 10 classes, and the selective scenario, which restricts evaluation to 5 classes (Person, Organization, Location, Time, and Value).
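The difference between the two scenarios can be sketched as a tag-filtering step: the selective scenario keeps only the 5 selective classes and maps the others to `O`. The tag strings below follow HAREM's Portuguese class names but are an assumption for illustration, not the paper's exact preprocessing code.

```python
# Hypothetical illustration of the selective HAREM scenario: keep only
# the 5 selective classes (Person, Organization, Location, Time, Value)
# and map every other entity class to the outside tag "O".

SELECTIVE = {"PESSOA", "ORGANIZACAO", "LOCAL", "TEMPO", "VALOR"}

def to_selective(tags):
    """Map IOB2 tags whose class is outside the selective set to 'O'."""
    out = []
    for tag in tags:
        if tag == "O" or tag.split("-", 1)[1] in SELECTIVE:
            out.append(tag)
        else:
            out.append("O")
    return out

# "OBRA" (work of art) is one of the classes dropped in the selective scenario.
print(to_selective(["B-PESSOA", "O", "B-OBRA", "I-OBRA", "B-LOCAL"]))
# ['B-PESSOA', 'O', 'O', 'O', 'B-LOCAL']
```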

The paper reports significant improvements in NER performance with the BERT-CRF model, raising the F1-score by about 1 point in the selective scenario and 4 points in the total scenario over previous state-of-the-art results. In particular, the CRF layer yields more consistent tag sequences by modeling dependencies between adjacent output tags.
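The reported F1-scores are entity-level, in the style of CoNLL evaluation: a predicted entity counts as correct only when both its span and its class match the gold annotation exactly. A minimal sketch of that metric, with hypothetical tag sequences:

```python
# Sketch of entity-level F1: extract (start, end, class) spans from
# IOB2 tag sequences and score exact matches only.

def extract_spans(tags):
    """Return the set of (start, end, cls) entity spans in an IOB2 sequence."""
    spans, start, cls = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        # Close the open span on "B-", "O", or a class mismatch.
        if tag.startswith("B-") or tag == "O" or (cls and tag != "I-" + cls):
            if start is not None:
                spans.add((start, i, cls))
                start, cls = None, None
        if tag.startswith("B-"):
            start, cls = i, tag[2:]
    return spans

def f1(gold_tags, pred_tags):
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = ["B-PESSOA", "I-PESSOA", "O", "B-LOCAL"]
pred = ["B-PESSOA", "I-PESSOA", "O", "B-ORGANIZACAO"]
print(f1(gold, pred))  # 0.5: one of the two entities matches exactly
```

Note how the wrongly classified final entity costs both precision and recall, which is why span-and-class exact matching is a stricter metric than token-level accuracy.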

The results highlight that the fine-tuning approach considerably outperforms the feature-based method, with a marked increase in both precision and recall. Notably, the Portuguese BERT models also outperformed Multilingual BERT, underscoring the value of language-specific pre-training.

Implications and Future Research

The strong performance demonstrated by the BERT-CRF model on the Portuguese NER task suggests that the combination of transformer-based contextual embeddings and structured prediction models is a promising direction for enhancing NER systems, especially in resource-constrained languages. By improving entity recognition in Portuguese texts, this research has practical applications in areas such as automated information extraction, digital assistants, and data mining.

Moreover, the provision of Portuguese BERT models and code for public use encourages further exploration and benchmarking on various NLP tasks in Portuguese. Future work could explore the impact of newer and more efficient models like RoBERTa and T5, assessing whether they offer additional gains in performance and computational efficiency. Additionally, applying this approach to other under-resourced languages could extend its benefits beyond Portuguese.

In summary, this paper offers a substantial contribution to the field by presenting an effective method for Portuguese NER, setting a new state of the art and a reference point for future NLP work on Portuguese and other under-resourced languages.

Authors (3)
  1. Fábio Souza (1 paper)
  2. Rodrigo Nogueira (70 papers)
  3. Roberto Lotufo (41 papers)
Citations (232)