SpanNer: Advancements in Named Entity Recognition through Span Prediction
The research paper "SpanNer: Named Entity Re-/Recognition as Span Prediction" examines the comparative efficacy of named entity recognition (NER) systems when the task is reframed from traditional sequence labeling to span prediction. The paper provides an empirical analysis of the strengths and limitations of span prediction for entity recognition and introduces a method for using span prediction models as combiners over the outputs of multiple NER systems.
Overview and Methodology
The recognition of named entities has predominantly been approached through sequence labeling frameworks, which assign labels to individual tokens in a text sequence. However, span prediction reformulates the task by treating entity recognition as a span classification and prediction problem. This paradigm shift allows models to predict multi-token spans as entities directly, facilitating improved handling of complex entity boundaries and nested entities.
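To make the reformulation concrete, the minimal Python sketch below enumerates all candidate spans up to a maximum length and classifies each one. This is an illustrative sketch rather than the authors' implementation: the `score_span` argument and the `toy_scorer` function are hypothetical stand-ins for the trained encoder and span classifier described in the paper.

```python
# Minimal sketch of NER as span prediction (illustrative, not the paper's exact model).
# Assumption: `score_span` stands in for an encoder + classifier that maps a
# candidate span to a distribution over entity labels.

from typing import Callable, Dict, List, Tuple

LABELS = ["O", "PER", "ORG", "LOC", "MISC"]

def enumerate_spans(n_tokens: int, max_len: int = 4) -> List[Tuple[int, int]]:
    """All (start, end) spans up to max_len tokens; end index is exclusive."""
    return [(i, j) for i in range(n_tokens)
                   for j in range(i + 1, min(i + max_len, n_tokens) + 1)]

def predict_entities(tokens: List[str],
                     score_span: Callable[[List[str], int, int], Dict[str, float]],
                     max_len: int = 4) -> List[Tuple[int, int, str, float]]:
    """Classify every candidate span; keep non-O spans and resolve overlaps greedily."""
    candidates = []
    for start, end in enumerate_spans(len(tokens), max_len):
        probs = score_span(tokens, start, end)
        label = max(probs, key=probs.get)
        if label != "O":
            candidates.append((start, end, label, probs[label]))
    # Greedy overlap resolution: prefer higher-confidence spans (a common heuristic).
    candidates.sort(key=lambda c: -c[3])
    chosen, taken = [], set()
    for start, end, label, p in candidates:
        if all(i not in taken for i in range(start, end)):
            chosen.append((start, end, label, p))
            taken.update(range(start, end))
    return sorted(chosen)

# Toy scorer standing in for a trained span classifier.
def toy_scorer(tokens, start, end):
    if tokens[start:end] == ["New", "York"]:
        return {"O": 0.05, "PER": 0.02, "ORG": 0.03, "LOC": 0.88, "MISC": 0.02}
    return {"O": 0.90, "PER": 0.025, "ORG": 0.025, "LOC": 0.025, "MISC": 0.025}

print(predict_entities("She moved to New York".split(), toy_scorer))
# [(3, 5, 'LOC', 0.88)]
```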
The paper investigates the span prediction paradigm's architectural biases by conducting a comprehensive evaluation across multiple datasets and languages. The experimental setup involved implementing 154 systems on 11 datasets, covering languages such as English, German, Dutch, and Spanish.
Key Findings
- Performance Metrics: Evaluated against sequence labeling models, span prediction models show distinct advantages, particularly on sentences with many out-of-vocabulary words and on entities of medium length. Sequence labeling approaches, conversely, tend to perform better on longer entities and in scenarios with more consistent labeling.
- System Combination Framework: A significant contribution of this work is demonstrating that span prediction models can function effectively as system combiners: they re-score and integrate the outputs of multiple NER systems, yielding considerable gains over traditional ensemble methods. This dual utility emphasizes span prediction's architectural strength in adapting to diverse NER frameworks without extensive feature engineering; a rough sketch of the combination idea follows this list.
- Benchmark Results: On datasets such as CoNLL-2003 and OntoNotes 5.0, the proposed span prediction combiner achieved notably higher F1 scores than existing models, particularly under noisy, variable conditions such as the social-media WNUT datasets.
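As a rough illustration of the combiner idea, and not the authors' exact scoring rule, the sketch below pools candidate entities proposed by several base systems, re-scores each candidate with a hypothetical span-level confidence function (`span_confidence` / `toy_confidence` are assumed names), and keeps the highest-scoring non-overlapping spans.

```python
# Hedged sketch of using a span model as a combiner over multiple NER systems.
# `span_confidence` is a hypothetical stand-in for a trained span classifier's
# probability of a (span, label) pair.

from typing import Callable, List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)

def combine_systems(tokens: List[str],
                    system_outputs: List[List[Span]],
                    span_confidence: Callable[[List[str], int, int, str], float]
                    ) -> List[Span]:
    """Pool candidate entities from all base systems, re-score them with the
    span model, and keep the highest-confidence non-overlapping spans."""
    candidates: Set[Span] = {span for output in system_outputs for span in output}
    scored = sorted(candidates,
                    key=lambda s: span_confidence(tokens, s[0], s[1], s[2]),
                    reverse=True)
    chosen, taken = [], set()
    for start, end, label in scored:
        if all(i not in taken for i in range(start, end)):
            chosen.append((start, end, label))
            taken.update(range(start, end))
    return sorted(chosen)

# Toy example: two base systems disagree on the label of the same span.
def toy_confidence(tokens, start, end, label):
    return 0.9 if tokens[start:end] == ["Berlin"] and label == "LOC" else 0.4

sys_a = [(3, 4, "LOC")]   # first system tagged "Berlin" as LOC
sys_b = [(3, 4, "ORG")]   # second system tagged "Berlin" as ORG
print(combine_systems("He lives in Berlin".split(), [sys_a, sys_b], toy_confidence))
# [(3, 4, 'LOC')]
```

In this toy setting the combiner resolves the label conflict in favor of the candidate the span model trusts most, which is the essence of letting a span predictor arbitrate between base systems.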
Implications for Future Research
The implications of span prediction extend beyond improving entity recognition accuracy. This research underscores the potential flexibility and adaptability of span-based architectures in NLP tasks that require holistic contextual understanding, such as coreference resolution or parsing tasks. The findings advocate for further exploration into how span-based approaches can unify disparate NLP tasks under a coherent model training and inference framework.
The research also points toward future systems in which modular, adaptable combiners like SpanNer simplify the integration of NER models across languages and domains, supporting scalable multilingual applications.
The authors have made their datasets and code publicly available, inviting the research community to validate, replicate, and extend their findings; this openness supports the collaborative refinement and continued improvement of named entity recognition systems.
Conclusion
In summary, the paper "SpanNer: Named Entity Re-/Recognition as Span Prediction" provides a thorough analysis of span prediction's role in named entity recognition and its applicability as a unifying system combiner. It delineates the strengths of span models over conventional sequence labeling approaches and opens avenues for their application in more intricate natural language processing tasks. The research conveys clear advancements in enhancing the robustness and flexibility of NER systems, with profound implications for the field of artificial intelligence and natural language understanding.