Simplify the Usage of Lexicon in Chinese NER (1908.05969v2)

Published 16 Aug 2019 in cs.CL

Abstract: Recently, many works have tried to augment the performance of Chinese named entity recognition (NER) using word lexicons. As a representative, Lattice-LSTM (Zhang and Yang, 2018) has achieved new benchmark results on several public Chinese NER datasets. However, Lattice-LSTM has a complex model architecture. This limits its application in many industrial areas where real-time NER responses are needed. In this work, we propose a simple but effective method for incorporating the word lexicon into the character representations. This method avoids designing a complicated sequence modeling architecture, and for any neural NER model, it requires only subtle adjustment of the character representation layer to introduce the lexicon information. Experimental studies on four benchmark Chinese NER datasets show that our method achieves an inference speed up to 6.15 times faster than those of state-of-the-art methods, along with a better performance. The experimental results also show that the proposed method can be easily incorporated with pre-trained models like BERT.

Authors (4)
  1. Ruotian Ma (19 papers)
  2. Minlong Peng (18 papers)
  3. Qi Zhang (785 papers)
  4. Xuanjing Huang (287 papers)
Citations (240)

Summary

Simplifying Lexicon Usage in Chinese Named Entity Recognition

The paper "Simplify the Usage of Lexicon in Chinese NER" offers a notable contribution to the advancement of Chinese Named Entity Recognition (NER) methodologies by addressing the inherent complexities associated with incorporating lexicon information. The authors recognize the pivotal role that word lexicons play in improving the accuracy of NER systems by providing additional word-level context, yet also acknowledge the operational inefficiencies that manifest in existing approaches such as Lattice-LSTM, particularly with regards to complexity and speed.

Chinese NER faces a unique challenge: unlike languages such as English, Chinese text has no explicit word boundaries. This paper proposes a simplified method that integrates lexicon information directly into the character representations of neural NER models, circumventing the need for convoluted network structures. The authors introduce an approach called SoftLexicon, which preserves inference efficiency while still exploiting word-level information. The core of the strategy is to encode lexicon-derived features directly in the character representation layer, yielding a versatile, transferable framework compatible with existing neural architectures, including pre-trained models like BERT.
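To make the word-level information concrete, the following is a minimal, illustrative sketch (not the authors' code) of the first step: for every character position, enumerate the lexicon words that cover it. The brute-force substring matcher, the `max_word_len` cap, and the toy lexicon are simplifying assumptions; a real implementation would typically use a trie.

```python
def matched_words(sentence, lexicon, max_word_len=5):
    """Return, for each character index, the lexicon words covering it."""
    matches = [[] for _ in sentence]
    n = len(sentence)
    for i in range(n):                                        # word start
        for j in range(i + 1, min(n, i + max_word_len) + 1):  # word end (exclusive)
            word = sentence[i:j]
            if word in lexicon:
                for k in range(i, j):          # every character the word covers
                    matches[k].append((word, i, j))
    return matches

# Toy example; the lexicon entries are placeholders.
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
for k, ws in enumerate(matched_words("南京市长江大桥", lexicon)):
    print(k, [w for w, _, _ in ws])
```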

The empirical evaluations demonstrate that SoftLexicon not only matches but in some cases surpasses prior state-of-the-art models on four benchmark Chinese NER datasets. Inference runs up to 6.15 times faster than Lattice-LSTM without sacrificing accuracy, an efficiency gain that is critical for real-time applications and that enhances the method's industrial applicability.

In terms of methodological contributions, the authors categorize each character's matched lexicon words into four groups, Begin, Middle, End, and Single (BMES), according to the character's position within the matched word. Each group is condensed into a frequency-weighted vector, and the four vectors are concatenated to the character embedding, enriching the input without the added complexity of lattice- or graph-based structures.
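The sketch below illustrates this BMES grouping and frequency-weighted pooling in numpy, building on the `matched_words()` helper from the earlier snippet. The function name, the embedding table, the dimensions, and the frequency counts are invented placeholders, not values or code from the paper.

```python
import numpy as np

def soft_lexicon_vector(char_idx, matches, word_emb, word_freq, dim):
    """Condense one character's matched words into a (4*dim,) feature vector."""
    groups = {"B": [], "M": [], "E": [], "S": []}
    for word, start, end in matches[char_idx]:
        if end - start == 1:
            groups["S"].append(word)          # single-character word
        elif char_idx == start:
            groups["B"].append(word)          # word begins at this character
        elif char_idx == end - 1:
            groups["E"].append(word)          # word ends at this character
        else:
            groups["M"].append(word)          # character is word-internal
    # Weight each word by its corpus frequency, normalized over all four sets.
    total = sum(word_freq[w] for ws in groups.values() for w in ws) or 1.0
    pooled = []
    for tag in ("B", "M", "E", "S"):
        v = np.zeros(dim)
        for w in groups[tag]:
            v += word_freq[w] * np.asarray(word_emb[w])
        pooled.append(v / total)
    return np.concatenate(pooled)

# The enriched input to any sequence encoder (BiLSTM, CNN, BERT, ...) is then
# simply np.concatenate([char_embedding, soft_lexicon_vector(...)]).
```

Because the lexicon features live entirely in the input representation, swapping the downstream encoder requires no architectural change, which is what makes the method transferable.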

This paper holds implications for both theory and practice in NER. Theoretically, it shows that lexicon resources can be exploited in sequence labeling tasks without complex sequence-modeling architectures, recalibrating the trade-off between model complexity and efficacy. Practically, the simplification should ease adoption in applications such as information retrieval and knowledge base construction.

Future research may extend this method to other languages or domains with similar constraints, and may further optimize its integration with pre-trained models to combine contextual embeddings with lexicon information. This work lays a foundation for such exploration and points toward more practical approaches to the challenges that unsegmented languages pose for NER.