
Chinese NER Using Lattice LSTM (1805.02023v4)

Published 5 May 2018 in cs.CL

Abstract: We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.

Authors (2)
  1. Yue Zhang (620 papers)
  2. Jie Yang (517 papers)
Citations (629)

Summary

  • The paper introduces the lattice LSTM model that integrates character and lexicon word inputs to bypass segmentation errors in Chinese NER.
  • Experimental results across datasets, including an F1 score of 93.18% on MSRA, demonstrate the model's superiority over traditional methods.
  • The study bridges character-level precision and word-level context, laying a robust foundation for future advancements in sequence labeling tasks.

An Analysis of "Chinese NER Using Lattice LSTM"

The paper presents an innovative approach to Chinese Named Entity Recognition (NER) through the introduction of the lattice LSTM model. This model uniquely integrates a sequence of input characters with potential lexicon words, thus leveraging both character and word sequences for improved entity recognition.

Model Design and Implementation

The lattice LSTM model is designed to circumvent the segmentation errors inherent in word-based methods. Traditional Chinese NER pipelines first perform word segmentation, and segmentation mistakes, which are especially common on cross-domain data, propagate directly into the NER step. Character-based methods avoid this dependency but do not exploit explicit word and word-sequence information, leaving useful context untapped.

In contrast, the lattice LSTM constructs word-character paths by matching each sentence against a pre-built lexicon, forming a lattice over the input. Because no segmentation step is required, segmentation errors cannot propagate; instead, gated recurrent cells dynamically select and route the most relevant characters and words. The model thus exploits explicit word information without sacrificing character-level sequence processing.
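To make the mechanism concrete, below is a minimal NumPy sketch of the lattice construction and a single lattice LSTM step. This is an illustrative simplification, not the authors' released implementation: the dimensionality, random weight matrices, placeholder embedding lookup, and the omission of the CRF output layer are all assumptions made for brevity.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
D = 8  # toy hidden/cell dimensionality (assumption, not the paper's setting)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def match_lexicon(sentence, lexicon):
    """Build the lattice: for each end index j, collect (start, word)
    pairs for every lexicon word that ends at character j."""
    ends_at = {j: [] for j in range(len(sentence))}
    for i in range(len(sentence)):
        for j in range(i, len(sentence)):
            if sentence[i:j + 1] in lexicon:
                ends_at[j].append((i, sentence[i:j + 1]))
    return ends_at

# Random stand-ins for trained weights.
W_char = rng.normal(0.0, 0.1, (4 * D, 2 * D))  # i, f, o, g gates
W_word = rng.normal(0.0, 0.1, (3 * D, 2 * D))  # i, f, g gates
W_link = rng.normal(0.0, 0.1, (D, 2 * D))      # per-word input gate

def word_cell(x_word, h_start, c_start):
    """Cell state for a lexicon word, computed from the states of the
    word's first character."""
    z = W_word @ np.concatenate([x_word, h_start])
    i, f = sigmoid(z[:D]), sigmoid(z[D:2 * D])
    g = np.tanh(z[2 * D:])
    return f * c_start + i * g

def lattice_step(x_char, h_prev, c_prev, word_cells):
    """One character step: merge the plain character path with all word
    cells ending here, using softmax-normalised input gates."""
    z = W_char @ np.concatenate([x_char, h_prev])
    i, f, o = sigmoid(z[:D]), sigmoid(z[D:2 * D]), sigmoid(z[2 * D:3 * D])
    g = np.tanh(z[3 * D:])
    if not word_cells:                       # no lexicon match: plain LSTM
        c = f * c_prev + i * g
    else:
        links = [sigmoid(W_link @ np.concatenate([x_char, cw]))
                 for cw in word_cells]
        w = np.exp(np.stack(links + [i]))
        w /= w.sum(axis=0)                   # normalise across all paths
        c = sum(a * cw for a, cw in zip(w[:-1], word_cells)) + w[-1] * g
    return o * np.tanh(c), c
```

A toy forward pass, with random embeddings standing in for trained character/word embeddings; only multi-character matches feed word cells, since single characters form the base path:

```python
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
emb = defaultdict(lambda: rng.normal(0.0, 0.1, D))  # placeholder lookup

lattice = match_lexicon(sentence, lexicon)
h, c = np.zeros(D), np.zeros(D)
hs, cs = [], []
for j, ch in enumerate(sentence):
    # Word cells for lexicon matches ending at j, fed by the already
    # computed states of each word's first character.
    wcells = [word_cell(emb[w], hs[b], cs[b]) for b, w in lattice[j] if b < j]
    h, c = lattice_step(emb[ch], h, c, wcells)
    hs.append(h)
    cs.append(c)
print(len(hs), hs[-1].shape)  # one hidden state per character: 7 (8,)
```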

Experimental Evaluation

Extensive experiments demonstrate the lattice LSTM model's superiority over both character-based and word-based LSTM-CRF baselines. The experiments span multiple datasets, including OntoNotes, MSRA, Weibo, and a newly annotated resume dataset, and the lattice LSTM achieves higher F1 scores on all of them, indicating robustness across domains. On the MSRA dataset, for instance, the model reaches an F1 score of 93.18%, a noteworthy improvement over previous state-of-the-art models.
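For reference, the reported F1 scores follow the standard entity-level convention: a predicted entity counts as correct only if both its span and its type exactly match a gold entity. The following is a small sketch of that metric (the helper name and span format are illustrative, not tied to the paper's evaluation scripts):

```python
def entity_f1(gold, pred):
    """Entity-level F1 over (start, end, type) spans: a prediction is a
    true positive only on an exact boundary-and-type match."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [(0, 2, "PER"), (5, 9, "ORG")]
pred = [(0, 2, "PER"), (5, 8, "ORG")]   # one boundary error
print(entity_f1(gold, pred))             # 0.5
```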

Implications and Future Research

The application of lattice LSTMs to Chinese NER presents both practical and theoretical advancements. Practically, this approach offers a segmentation-free method that reduces the complexity and potential errors of preprocessing in NER tasks, especially beneficial in domains with high segmentation ambiguity. Theoretically, it bridges the gap between character-level precision and word-level context, suggesting promising directions for further research in sequence labeling tasks.

Moreover, the use of gated mechanisms to control the flow of information through complex structures like lattices could inspire analogous methods in other language processing areas, including syntactic parsing and sentiment analysis.

Conclusion

The lattice LSTM model constitutes a significant advancement in Chinese NER by effectively utilizing both character and word information while avoiding segmentation pitfalls. This research lays a foundation for developing more robust NER systems and highlights the potential of leveraging lattice structures in other natural language processing tasks. Future exploration could integrate semi-supervised techniques, such as language modeling, to further enhance the model's performance and adaptability to diverse datasets.
