Investigating Self-Attention Network for Chinese Word Segmentation (1907.11512v1)

Published 26 Jul 2019 in cs.CL

Abstract: Neural network has become the dominant method for Chinese word segmentation. Most existing models cast the task as sequence labeling, using BiLSTM-CRF for representing the input and making output predictions. Recently, attention-based sequence models have emerged as a highly competitive alternative to LSTMs, which allow better running speed by parallelization of computation. We investigate self attention network for Chinese word segmentation, making comparisons between BiLSTM-CRF models. In addition, the influence of contextualized character embeddings is investigated using BERT, and a method is proposed for integrating word information into SAN segmentation. Results show that SAN gives highly competitive results compared with BiLSTMs, with BERT and word information further improving segmentation for in-domain and cross-domain segmentation. Our final models give the best results for 6 heterogenous domain benchmarks.

Citations (11)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Investigating Self-Attention Network for Chinese Word Segmentation (1907.11512v1)

Summary

Related Papers