An Effective Transition-based Model for Discontinuous NER (2004.13454v1)
Published 28 Apr 2020 in cs.CL
Abstract: Unlike widely used Named Entity Recognition (NER) data sets in generic domains, biomedical NER data sets often contain mentions consisting of discontinuous spans. Conventional sequence tagging techniques encode Markov assumptions that are efficient but preclude recovery of these mentions. We propose a simple, effective transition-based model with generic neural encoding for discontinuous NER. Through extensive experiments on three biomedical data sets, we show that our model can effectively recognize discontinuous mentions without sacrificing the accuracy on continuous mentions.
- Xiang Dai
- Sarvnaz Karimi
- Ben Hachey
- Cecile Paris
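
The abstract's claim that conventional sequence tagging precludes recovery of discontinuous mentions can be illustrated with a small sketch. It is not the authors' model. It assumes a plain BIO tagging scheme and a hypothetical example phrase, "muscle pain and fatigue", with two hypothetical gold mentions: "muscle pain" (contiguous) and "muscle fatigue" (discontinuous, sharing the token "muscle"). The helper names `bio_encode` and `bio_decode` are made up for this illustration.

```python
def bio_encode(n_tokens, mentions):
    """Encode mentions (sorted lists of token indices) as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for mention in mentions:
        for k, i in enumerate(mention):
            # Each token can hold only one tag, so the first mention to claim it wins.
            if tags[i] == "O":
                tags[i] = "B" if k == 0 else "I"
    return tags


def bio_decode(tags):
    """Decode BIO tags back into mentions; only contiguous spans can come out."""
    mentions, current = [], []
    for i, tag in enumerate(tags):
        if tag == "B":
            if current:
                mentions.append(current)
            current = [i]
        elif tag == "I" and current:
            current.append(i)
        else:
            # "O", or an "I" with no open span: close whatever span is open.
            if current:
                mentions.append(current)
            current = []
    if current:
        mentions.append(current)
    return mentions


# Hypothetical example, not taken from the paper's data sets.
tokens = ["muscle", "pain", "and", "fatigue"]
gold = [[0, 1], [0, 3]]  # "muscle pain" and the discontinuous "muscle fatigue"

tags = bio_encode(len(tokens), gold)
recovered = bio_decode(tags)

print(tags)       # one tag per token
print(recovered)  # mentions that survive the round trip
```

In this sketch the round trip returns only the contiguous mention: the gap token "and" closes the open span, so the discontinuous mention has no representation in the tag sequence. A model that builds mentions through explicit actions, as a transition-based approach does, is not bound to one label per token.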