Filtered Semi-Markov CRF (2311.18028v1)
Abstract: The Semi-Markov CRF (Semi-CRF) has been proposed as an alternative to the traditional Linear Chain CRF for text segmentation tasks such as Named Entity Recognition (NER). Unlike CRF, which treats text segmentation as token-level prediction, Semi-CRF considers segments as the basic unit, making it more expressive. However, Semi-CRF suffers from two major drawbacks: (1) quadratic complexity over sequence length, as it operates on every span of the input sequence, and (2) inferior performance compared to CRF for sequence labeling tasks like NER. In this paper, we introduce Filtered Semi-Markov CRF, a variant of Semi-CRF that addresses these issues by incorporating a filtering step to eliminate irrelevant segments, reducing complexity and search space. Our approach is evaluated on several NER benchmarks, where it outperforms both CRF and Semi-CRF while being significantly faster. The implementation of our method is available on GitHub: https://github.com/urchade/Filtered-Semi-Markov-CRF
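The filter-then-decode idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `filter_spans` and `decode`, the score threshold, and the toy scores are all assumptions made for illustration, and the sketch ignores any interaction between the labels of adjacent segments as well as training. It only shows how discarding low-scoring candidate spans shrinks the search space before a dynamic program picks the best set of non-overlapping segments.

```python
from bisect import bisect_right
from typing import List, Tuple

# (start, end, label, score); `end` is inclusive. Scores are assumed to come
# from some span classifier, which is not modeled here.
Span = Tuple[int, int, str, float]


def filter_spans(spans: List[Span], threshold: float = 0.0) -> List[Span]:
    """Keep only spans whose score exceeds a (hypothetical) threshold."""
    return [s for s in spans if s[3] > threshold]


def decode(spans: List[Span]) -> List[Span]:
    """Pick the highest-scoring set of non-overlapping spans.

    This is weighted interval scheduling: sort by end position, then
    best[i] is the best total score achievable with the first i spans.
    """
    spans = sorted(spans, key=lambda s: s[1])
    ends = [s[1] for s in spans]
    n = len(spans)
    best = [0.0] * (n + 1)
    take = [False] * (n + 1)
    prev = [0] * (n + 1)

    for i, (start, _end, _label, score) in enumerate(spans, 1):
        # p = number of earlier spans that end strictly before `start`.
        p = bisect_right(ends, start - 1, 0, i - 1)
        with_span = best[p] + score
        if with_span > best[i - 1]:
            best[i], take[i], prev[i] = with_span, True, p
        else:
            best[i] = best[i - 1]

    # Walk back through the table to recover the chosen spans.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(spans[i - 1])
            i = prev[i]
        else:
            i -= 1
    return chosen[::-1]


# Toy candidates over a 4-token sentence (scores are made up).
candidates = [
    (0, 1, "PER", 2.0),
    (1, 2, "ORG", 1.5),
    (3, 3, "LOC", 0.7),
    (2, 3, "MISC", -0.4),
    (0, 3, "ORG", 2.5),
]
kept = filter_spans(candidates)
segments = decode(kept)
```

In this toy run the filter drops the negatively scored span, and the decoder then chooses among the four survivors instead of all candidates. With few spans surviving the filter, the decoding cost depends on the number of kept spans rather than on every span of the input.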