Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
60 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
8 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Rethinking Negative Instances for Generative Named Entity Recognition (2402.16602v2)

Published 26 Feb 2024 in cs.CL

Abstract: LLMs have demonstrated impressive capabilities for generalizing in unseen tasks. In the Named Entity Recognition (NER) task, recent advancements have seen the remarkable improvement of LLMs in a broad range of entity domains via instruction tuning, by adopting entity-centric schema. In this work, we explore the potential enhancement of the existing methods by incorporating negative instances into training. Our experiments reveal that negative instances contribute to remarkable improvements by (1) introducing contextual information, and (2) clearly delineating label boundaries. Furthermore, we introduce an efficient longest common subsequence (LCS) matching algorithm, which is tailored to transform unstructured predictions into structured entities. By integrating these components, we present GNER, a Generative NER system that shows improved zero-shot performance across unseen entity domains. Our comprehensive evaluation illustrates our system's superiority, surpassing state-of-the-art (SoTA) methods by 9 $F_1$ score in zero-shot evaluation.

Definition Search Book Streamline Icon: https://streamlinehq.com
References (48)
  1. Gpt-4 technical report. arXiv preprint arXiv:2303.08774.
  2. Contextual string embeddings for sequence labeling. In Proceedings of the 27th international conference on computational linguistics, pages 1638–1649.
  3. Polyglot-ner: Massive multilingual named entity recognition. In Proceedings of the 2015 SIAM International Conference on Data Mining, pages 586–594. SIAM.
  4. Crossroads, buildings and neighborhoods: A dataset for fine-grained location recognition. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3329–3339, Seattle, United States. Association for Computational Linguistics.
  5. Jason PC Chiu and Eric Nichols. 2016. Named entity recognition with bidirectional lstm-cnns. Transactions of the Association for Computational Linguistics, 4:357–370.
  6. Scaling instruction-finetuned language models. arXiv preprint arXiv:2210.11416.
  7. Broad Twitter corpus: A diverse named entity recognition resource. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1169–1179, Osaka, Japan. The COLING 2016 Organizing Committee.
  8. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  9. Ncbi disease corpus: a resource for disease name recognition and concept normalization. Journal of biomedical informatics, 47:1–10.
  10. Spanner: Named entity re-/recognition as span prediction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 7183–7195.
  11. The pile: An 800gb dataset of diverse text for language modeling. arXiv preprint arXiv:2101.00027.
  12. Findvehicle and vehiclefinder: A ner dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system. arXiv preprint arXiv:2304.10893.
  13. Bidirectional lstm-crf models for sequence tagging. arXiv preprint arXiv:1508.01991.
  14. James W Hunt and Thomas G Szymanski. 1977. A fast algorithm for computing longest common subsequences. Communications of the ACM, 20(5):350–353.
  15. The chemdner corpus of chemicals and drugs and its annotation principles. Journal of cheminformatics, 7(1):1–17.
  16. Aman Kumar and Binil Starly. 2022. “fabner”: information extraction from manufacturing process science domain literature using named entity recognition. Journal of Intelligent Manufacturing, 33(8):2393–2407.
  17. Evaluating chatgpt’s information extraction capabilities: An assessment of performance, explainability, calibration, and faithfulness. arXiv preprint arXiv:2304.11633.
  18. Biocreative v cdr task corpus: a resource for chemical disease relation extraction. Database, 2016.
  19. Unified named entity recognition as word-word relation classification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 10965–10973.
  20. A unified mrc framework for named entity recognition. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5849–5859.
  21. Rethinking negative sampling for handling missing entity annotations. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7188–7197.
  22. Empirical analysis of unlabeled entity problem in named entity recognition. In International Conference on Learning Representations.
  23. Asgard: A portable architecture for multilingual dialogue systems. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 8386–8390. IEEE.
  24. Crossner: Evaluating cross-domain named entity recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 13452–13460.
  25. The flan collection: Designing data and methods for effective instruction tuning. arXiv preprint arXiv:2301.13688.
  26. Ilya Loshchilov and Frank Hutter. 2018. Decoupled weight decay regularization. In International Conference on Learning Representations.
  27. Coarse-to-fine pre-training for named entity recognition. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6345–6354.
  28. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35:27730–27744.
  29. Cross-lingual name tagging and linking for 282 languages. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1946–1958, Vancouver, Canada. Association for Computational Linguistics.
  30. Sampo Pyysalo and Sophia Ananiadou. 2014. Anatomical entity mention recognition at literature scale. Bioinformatics, 30(6):868–875.
  31. A stack-propagation framework with token-level intent detection for spoken language understanding. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2078–2087.
  32. Code llama: Open foundation models for code. arXiv preprint arXiv:2308.12950.
  33. Gollie: Annotation guidelines improve zero-shot information-extraction. arXiv preprint arXiv:2310.03668.
  34. Erik Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the conll-2003 shared task: Language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, pages 142–147.
  35. Overview of biocreative ii gene mention recognition. Genome biology, 9(2):1–19.
  36. WikiNEuRal: Combined neural and knowledge-based silver data creation for multilingual NER. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2521–2533, Punta Cana, Dominican Republic. Association for Computational Linguistics.
  37. Simone Tedeschi and Roberto Navigli. 2022. MultiNERD: A multilingual, multi-genre and fine-grained dataset for named entity recognition (and disambiguation). In Findings of the Association for Computational Linguistics: NAACL 2022, pages 801–812, Seattle, United States. Association for Computational Linguistics.
  38. Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts. In The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online. Association for Computational Linguistics.
  39. Instructuie: Multi-task instruction tuning for unified information extraction. arXiv preprint arXiv:2304.08085.
  40. Crossweigh: Training named entity tagger from imperfect annotations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5157–5166.
  41. Finetuned language models are zero-shot learners. In International Conference on Learning Representations.
  42. Zero-shot information extraction via chatting with chatgpt. arXiv preprint arXiv:2302.10205.
  43. Ontonotes release 5.0 ldc2013t19. Linguistic Data Consortium, Philadelphia, PA, 23:170.
  44. Sustainable ai: Environmental implications, challenges and opportunities. Proceedings of Machine Learning and Systems, 4:795–813.
  45. A unified generative framework for various ner subtasks. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5808–5822.
  46. Named entity recognition as dependency parsing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6470–6476.
  47. Gliner: Generalist model for named entity recognition using bidirectional transformer. arXiv preprint arXiv:2311.08526.
  48. Universalner: Targeted distillation from large language models for open named entity recognition. arXiv preprint arXiv:2308.03279.
User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (6)
  1. Yuyang Ding (13 papers)
  2. Juntao Li (89 papers)
  3. Pinzheng Wang (7 papers)
  4. Zecheng Tang (19 papers)
  5. Bowen Yan (24 papers)
  6. Min Zhang (630 papers)
Citations (9)
Github Logo Streamline Icon: https://streamlinehq.com