
Chem-FINESE: Validating Fine-Grained Few-shot Entity Extraction through Text Reconstruction (2401.10189v4)

Published 18 Jan 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Fine-grained few-shot entity extraction in the chemical domain faces two unique challenges. First, compared with entity extraction tasks in the general domain, sentences from chemical papers usually contain more entities. Moreover, entity extraction models usually have difficulty extracting entities of long-tailed types. In this paper, we propose Chem-FINESE, a novel sequence-to-sequence (seq2seq) based few-shot entity extraction approach, to address these two challenges. Chem-FINESE has two components: a seq2seq entity extractor that extracts named entities from the input sentence, and a seq2seq self-validation module that reconstructs the original input sentence from the extracted entities. Inspired by the observation that a good entity extraction system must extract entities faithfully, our new self-validation module leverages the entity extraction results to reconstruct the original input sentence. In addition, we design a new contrastive loss to reduce excessive copying during the extraction process. Finally, we release ChemNER+, a new fine-grained chemical entity extraction dataset annotated by domain experts with the ChemNER schema. Experiments in few-shot settings on the ChemNER+ and CHEMET datasets show that our proposed framework yields up to 8.26% and 6.84% absolute F1-score gains, respectively.
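The training signal described in the abstract (an extraction loss, a reconstruction-based self-validation loss, and a contrastive term that penalizes excessive copying) can be sketched roughly as follows. This is a minimal illustrative sketch only: the InfoNCE-style contrastive formulation, the loss weights `alpha` and `beta`, and all function names are assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.07):
    """InfoNCE-style contrastive loss over cosine similarities.
    Here the 'positive' stands in for a faithful extraction and the
    'negatives' for over-copied spans; this pairing scheme is assumed."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    pos = np.exp(cos(anchor, positive) / temperature)
    neg = sum(np.exp(cos(anchor, n) / temperature) for n in negatives)
    return -np.log(pos / (pos + neg))

def combined_loss(l_extract, l_reconstruct, l_contrast, alpha=1.0, beta=0.1):
    """Hypothetical weighted sum of the three training signals.
    alpha and beta are illustrative weights, not values from the paper."""
    return l_extract + alpha * l_reconstruct + beta * l_contrast

# Toy vectors standing in for sentence / extraction representations.
rng = np.random.default_rng(0)
anchor = rng.normal(size=8)
positive = anchor + 0.01 * rng.normal(size=8)       # faithful extraction
negatives = [rng.normal(size=8) for _ in range(4)]  # over-copied candidates

l_con = info_nce(anchor, positive, negatives)
total = combined_loss(l_extract=0.9, l_reconstruct=0.4, l_contrast=l_con)
print(total > 1.3)  # extraction + reconstruction terms alone sum to 1.3
```

In practice both components would be seq2seq models (e.g. BART-style encoder-decoders), with the extractor's output serving as the reconstruction module's input; the scalar losses above merely stand in for their respective negative log-likelihoods.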

