
From Ultra-Fine to Fine: Fine-tuning Ultra-Fine Entity Typing Models to Fine-grained (2312.06188v1)

Published 11 Dec 2023 in cs.CL

Abstract: For the task of fine-grained entity typing (FET), because a large number of entity types is used, it is usually considered too costly to manually annotate a training dataset that contains an ample number of examples for each type. A common way to address this problem is to use distantly annotated training data, which contains incorrect labels. However, the performance of models trained solely on such data can be limited by the errors in the automatic annotation. Recently, a few approaches have departed from this conventional way, but without sufficient direct entity typing supervision they may also yield inferior performance. In this paper, we propose a new approach that avoids the need to create distantly labeled data whenever there is a new type schema. We first train an entity typing model that has extremely broad type coverage by using the ultra-fine entity typing data. Then, when a model is needed for a newly designed fine-grained entity type schema, we can simply fine-tune the previously trained model with a small number of examples annotated under this schema. Experimental results show that our approach achieves outstanding performance for FET under the few-shot setting. It can also outperform state-of-the-art weak-supervision-based methods after fine-tuning the model with only a small manually annotated training set.

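The abstract describes a two-stage recipe: first train a typing model with very broad coverage on ultra-fine entity typing (UFET) data, then fine-tune it for any newly designed fine-grained schema using only a handful of annotated examples. The sketch below illustrates that recipe under simplifying assumptions; the encoder choice (bert-base-cased), the [CLS]-based mention representation, the placeholder FINE_TYPES schema, and the type count are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch of the two-stage idea in the abstract (not the authors' code):
# (1) train a broad-coverage typing model on ultra-fine entity typing data,
# (2) reuse its encoder and fine-tune a fresh head for a new fine-grained schema
#     with a small number of manually annotated examples.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class EntityTypingModel(nn.Module):
    """Encoder plus a multi-label type classifier (simplified placeholder architecture)."""
    def __init__(self, num_types, encoder_name="bert-base-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_types)

    def forward(self, input_ids, attention_mask):
        # Use the [CLS] vector of the mention-in-context as the mention representation.
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.classifier(h)  # one logit per type (multi-label)

# Stage 1: train a broad-coverage model on ultra-fine entity typing data
# (the UFET schema has on the order of 10k free-form types; the count is illustrative).
ufet_model = EntityTypingModel(num_types=10331)
# ... train ufet_model with a multi-label loss such as BCE on the UFET training data ...

# Stage 2: when a new fine-grained schema arrives, reuse the trained encoder,
# attach a fresh classification head sized to the new schema, and fine-tune on
# a small manually annotated set (the few-shot setting described in the abstract).
FINE_TYPES = ["/person", "/person/artist", "/organization", "/location"]  # placeholder schema
fet_model = EntityTypingModel(num_types=len(FINE_TYPES))
fet_model.encoder.load_state_dict(ufet_model.encoder.state_dict())

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(fet_model.parameters(), lr=2e-5)

# One illustrative few-shot update on a single annotated example.
batch = tokenizer("He studied painting in Paris before his first exhibition.",
                  return_tensors="pt")
labels = torch.zeros(1, len(FINE_TYPES))
labels[0, FINE_TYPES.index("/person")] = 1.0
labels[0, FINE_TYPES.index("/person/artist")] = 1.0
logits = fet_model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
```

In this reading, only the classification head is schema-specific; the encoder carries over the broad typing knowledge learned from the ultra-fine data, which is what makes fine-tuning on a small annotated set feasible.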
Authors (2)
  1. Hongliang Dai (13 papers)
  2. Ziqian Zeng (32 papers)
Citations (2)