
Leveraging Contextual Information for Effective Entity Salience Detection (2309.07990v2)

Published 14 Sep 2023 in cs.CL

Abstract: In text documents such as news articles, the content and key events usually revolve around a subset of all the entities mentioned. These entities, often deemed salient entities, provide useful cues to the aboutness of a document for a reader. Identifying the salience of entities has been found helpful in several downstream applications, such as search, ranking, and entity-centric summarization. Prior work on salient entity detection mainly focused on machine learning models that require heavy feature engineering. We show that fine-tuning medium-sized LLMs with a cross-encoder style architecture yields substantial performance gains over feature engineering approaches. To this end, we conduct a comprehensive benchmarking of four publicly available datasets using models representative of the medium-sized pre-trained LLM family. Additionally, we show that zero-shot prompting of instruction-tuned LLMs yields inferior results, indicating the task's uniqueness and complexity.
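The abstract contrasts two approaches: cross-encoder fine-tuning of pre-trained LMs versus older feature-engineered models. The sketch below is illustrative only and assumes details not stated in the abstract: a plausible cross-encoder input format (entity paired with the document so attention spans both), and a toy frequency/position baseline standing in for the hand-engineered features used in prior work.

```python
# Hedged sketch of the two contrasted approaches; the input format and the
# baseline features are assumptions for illustration, not the paper's exact setup.

def build_cross_encoder_input(document: str, entity: str,
                              sep: str = " [SEP] ") -> str:
    """Pair the target entity with the full document in one sequence, so a
    fine-tuned encoder can attend jointly over both (cross-encoder style),
    unlike a bi-encoder that encodes entity and document separately."""
    return entity + sep + document


def naive_salience_baseline(document: str, entity: str) -> float:
    """Toy stand-in for feature-engineered models (NOT the paper's method):
    entities mentioned often and early in a document tend to be salient."""
    tokens = document.lower().split()
    ent = entity.lower()
    mentions = [i for i, tok in enumerate(tokens) if tok == ent]
    if not mentions:
        return 0.0
    freq = len(mentions) / len(tokens)            # mention frequency feature
    early = 1.0 - mentions[0] / len(tokens)       # first-mention position feature
    return 0.5 * freq + 0.5 * early
```

In a real setup, the string from `build_cross_encoder_input` would be tokenized and fed to a medium-sized encoder (e.g., RoBERTa or DeBERTa, both cited in the references) with a binary classification head over the pooled representation.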
