Importance Estimation from Multiple Perspectives for Keyphrase Extraction (2110.09749v5)

Published 19 Oct 2021 in cs.CL and cs.IR

Abstract: Keyphrase extraction is a fundamental task in Natural Language Processing that usually consists of two main parts: candidate keyphrase extraction and keyphrase importance estimation. When humans read a document, they typically judge the importance of a phrase by its syntactic accuracy, information saliency, and concept consistency simultaneously. However, most existing keyphrase extraction approaches focus on only some of these perspectives, which leads to biased results. In this paper, we propose a new approach that estimates the importance of keyphrases from multiple perspectives (called KIEMP) and further improves the performance of keyphrase extraction. Specifically, KIEMP estimates the importance of a phrase with three modules: a chunking module to measure its syntactic accuracy, a ranking module to check its information saliency, and a matching module to judge the concept (i.e., topic) consistency between the phrase and the whole document. These three modules are joined in an end-to-end multi-task learning model, which helps the three parts enhance each other and balances the effects of the three perspectives. Experimental results on six benchmark datasets show that KIEMP outperforms the existing state-of-the-art keyphrase extraction approaches in most cases.
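
The multi-task idea in the abstract can be illustrated with a minimal sketch in which one loss per perspective is combined into a single training objective. This is not the paper's implementation: the abstract does not specify the loss functions or how they are weighted, so the binary cross-entropy for chunking, the margin loss for ranking, the scalar placeholder for the matching loss, the weighted sum, and all function names below are assumptions made for illustration.

```python
import math


def chunking_loss(prob, label):
    """Binary cross-entropy for 'is this span a well-formed phrase?' (assumed form)."""
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))


def ranking_loss(pos_score, neg_score, margin=1.0):
    """Margin loss: a true keyphrase should outscore a non-keyphrase (assumed form)."""
    return max(0.0, margin - pos_score + neg_score)


def joint_loss(chunk, rank, match, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three per-perspective losses (weights are hypothetical)."""
    w_chunk, w_rank, w_match = weights
    return w_chunk * chunk + w_rank * rank + w_match * match


# Toy values for one candidate phrase; the matching loss is a placeholder scalar.
chunk = chunking_loss(prob=0.9, label=1)
rank = ranking_loss(pos_score=0.2, neg_score=0.5)
total = joint_loss(chunk, rank, match=0.4)
print(f"chunk={chunk:.4f} rank={rank:.4f} total={total:.4f}")
```

Because the three terms are summed into one objective, a gradient step on `total` would update shared parameters with respect to all three perspectives at once, which is the sense in which the modules can "enhance each other".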
