A Preliminary Empirical Study on Prompt-based Unsupervised Keyphrase Extraction (2405.16571v1)

Published 26 May 2024 in cs.CL

Abstract: Pre-trained LLMs can perform downstream natural language processing tasks by conditioning on human-designed prompts. However, prompt-based approaches often require "prompt engineering": designing candidate prompts by hand through laborious trial and error, which demands human intervention and expertise. This makes constructing a prompt-based keyphrase extraction method challenging. We therefore investigate the effectiveness of different prompts on the keyphrase extraction task to verify the impact of cherry-picked prompts on extraction performance. Extensive experimental results on six benchmark keyphrase extraction datasets and different pre-trained LLMs demonstrate that (1) designing complex prompts is not necessarily more effective than designing simple ones; (2) changing individual keywords in a designed prompt can affect overall performance; and (3) complex prompts achieve better performance than simple prompts on long documents.
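The simple-versus-complex prompt comparison described above can be sketched as two template functions. The exact prompts studied in the paper are not reproduced here; the templates below are hypothetical illustrations of the two design styles, assuming a plain text-completion interface.

```python
def simple_prompt(document: str) -> str:
    """A minimal "simple" prompt: a bare instruction plus the document."""
    return (
        "Extract the keyphrases from the following document.\n\n"
        f"Document: {document}\n\n"
        "Keyphrases:"
    )

def complex_prompt(document: str) -> str:
    """A "complex" prompt adding a role, a task definition, and an output format."""
    return (
        "You are an expert scientific indexer. A keyphrase is a short noun "
        "phrase that captures a core topic of a document. Read the document "
        "below and list its keyphrases, one per line, ordered by importance.\n\n"
        f"Document: {document}\n\n"
        "Keyphrases:"
    )

doc = "Pre-trained language models can extract keyphrases by conditioning on prompts."
print(simple_prompt(doc))
print(complex_prompt(doc))
```

Either string would then be sent to a pre-trained LLM; per the paper's findings, the elaborate second template is not guaranteed to outperform the first except on long documents.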
