HonestBait: Forward References for Attractive but Faithful Headline Generation (2306.14828v1)

Published 26 Jun 2023 in cs.CL and cs.AI

Abstract: Current methods for generating attractive headlines often learn directly from data, basing attractiveness on the number of user clicks and views. Although clicks and views do reflect user interest, they can fail to reveal how much of that interest is raised by the writing style and how much by the event or topic itself. Such approaches can also lead to harmful inventions by over-exaggerating the content, aggravating the spread of false information. In this work, we propose HonestBait, a novel framework that addresses these issues from a different angle: generating headlines using forward references (FRs), a writing technique often used in clickbait. A self-verification process is included during training to avoid spurious inventions. We begin with a preliminary user study of how FRs affect user interest, after which we present PANCO, a new dataset containing pairs of fake news and verified news for attractive but faithful headline generation. Automatic metrics and human evaluations show that our framework yields more attractive results (+11.25% attractiveness compared to human-written verified news headlines) while maintaining high veracity, which helps promote real information in the fight against fake news.
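
The self-verification step described in the abstract can be thought of as an NLI-style faithfulness check on candidate headlines. The snippet below is a minimal sketch of that idea, assuming an off-the-shelf NLI checkpoint (roberta-large-mnli) as a stand-in verifier and an illustrative 0.5 entailment threshold; the paper's actual verifier, thresholds, and integration into training may differ.

```python
# Illustrative self-verification filter (not the paper's exact implementation):
# score each candidate headline against the article with an NLI model and keep
# it only when the article entails the headline.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # assumed stand-in verifier
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def entailment_prob(article: str, headline: str) -> float:
    """Probability that the article (premise) entails the headline (hypothesis)."""
    inputs = tokenizer(article, headline, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    # roberta-large-mnli label order: 0=contradiction, 1=neutral, 2=entailment
    return torch.softmax(logits, dim=-1)[0, 2].item()

def verify(article: str, candidates: list[str], threshold: float = 0.5) -> list[str]:
    """Drop candidate headlines that the article does not entail."""
    return [h for h in candidates if entailment_prob(article, h) >= threshold]
```

During training, such an entailment score could serve as a veracity signal alongside an attractiveness objective, so that headlines the article does not support are filtered out or penalized rather than rewarded for their click appeal.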
