DeepTitle -- Leveraging BERT to generate Search Engine Optimized Headlines (2107.10935v1)

Published 22 Jul 2021 in cs.LG

Abstract: Automated headline generation for online news articles is not a trivial task: machine-generated titles need to be grammatically correct, informative, attention-grabbing, and able to generate search traffic without being "clickbait" or "fake news". In this paper we showcase how a pre-trained LLM can be leveraged to create an abstractive news headline generator for the German language. We incorporate state-of-the-art fine-tuning techniques for abstractive text summarization, i.e. we use different optimizers for the encoder and the decoder, where the former is pre-trained and the latter is trained from scratch. We modify the headline generation to incorporate frequently searched keywords relevant for search engine optimization. We conduct experiments on a German news data set and achieve a ROUGE-L-gram F-score of 40.02. Furthermore, we address the limitations of ROUGE for measuring the quality of text summarization by introducing a sentence similarity metric and human evaluation.
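For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of a ROUGE-L style F-score, built on the longest common subsequence (LCS) of tokens shared by a generated headline and a reference headline. It is an illustration under stated assumptions, not the paper's evaluation code: whitespace tokenization, lowercasing, the balanced F1 weighting, and the function names `lcs_length` and `rouge_l_f1` are all assumptions made here. The function returns a value in [0, 1]; scores such as 40.02 are conventionally reported after multiplying by 100.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L style F1 between a candidate and a reference headline.

    Assumptions (not taken from the paper): whitespace tokenization,
    lowercasing, and equal weighting of precision and recall.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical headlines, used only to exercise the function.
    generated = "Bayern gewinnt das Finale in Berlin"
    reference = "Bayern gewinnt Finale in Berlin"
    print(round(rouge_l_f1(generated, reference), 4))
```

Because the score depends only on in-order token overlap, a headline that paraphrases the reference with different words scores low even when the meaning is preserved, which is the kind of limitation the abstract says is addressed with a sentence similarity metric and human evaluation.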

Authors (6)
  1. Cristian Anastasiu (1 paper)
  2. Hanna Behnke (2 papers)
  3. Sarah Lück (1 paper)
  4. Viktor Malesevic (1 paper)
  5. Aamna Najmi (1 paper)
  6. Javier Poveda-Panter (1 paper)
Citations (3)
