
Document Summarization with Text Segmentation (2301.08817v1)

Published 20 Jan 2023 in cs.CL

Abstract: In this paper, we exploit the innate document segment structure to improve the extractive summarization task. We build two text segmentation models and find the best strategy for introducing their output predictions into an extractive summarization model. Experimental results on a corpus of scientific articles show that extractive summarization benefits from using a highly accurate segmentation method. In particular, most of the improvement is in documents where the most relevant information is not at the beginning; thus, we conclude that segmentation helps reduce the lead bias problem.

Authors (2)
  1. Lesly Miculicich (15 papers)
  2. Benjamin Han (9 papers)
Citations (3)
