
LaMSUM: Creating Extractive Summaries of User Generated Content using LLMs (2406.15809v2)

Published 22 Jun 2024 in cs.CL and cs.LG

Abstract: LLMs have demonstrated impressive performance across a wide range of NLP tasks, including summarization. LLMs inherently produce abstractive summaries by paraphrasing the original text, while the generation of extractive summaries - selecting specific subsets from the original text - remains largely unexplored. LLMs have a limited context window size, restricting the amount of data that can be processed at once. We tackle this challenge by introducing LaMSUM, a novel multi-level framework designed to generate extractive summaries from large collections of user-generated text using LLMs. LaMSUM integrates summarization with different voting methods to achieve robust summaries. Extensive evaluation using four popular LLMs (Llama 3, Mixtral, Gemini, GPT-4o) demonstrates that LaMSUM outperforms state-of-the-art extractive summarization methods. Overall, this work represents one of the first attempts to achieve extractive summarization by leveraging the power of LLMs, and is likely to spark further interest within the research community.
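The abstract describes a multi-level scheme: chunk the collection so each piece fits in an LLM's context window, extract sentences per chunk, then combine several candidate extractions by voting. The sketch below illustrates that general shape only; the chunk size, the Borda-style vote, and the stub `select_sentences` ranker (a length heuristic standing in for an LLM call) are illustrative assumptions, not the paper's actual method.

```python
from collections import defaultdict

def select_sentences(chunk, k):
    # Stand-in for an LLM call that extracts k sentences from a chunk.
    # A trivial longest-first heuristic keeps the sketch runnable.
    return sorted(chunk, key=len, reverse=True)[:k]

def vote(candidate_lists, k):
    # Borda-style voting (an assumed aggregation rule): earlier positions
    # in each candidate list earn more points; top-k sentences win.
    scores = defaultdict(int)
    for cand in candidate_lists:
        for rank, sent in enumerate(cand):
            scores[sent] += len(cand) - rank
    return sorted(scores, key=scores.get, reverse=True)[:k]

def lamsum_sketch(sentences, chunk_size=4, k=2, runs=3):
    # Level 1: summarize fixed-size chunks so each fits a context window.
    partial = []
    for i in range(0, len(sentences), chunk_size):
        partial.extend(select_sentences(sentences[i:i + chunk_size], k))
    # Level 2: aggregate several extraction runs over the partial summary
    # by voting, yielding a more robust final extractive summary.
    candidates = [select_sentences(partial, k) for _ in range(runs)]
    return vote(candidates, k)

posts = ["short post", "a somewhat longer user post", "tiny",
         "the longest user generated post in this toy collection",
         "medium length post here"]
summary = lamsum_sketch(posts)
```

Because the summary is a subset of the original posts (never a paraphrase), the output stays extractive by construction, which is the property the paper targets.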

Authors (5)
  1. Garima Chhikara (3 papers)
  2. Anurag Sharma (6 papers)
  3. V. Gurucharan (4 papers)
  4. Kripabandhu Ghosh (34 papers)
  5. Abhijnan Chakraborty (35 papers)
Citations (1)