Abstractive Summarization of Reddit Posts with Multi-level Memory Networks (1811.00783v2)

Published 2 Nov 2018 in cs.CL

Abstract: We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model. First, we collect the Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit. We use such informal crowd-generated posts as the text source, in contrast with existing datasets that mostly use formal documents such as news articles as the source. Thus, our dataset suffers less from biases in which key sentences are usually located at the beginning of the text and favorable summary candidates already appear in the text in similar forms. Second, we propose a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store information from the text at different levels of abstraction. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show that the Reddit TIFU dataset is highly abstractive and that the MMN outperforms state-of-the-art summarization models.

Abstractive Summarization of Reddit Posts Using Multi-level Memory Networks

In the paper "Abstractive Summarization of Reddit Posts with Multi-level Memory Networks," Byeongchang Kim, Hyunwoo Kim, and Gunhee Kim propose new methodologies and data resources to advance abstractive text summarization. The paper makes a dual contribution: the creation of a large-scale dataset from Reddit posts and the introduction of a novel model architecture.

Traditional summarization datasets mostly consist of structured, formal documents such as news articles. These sources carry strong extractive biases: key sentences tend to appear at the beginning of a document, so extractive models can perform well by relying on positional cues alone (the "lead" baseline sketched below captures exactly this), and reference summaries often appear in the source in near-identical form. Recognizing this limitation, the authors collected data from the informal, diverse user-generated content of Reddit's TIFU subreddit. This choice marks an important shift, yielding a corpus that challenges extractive methods because it lacks structural homogeneity and contains few source sentences that closely resemble the reference summaries. The resulting Reddit TIFU dataset comprises 122,933 post-summary pairs, with each post accompanied by long and short summaries written by the original posters.
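
To illustrate the positional bias at issue, consider a "lead-k" baseline that simply copies the first k sentences of the source as the summary; such a baseline is strong on news corpora but weak on Reddit TIFU by construction. The sketch below is purely illustrative (the function name and the naive sentence splitter are assumptions, not code from the paper):

```python
import re

def lead_k_summary(post: str, k: int = 3) -> str:
    """Copy the first k sentences of a post as an 'extractive' summary."""
    # Naive split on terminal punctuation; good enough for illustration.
    sentences = re.split(r"(?<=[.!?])\s+", post.strip())
    return " ".join(sentences[:k])

post = ("So this happened yesterday. I tried to trim my own hair. "
        "Long story short, I now own several hats. "
        "My roommate cannot stop laughing.")
print(lead_k_summary(post, k=2))
# -> "So this happened yesterday. I tried to trim my own hair."
```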

To summarize such an abstractive and varied dataset, the authors propose the Multi-level Memory Networks (MMN) model, which departs from the prevalent RNN-based seq2seq architectures. The MMN builds a multi-level memory that stores representations of the source text at several levels of abstraction: word, sentence, paragraph, and document. The memory is constructed with dilated convolutions augmented by normalized gated tanh units, allowing the model to retrieve and analyze text at varying granularities without losing context over long sequences, a known limitation of traditional RNNs and their variants.
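
To make the memory construction concrete, here is a minimal PyTorch sketch of the kind of building block described: a dilated 1-D convolution with a gated tanh unit followed by layer normalization, stacked so that each layer's output serves as one memory level with a growing receptive field. Class names, dilation rates, and other details are illustrative assumptions, not the authors' released implementation:

```python
import torch
import torch.nn as nn

class GatedConvLayer(nn.Module):
    """Dilated 1-D convolution with a normalized gated tanh unit (sketch)."""
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        pad = (kernel_size - 1) * dilation // 2       # keep sequence length
        # One conv produces both the content and the gate, split in forward().
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size,
                              padding=pad, dilation=dilation)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                      # x: (batch, channels, length)
        h, g = self.conv(x).chunk(2, dim=1)
        out = torch.tanh(h) * torch.sigmoid(g)  # gated tanh unit
        out = self.norm(out.transpose(1, 2)).transpose(1, 2)
        return x + out                          # residual connection

class MultiLevelEncoder(nn.Module):
    """Stack of dilated conv layers; each output is kept as one memory level."""
    def __init__(self, channels: int, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.layers = nn.ModuleList(
            GatedConvLayer(channels, dilation=d) for d in dilations)

    def forward(self, x):
        memories = []
        for layer in self.layers:
            x = layer(x)
            memories.append(x)   # word- to document-scale representations
        return memories

# Usage: embed tokens, run the encoder, let a decoder attend to each level.
enc = MultiLevelEncoder(channels=128)
token_embeddings = torch.randn(2, 128, 50)   # (batch, channels, seq_len)
memory_levels = enc(token_embeddings)        # 4 tensors, each (2, 128, 50)
```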

This ability to capture long-range dependencies at multiple scales translates into strong empirical results. On Reddit TIFU, Newsroom-Abs, and XSum, the MMN outperforms leading abstractive summarization baselines such as the pointer-generator (PG), DRGD, and SEASS in terms of ROUGE scores and perplexity. Notably, the MMN shows significant improvement on datasets known to be highly abstractive, affirming its versatility and robust design for handling informal, varied text.
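
For context, ROUGE comparisons like these can be reproduced with the open-source `rouge-score` package; the snippet below is a minimal sketch of the metric itself (an illustrative assumption, not the paper's evaluation script):

```python
# pip install rouge-score
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
reference = "i spilled coffee on my laptop right before a deadline"
prediction = "spilled coffee on my laptop just before the deadline"
scores = scorer.score(reference, prediction)
for name, s in scores.items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```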

This approach opens promising avenues for summarization models that adapt to less structured, non-standard text sources, extending AI-driven summarization beyond traditional media formats. The paper's insights suggest future applications of MMN-style architectures in other informal digital contexts, such as forums and social media platforms. Future work could refine the convolutional mechanisms, investigate adaptive memory levels, or integrate deeper semantic understanding to produce tighter summaries with improved coherence and fluency.

Overall, this research marks a significant stride towards resolving the inherent biases prevalent in current summarization datasets and methods, proposing a viable path towards truly abstractive summarization in complex and dynamic textual environments such as social media and interactive forums.

Authors (3)
  1. Byeongchang Kim (5 papers)
  2. Hyunwoo Kim (52 papers)
  3. Gunhee Kim (74 papers)
Citations (175)