SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression (2007.08954v2)

Published 17 Jul 2020 in cs.CL, cs.IR, and cs.LG

Abstract: Obtaining training data for multi-document summarization (MDS) is time-consuming and resource-intensive, so recent neural models can only be trained for limited domains. In this paper, we propose SummPip: an unsupervised method for multi-document summarization, in which we convert the original documents to a sentence graph, taking both linguistic and deep representations into account, then apply spectral clustering to obtain multiple clusters of sentences, and finally compress each cluster to generate the final summary. Experiments on Multi-News and DUC-2004 datasets show that our method is competitive with previous unsupervised methods and is even comparable to neural supervised approaches. In addition, human evaluation shows our system produces consistent and complete summaries compared to human-written ones.
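The pipeline described in the abstract can be sketched as: embed sentences, build a similarity graph, cluster it spectrally, and produce one summary sentence per cluster. The snippet below is a minimal sketch of that flow, not the paper's implementation: TF-IDF cosine similarity stands in for SummPip's combined linguistic and deep sentence representations, and picking the most central sentence per cluster stands in for the paper's multi-sentence compression step.

```python
# Sketch of a SummPip-style unsupervised MDS pipeline (simplified stand-in):
# sentence similarity graph -> spectral clustering -> one sentence per cluster.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import SpectralClustering

def summarize(sentences, n_clusters=3):
    # Sentence graph: nodes are sentences, edge weights are cosine similarities
    # between TF-IDF vectors (a proxy for the paper's richer representations).
    tfidf = TfidfVectorizer().fit_transform(sentences)
    affinity = cosine_similarity(tfidf)

    # Spectral clustering on the precomputed affinity (adjacency) matrix.
    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=0
    ).fit_predict(affinity)

    # Stand-in for cluster compression: keep the most central sentence
    # (highest total similarity within its cluster) from each cluster.
    summary = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        centrality = affinity[np.ix_(idx, idx)].sum(axis=1)
        summary.append(sentences[idx[centrality.argmax()]])
    return " ".join(summary)
```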

Authors (8)
  1. Jinming Zhao (26 papers)
  2. Ming Liu (421 papers)
  3. Longxiang Gao (38 papers)
  4. Yuan Jin (24 papers)
  5. Lan Du (46 papers)
  6. He Zhao (117 papers)
  7. He Zhang (236 papers)
  8. Gholamreza Haffari (141 papers)
Citations (62)