
Disentangling Specificity for Abstractive Multi-document Summarization (2406.00005v1)

Published 12 May 2024 in cs.IR and cs.AI

Abstract: Multi-document summarization (MDS) generates a summary from a document set. Each document in a set describes topic-relevant concepts, yet each also contains content unique to itself. However, this document specificity receives little attention from existing MDS approaches, and neglecting the specific information in each document limits the comprehensiveness of the generated summaries. To address this problem, we propose to disentangle the specific content from the documents in a document set. The document-specific representations are learned by a specific representation learner and encouraged to be distant from each other via a proposed orthogonal constraint. We provide extensive analysis and find that specific information and document set representations contribute distinctive strengths, and that their combination yields a more comprehensive solution for MDS. We also find that the common (i.e. shared) information contributes little to overall performance under the MDS setting. Implementation code is available at https://github.com/congboma/DisentangleSum.
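The abstract does not spell out the form of the orthogonal constraint. A minimal sketch of one common formulation — penalizing pairwise cosine similarity among document-specific representation vectors so they stay mutually distant — might look like this (the function name and exact penalty form are illustrative assumptions, not the paper's definition):

```python
import numpy as np

def orthogonality_penalty(reps: np.ndarray) -> float:
    """Illustrative orthogonality penalty over document-specific vectors.

    reps: (n_docs, dim) array, one representation per document.
    Returns the sum of squared pairwise cosine similarities
    (self-similarities excluded); minimizing this pushes the
    representations toward mutual orthogonality.
    """
    # L2-normalize each document representation.
    normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    # Gram matrix of pairwise cosine similarities.
    gram = normed @ normed.T
    # Zero out the diagonal (each vector's similarity with itself).
    off_diag = gram - np.eye(reps.shape[0])
    return float(np.sum(off_diag ** 2))
```

With two orthogonal vectors the penalty is zero; with two identical vectors it is maximal, so adding this term to a training loss would encourage the learned document-specific representations to diverge.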

Authors (5)
  1. Congbo Ma (23 papers)
  2. Wei Emma Zhang (46 papers)
  3. Hu Wang (79 papers)
  4. Haojie Zhuang (3 papers)
  5. Mingyu Guo (53 papers)