
Simple Yet Effective Synthetic Dataset Construction for Unsupervised Opinion Summarization (2303.11660v1)

Published 21 Mar 2023 in cs.CL

Abstract: Opinion summarization provides an important solution for summarizing opinions expressed among a large number of reviews. However, generating aspect-specific and general summaries is challenging due to the lack of annotated data. In this work, we propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries by training on synthetic datasets constructed with aspect-related review contents. Our first approach, Seed Words Based Leave-One-Out (SW-LOO), identifies aspect-related portions of reviews simply by exact-matching aspect seed words and outperforms existing methods by 3.4 ROUGE-L points on SPACE and 0.5 ROUGE-1 points on OPOSUM+ for aspect-specific opinion summarization. Our second approach, Natural Language Inference Based Leave-One-Out (NLI-LOO), identifies aspect-related sentences utilizing an NLI model in a more general setting without using seed words; it outperforms existing approaches by 1.2 ROUGE-L points on SPACE for aspect-specific opinion summarization and remains competitive on other metrics.
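
The abstract gives only the outline of SW-LOO: exact-match aspect seed words to find aspect-related review content, then build synthetic training data in a leave-one-out fashion. The following is a minimal sketch of that idea, not the authors' implementation. The seed words, the naive sentence splitter, the function names, and the choice to use one review's aspect sentences as the pseudo-summary for the remaining reviews are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical seed words; the paper's actual seed lists are not given in the abstract.
SEED_WORDS: Dict[str, List[str]] = {
    "cleanliness": ["clean", "dirty", "spotless"],
    "location": ["location", "walk", "downtown"],
}


def split_sentences(review: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def aspect_sentences(review: str, seeds: List[str]) -> List[str]:
    """Keep sentences containing at least one seed word as an exact token match."""
    seed_set = {w.lower() for w in seeds}
    picked = []
    for sent in split_sentences(review):
        tokens = set(re.findall(r"[a-z']+", sent.lower()))
        if tokens & seed_set:
            picked.append(sent)
    return picked


def leave_one_out_pairs(
    reviews: List[str], seeds: List[str]
) -> List[Tuple[List[str], str]]:
    """Build (source sentences, pseudo-summary) pairs for one entity and one aspect.

    Assumed construction: each review's aspect sentences are held out in turn as
    the pseudo-summary, and the aspect sentences of the other reviews form the input.
    """
    per_review = [aspect_sentences(r, seeds) for r in reviews]
    pairs = []
    for i, target in enumerate(per_review):
        if not target:
            continue  # this review says nothing about the aspect
        source = [s for j, sents in enumerate(per_review) if j != i for s in sents]
        if source:
            pairs.append((source, " ".join(target)))
    return pairs


reviews = [
    "The room was clean. Breakfast was cold.",
    "Very dirty bathroom! Great location.",
    "Staff were friendly.",
]
pairs = leave_one_out_pairs(reviews, SEED_WORDS["cleanliness"])
```

Exact token matching means a word such as "cleaning" would not match the seed "clean", which keeps the filter simple and precise at the cost of recall. A setup without seed words, as in NLI-LOO, would need a different sentence selector.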

Authors (7)
  1. Ming Shen (17 papers)
  2. Jie Ma (205 papers)
  3. Shuai Wang (466 papers)
  4. Yogarshi Vyas (16 papers)
  5. Kalpit Dixit (3 papers)
  6. Miguel Ballesteros (70 papers)
  7. Yassine Benajiba (21 papers)
Citations (5)
