
WikiAsp: A Dataset for Multi-domain Aspect-based Summarization (2011.07832v1)

Published 16 Nov 2020 in cs.CL

Abstract: Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different domains (e.g., sentiment, product features), the development of previous models has tended to be domain-specific. In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization. Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation. We propose several straightforward baseline models for this task and conduct experiments on the dataset. Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
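The idea of using section titles and boundaries as a proxy for aspect annotation can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `split_by_section`, the `== Title ==` heading syntax, and the lowercasing of titles are all assumptions made for illustration only.

```python
import re
from typing import Dict, List

# Assumption: level-2 headings are written wiki-style as "== Title ==".
HEADING = re.compile(r"^==\s*([^=]+?)\s*==\s*$")


def split_by_section(article: str) -> Dict[str, str]:
    """Map each section title (treated as an aspect label) to its section text.

    Hypothetical helper: text before the first heading is ignored, and
    sections with no body text are dropped.
    """
    sections: Dict[str, List[str]] = {}
    current = None
    for line in article.splitlines():
        match = HEADING.match(line)
        if match:
            # A heading marks a section boundary; its title becomes the aspect.
            current = match.group(1).lower()
            sections.setdefault(current, [])
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {title: " ".join(lines) for title, lines in sections.items() if lines}


article = (
    "Intro text.\n"
    "== History ==\n"
    "Founded in 1900.\n"
    "Grew fast.\n"
    "== Reception ==\n"
    "Well reviewed.\n"
)
aspect_to_text = split_by_section(article)
```

Under these assumptions, each section body serves as an aspect-specific reference text keyed by its section title.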

Authors (6)
  1. Hiroaki Hayashi (17 papers)
  2. Prashant Budania (1 paper)
  3. Peng Wang (832 papers)
  4. Chris Ackerson (1 paper)
  5. Raj Neervannan (1 paper)
  6. Graham Neubig (342 papers)
Citations (58)
