
Specificity-Based Sentence Ordering for Multi-Document Extractive Risk Summarization (1909.10393v1)

Published 23 Sep 2019 in cs.CL

Abstract: Risk mining technologies seek to find relevant textual extractions that capture entity-risk relationships. However, when high-volume data sets are processed, a multitude of relevant extractions can be returned, shifting the focus to how best to present the results. We provide the details of a risk mining multi-document extractive summarization system that produces high-quality output by modeling shifts in specificity that are characteristic of well-formed discourses. In particular, we propose a novel selection algorithm that alternates between extracts based on human-curated or expanded autoencoded key terms, which exhibit greater specificity or generality with respect to an entity-risk relationship. Through this extract ordering, and without the need for more complex discourse-aware NLP, we induce felicitous shifts in specificity in the alternating summaries that outperform non-alternating summaries on automatic ROUGE and BLEU scores and on manual understandability and preference evaluations, achieving no statistically significant difference when compared to human-authored summaries.
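The alternating selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `alternate_extracts`, the parameters `max_sentences` and `start_with`, and the assumption that each source arrives as an already-ranked list of sentences are all hypothetical. The abstract does not say how extracts are ranked or which source opens the summary.

```python
from itertools import zip_longest


def alternate_extracts(seed_extracts, expanded_extracts,
                       max_sentences=6, start_with="seed"):
    """Interleave two ranked extract lists, switching source at each step.

    seed_extracts     -- sentences matched by human-curated key terms (assumed ranked)
    expanded_extracts -- sentences matched by expanded key terms (assumed ranked)
    max_sentences     -- hypothetical cap on summary length
    start_with        -- "seed" or "expanded"; which source opens the summary
    """
    if start_with == "seed":
        first, second = seed_extracts, expanded_extracts
    else:
        first, second = expanded_extracts, seed_extracts

    summary = []
    # zip_longest pads the shorter list with None, so once one source runs
    # out the remaining sentences come from the other source only.
    for a, b in zip_longest(first, second):
        for sentence in (a, b):
            if sentence is not None and len(summary) < max_sentences:
                summary.append(sentence)
    return summary


if __name__ == "__main__":
    seed = ["seed-1", "seed-2", "seed-3"]
    expanded = ["expanded-1", "expanded-2"]
    for line in alternate_extracts(seed, expanded, max_sentences=4):
        print(line)
```

A non-alternating baseline, in this framing, would simply draw every sentence from one of the two lists; the abstract reports that the alternating order scores higher than such baselines.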

Citations (2)