Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
120 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Exploring Domain Shift in Extractive Text Summarization (1908.11664v1)

Published 30 Aug 2019 in cs.CL

Abstract: Although domain shift has been well explored in many NLP applications, it still has received little attention in the domain of extractive text summarization. As a result, the model is under-utilizing the nature of the training data due to ignoring the difference in the distribution of training sets and shows poor generalization on the unseen domain. With the above limitation in mind, in this paper, we first extend the conventional definition of the domain from categories into data sources for the text summarization task. Then we re-purpose a multi-domain summarization dataset and verify how the gap between different domains influences the performance of neural summarization models. Furthermore, we investigate four learning strategies and examine their abilities to deal with the domain shift problem. Experimental results on three different settings show their different characteristics in our new testbed. Our source code including \textit{BERT-based}, \textit{meta-learning} methods for multi-domain summarization learning and the re-purposed dataset \textsc{Multi-SUM} will be available on our project: \url{http://pfliu.com/TransferSum/}.

Citations (35)

Summary

We haven't generated a summary for this paper yet.