Sentence Compression in Spanish driven by Discourse Segmentation and Language Models (1212.3493v2)
Published 14 Dec 2012 in cs.CL and cs.IR
Abstract: Previous work has shown that Automatic Text Summarization (ATS) by sentence extraction can be improved using sentence compression. In this work we present a sentence compression approach guided by sentence-level discourse segmentation and probabilistic language models (LM). The results show that the proposed solution is able to generate coherent summaries with grammatical compressed sentences. The approach is simple enough to be transposed into other languages.