Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
80 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Large Discourse Treebanks from Scalable Distant Supervision (2212.06038v1)

Published 18 Oct 2022 in cs.CL

Abstract: Discourse parsing is an essential upstream task in Natural Language Processing with strong implications for many real-world applications. Despite its widely recognized role, most recent discourse parsers (and consequently downstream tasks) still rely on small-scale human-annotated discourse treebanks, trying to infer general-purpose discourse structures from very limited data in a few narrow domains. To overcome this dire situation and allow discourse parsers to be trained on larger, more diverse and domain-independent datasets, we propose a framework to generate "silver-standard" discourse trees from distant supervision on the auxiliary task of sentiment analysis.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Patrick Huber (146 papers)
  2. Giuseppe Carenini (52 papers)