The distribution of discourse relations within and across turns in spontaneous conversation (2307.03645v1)

Published 7 Jul 2023 in cs.CL

Abstract: Time pressure and topic negotiation may impose constraints on how people leverage discourse relations (DRs) in spontaneous conversational contexts. In this work, we adapt a system of DRs for written language to spontaneous dialogue using crowdsourced annotations from novice annotators. We then test whether discourse relations are used differently across several types of multi-utterance contexts. We compare the patterns of DR annotation within and across speakers and within and across turns. Ultimately, we find that different discourse contexts produce distinct distributions of discourse relations, with single-turn annotations creating the most uncertainty for annotators. Additionally, we find that the discourse relation annotations are of sufficient quality to predict from embeddings of discourse units.
