
Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues (2302.05895v2)

Published 12 Feb 2023 in cs.CL

Abstract: Discourse processing suffers from data sparsity, especially for dialogues. As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs). We investigate multiple tasks for fine-tuning and show that the dialogue-tailored Sentence Ordering task performs best. To locate and exploit discourse information in PLMs, we propose an unsupervised and a semi-supervised method. Our proposals achieve encouraging results on the STAC corpus, with F1 scores of 57.2 and 59.3 for the unsupervised and semi-supervised methods, respectively. When restricted to projective trees, our scores improve to 63.3 and 68.1.
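To make the core idea concrete, below is a minimal sketch: aggregate token-level self-attention from a pre-trained encoder into utterance-to-utterance scores, then decode a dependency tree over utterances. The model choice (bert-base-uncased), the last-layer head averaging, and the greedy predecessor-attachment decoder are illustrative assumptions, not the paper's exact pipeline (the paper additionally fine-tunes with Sentence Ordering and uses Eisner decoding to obtain projective trees).

```python
# Minimal sketch (assumptions noted): derive a discourse dependency tree for a
# dialogue from a PLM's self-attention. Not the paper's exact method.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style encoder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_attentions=True)
model.eval()

def utterance_attention_matrix(utterances, layer=-1):
    """Return an (n x n) matrix of mean attention between utterance spans."""
    enc = tokenizer(" ".join(utterances), return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**enc)
    # Average the chosen layer's attention over all heads: (seq_len, seq_len).
    att = out.attentions[layer][0].mean(dim=0)

    # Approximate each utterance's token span in the concatenated sequence
    # (works for BERT wordpieces on short dialogues; truncation may misalign
    # spans for long inputs).
    spans, cursor = [], 1  # position 0 is [CLS]
    for utt in utterances:
        n_tok = len(tokenizer.tokenize(utt))
        spans.append((cursor, cursor + n_tok))
        cursor += n_tok

    n = len(utterances)
    scores = torch.zeros(n, n)
    for i, (si, ei) in enumerate(spans):
        for j, (sj, ej) in enumerate(spans):
            scores[i, j] = att[si:ei, sj:ej].mean()
    return scores

def greedy_discourse_tree(scores):
    """Attach each utterance to its highest-attended predecessor.

    A simple baseline decoder; restricting heads to predecessors guarantees
    a tree rooted at the first utterance.
    """
    n = scores.size(0)
    return [(int(scores[i, :i].argmax()), i) for i in range(1, n)]

dialogue = [
    "Anyone want to trade wheat?",
    "I have sheep.",
    "Deal, two sheep for one wheat.",
]
edges = greedy_discourse_tree(utterance_attention_matrix(dialogue))
print(edges)  # (head, dependent) utterance-index pairs
```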

