
Automatic Alignment of Discourse Relations of Different Discourse Annotation Frameworks (2403.20196v2)

Published 29 Mar 2024 in cs.CL

Abstract: Existing discourse corpora are annotated based on different frameworks, which show significant dissimilarities in definitions of arguments and relations and structural constraints. Despite surface differences, these frameworks share basic understandings of discourse relations. The relationship between these frameworks has been an open research question, especially the correlation between relation inventories utilized in different frameworks. Better understanding of this question is helpful for integrating discourse theories and enabling interoperability of discourse corpora annotated under different frameworks. However, studies that explore correlations between discourse relation inventories are hindered by different criteria of discourse segmentation, and expert knowledge and manual examination are typically needed. Some semi-automatic methods have been proposed, but they rely on corpora annotated in multiple frameworks in parallel. In this paper, we introduce a fully automatic approach to address the challenges. Specifically, we extend the label-anchored contrastive learning method introduced by Zhang et al. (2022b) to learn label embeddings during a classification task. These embeddings are then utilized to map discourse relations from different frameworks. We show experimental results on RST-DT (Carlson et al., 2001) and PDTB 3.0 (Prasad et al., 2018).
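The mapping step described in the abstract can be illustrated with a minimal sketch. It assumes that one label embedding per relation has already been learned for each framework, and it aligns relations by cosine similarity. The similarity measure, the function names `cosine` and `map_labels`, and the three-dimensional vectors are all hypothetical choices made for illustration; they are not taken from the paper, and the vectors are made-up values rather than learned embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def map_labels(source, target):
    """For each source label, pick the most similar target label.

    Returns {source_label: (best_target_label, similarity)}.
    """
    mapping = {}
    for s_label, s_vec in source.items():
        best = max(target, key=lambda t_label: cosine(s_vec, target[t_label]))
        mapping[s_label] = (best, cosine(s_vec, target[best]))
    return mapping


# Toy, made-up vectors standing in for learned label embeddings (assumption).
rst_embeddings = {
    "Contrast": [0.9, 0.1, 0.0],
    "Cause": [0.1, 0.9, 0.1],
}
pdtb_embeddings = {
    "Comparison.Contrast": [0.8, 0.2, 0.1],
    "Contingency.Cause": [0.0, 1.0, 0.2],
}

mapping = map_labels(rst_embeddings, pdtb_embeddings)
for rst_label, (pdtb_label, score) in mapping.items():
    print(f"{rst_label} -> {pdtb_label} (cosine {score:.3f})")
```

A nearest-neighbour lookup like this yields a one-to-one guess per source label; inspecting the full similarity matrix instead would expose one-to-many correspondences between the inventories.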

References (49)
  1. Label-embedding for image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(7):1425–1438.
  2. Farah Benamara and Maite Taboada. 2015. Mapping different rhetorical relation annotations: A proposal. In Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics, pages 147–152, Denver, Colorado. Association for Computational Linguistics.
  3. Peter Bourgonje and Olha Zolotarenko. 2019. Toward cross-theory discourse relation annotation. In Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019, pages 7–11, Minneapolis, MN. Association for Computational Linguistics.
  4. Multi-view and multi-task training of RST discourse parsers. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1903–1913, Osaka, Japan. The COLING 2016 Organizing Committee.
  5. Harry Bunt and Rashmi Prasad. 2016. ISO DR-Core (ISO 24617-8): Core concepts for the annotation of discourse relations. In Proceedings 12th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-12), pages 45–54.
  6. Lynn Carlson and Daniel Marcu. 2001. Discourse tagging reference manual. ISI Technical Report ISI-TR-545, 54(2001):56.
  7. Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory. In Proceedings of the Second SIGdial Workshop on Discourse and Dialogue.
  8. Joint learning of hyperbolic label embeddings for hierarchical multi-label classification. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2829–2841, Online. Association for Computational Linguistics.
  9. Christian Chiarcos. 2014. Towards interoperable discourse annotation. Discourse features in the ontologies of linguistic annotation. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pages 4569–4577, Reykjavik, Iceland. European Language Resources Association (ELRA).
  10. Mapping explicit and implicit discourse relations between the RST-DT and the PDTB 3.0. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 344–352, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
  11. How compatible are our discourse annotation frameworks? Insights from mapping RST-DT and PDTB annotations. Dialogue & Discourse, 10(1):87–135.
  12. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  13. Yingxue Fu. 2022. Towards unification of discourse annotation frameworks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 132–142, Dublin, Ireland. Association for Computational Linguistics.
  14. Supervised contrastive learning for pre-trained language model fine-tuning. In International Conference on Learning Representations.
  15. Eduard H Hovy and Elisabeth Maier. 1992. Parsimonious or profligate: How many and which discourse structure relations? Technical report, University of Southern California, Marina del Rey, Information Sciences Institute.
  16. Yin Jou Huang and Sadao Kurohashi. 2021. Extractive summarization considering discourse and coreference relations based on heterogeneous graph. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 3046–3052, Online. Association for Computational Linguistics.
  17. Yangfeng Ji and Jacob Eisenstein. 2014. Representation learning for text-level discourse parsing. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13–24, Baltimore, Maryland. Association for Computational Linguistics.
  18. Yangfeng Ji and Jacob Eisenstein. 2015. One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations. Transactions of the Association for Computational Linguistics, 3:329–344.
  19. Supervised contrastive learning. Advances in Neural Information Processing Systems, 33:18661–18673.
  20. Implicit discourse relation classification: We need to talk about evaluation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5404–5414, Online. Association for Computational Linguistics.
  21. RoBERTa: A robustly optimized BERT pretraining approach.
  22. Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In International Conference on Learning Representations.
  23. William C Mann and Sandra A Thompson. 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243–281.
  24. Daniel Marcu. 1996. Building up rhetorical structure trees. In Proceedings of the National Conference on Artificial Intelligence, pages 1069–1074.
  25. Daniel Marcu. 2000. The Theory and Practice of Discourse Parsing and Summarization. MIT press.
  26. Label embedding using hierarchical structure of labels for Twitter classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 6317–6322, Hong Kong, China. Association for Computational Linguistics.
  27. Karthik Narasimhan and Regina Barzilay. 2015. Machine comprehension with discourse relations. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1253–1262, Beijing, China. Association for Computational Linguistics.
  28. Zero-shot learning with semantic output codes. In Advances in Neural Information Processing Systems, volume 22. Curran Associates, Inc.
  29. Pytorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32.
  30. The Penn Discourse Treebank 2.0 annotation manual.
  31. Rashmi Prasad and Harry Bunt. 2015. Semantic relations in discourse: The current state of ISO 24617-8. In Proceedings of the 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-11), London, UK. Association for Computational Linguistics.
  32. The Penn Discourse TreeBank 2.0. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco. European Language Resources Association (ELRA).
  33. Discourse annotation in the PDTB: The next generation. In Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation, pages 87–97, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
  34. Annotating discourse relations in spoken language: A comparison of the PDTB and CCR frameworks. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 1039–1046, Portorož, Slovenia. European Language Resources Association (ELRA).
  35. Unifying dimensions in coherence relations: How various annotation frameworks are related. Corpus Linguistics and Linguistic Theory.
  36. Tatjana Scheffler and Manfred Stede. 2016. Mapping PDTB-style connective annotation to RST-style discourse annotation. In Proceedings of the 13th Conference on Natural Language Processing, pages 242–247.
  37. Parallel discourse annotations on a corpus of short texts. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 1051–1058, Portorož, Slovenia. European Language Resources Association (ELRA).
  38. Label embedding network: Learning label representation for soft training of deep networks. arXiv preprint arXiv:1710.10393.
  39. Varsha Suresh and Desmond Ong. 2021. Not all negatives are equal: Label-aware contrastive loss for fine-grained text classification. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4381–4394, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
  40. Exploiting discourse relations for sentiment analysis. In Proceedings of COLING 2012: Posters, pages 1311–1320, Mumbai, India. The COLING 2012 Organizing Committee.
  41. Joint embedding of words and labels for text classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2321–2331, Melbourne, Australia. Association for Computational Linguistics.
  42. Bonnie Webber. 2004. D-LTAG: extending lexicalized TAG to discourse. Cognitive Science, 28(5):751–779. 2003 Rumelhart Prize Special Issue Honoring Aravind K. Joshi.
  43. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.
  44. Fusing label embedding into BERT: An efficient improvement for text classification. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 1743–1750, Online. Association for Computational Linguistics.
  45. Multi-task label embedding for text classification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4545–4553, Brussels, Belgium. Association for Computational Linguistics.
  46. Description-enhanced label embedding contrastive learning for text classification. IEEE Transactions on Neural Networks and Learning Systems.
  47. Temperature as uncertainty in contrastive learning. arXiv preprint arXiv:2110.04403.
  48. Use all the labels: A hierarchical multi-label contrastive learning framework. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16660–16669.
  49. Label anchored contrastive learning for language understanding. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1437–1449, Seattle, United States. Association for Computational Linguistics.
