Zero-Shot Cross-Lingual Document-Level Event Causality Identification with Heterogeneous Graph Contrastive Transfer Learning (2403.02893v2)
Abstract: Event Causality Identification (ECI) refers to the detection of causal relations between events in texts. However, most existing studies focus on sentence-level ECI in high-resource languages, leaving the more challenging document-level ECI (DECI) in low-resource languages under-explored. In this paper, we propose a Heterogeneous Graph Interaction Model with Multi-granularity Contrastive Transfer Learning (GIMC) for zero-shot cross-lingual document-level ECI. Specifically, we introduce a heterogeneous graph interaction network to model the long-distance dependencies between events that are scattered over a document. Then, to improve the cross-lingual transferability of causal knowledge learned from the source language, we propose a multi-granularity contrastive transfer learning module to align the causal representations across languages. Extensive experiments show that our framework outperforms the previous state-of-the-art model by 9.4% and 8.2% in average F1 score in the monolingual and multilingual scenarios, respectively. Notably, in the multilingual scenario, our zero-shot framework even exceeds GPT-3.5 with few-shot learning by 24.3% in overall performance.
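The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of the general idea of aligning representations across languages. It assumes a generic InfoNCE-style loss, which is a stand-in and not necessarily the loss used in GIMC. The function names (`info_nce`, `normalize`, `dot`), the temperature value, and the toy vectors are all hypothetical, and the multi-granularity aspect is not modeled here.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(source, target, temperature=0.1):
    """Mean InfoNCE loss, assuming source[i] and target[i] form a positive pair.

    All other target vectors in the batch act as negatives for source[i].
    This is a generic contrastive loss used for illustration only.
    """
    src = [normalize(v) for v in source]
    tgt = [normalize(v) for v in target]
    total = 0.0
    for i, s in enumerate(src):
        logits = [dot(s, t) / temperature for t in tgt]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true counterpart.
        total += log_denominator - logits[i]
    return total / len(src)


# Toy "source-language" and "target-language" representations (hypothetical).
source = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]

print(info_nce(source, aligned))
print(info_nce(source, swapped))
```

Under these assumptions, the loss is smaller when each source vector lies close to its own counterpart than when the counterparts are swapped, which is the behaviour a cross-lingual alignment objective is meant to reward.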