Cross-document Event Identity via Dense Annotation (2109.06417v1)

Published 14 Sep 2021 in cs.CL

Abstract: In this paper, we study the identity of textual events from different documents. While the complex nature of event identity has been studied previously (Hovy et al., 2013), the case of events across documents remains unclear. Prior work on cross-document event coreference has two main drawbacks. First, it restricts annotations to a limited set of event types. Second, it insufficiently tackles the concept of event identity. Such an annotation setup reduces the pool of event mentions and prevents one from considering the possibility of quasi-identity relations. We propose a dense annotation approach for cross-document event coreference, comprising a rich source of event mentions and a dense annotation effort between related document pairs. To this end, we design a new annotation workflow with careful quality control and an easy-to-use annotation interface. In addition to the links, we further collect overlapping event contexts, including time, location, and participants, to shed some light on the relation between identity decisions and context. We present an open-access dataset for cross-document event coreference, CDEC-WN, collected from English Wikinews, and open-source our annotation toolkit to encourage further research on cross-document tasks.

Authors (7)
  1. Adithya Pratapa (10 papers)
  2. Zhengzhong Liu (28 papers)
  3. Kimihiro Hasegawa (5 papers)
  4. Linwei Li (8 papers)
  5. Yukari Yamakawa (2 papers)
  6. Shikun Zhang (82 papers)
  7. Teruko Mitamura (26 papers)
Citations (2)
