Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset (2211.00869v1)

Published 2 Nov 2022 in cs.CL

Abstract: Event extraction (EE) is crucial to downstream tasks such as news aggregation and event knowledge graph construction. Most existing EE datasets manually define fixed event types and design a specific schema for each of them, failing to cover the diverse events emerging from online text. Moreover, news titles, an important source of event mentions, have not received enough attention in current EE research. In this paper, we present Title2Event, a large-scale sentence-level dataset for benchmarking Open Event Extraction without restricting event types. Title2Event contains more than 42,000 news titles in 34 topics collected from Chinese web pages. To the best of our knowledge, it is currently the largest manually annotated Chinese dataset for open event extraction. We further conduct experiments on Title2Event with different models and show that the characteristics of titles make event extraction challenging, underscoring the need for further study of this problem. The dataset and baseline code are available at https://open-event-hub.github.io/title2event.

Authors (12)
  1. Haolin Deng
  2. Yanan Zhang
  3. Yangfan Zhang
  4. Wangyang Ying
  5. Changlong Yu
  6. Jun Gao
  7. Wei Wang
  8. Xiaoling Bai
  9. Nan Yang
  10. Jin Ma
  11. Xiang Chen
  12. Tianhua Zhou
Citations (9)
