MAILEX: Email Event and Argument Extraction (2305.13469v2)

Published 22 May 2023 in cs.CL and cs.AI

Abstract: In this work, we present the first dataset, MailEx, for performing event extraction from conversational email threads. To this end, we first propose a new taxonomy covering 10 event types and 76 arguments in the email domain. Our final dataset includes 1.5K email threads and ~4K emails, which are annotated with ~8K event instances in total. To understand the task challenges, we conduct a series of experiments comparing three types of approaches: fine-tuned sequence labeling, fine-tuned generative extraction, and few-shot in-context learning. Our results show that the task of email event extraction is far from solved, owing to challenges such as extracting non-continuous, shared trigger spans, extracting non-named entity arguments, and modeling the email conversational history. Our work thus calls for further investigation of this domain-specific event extraction task.

Authors (7)
  1. Saurabh Srivastava (14 papers)
  2. Gaurav Singh (49 papers)
  3. Shou Matsumoto (3 papers)
  4. Ali Raz (1 paper)
  5. Paulo Costa (4 papers)
  6. Joshua Poore (1 paper)
  7. Ziyu Yao (44 papers)
Citations (3)