UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective (2305.10306v3)

Published 17 May 2023 in cs.CL and cs.AI

Abstract: We propose a new paradigm for universal information extraction (IE) that is compatible with any schema format and applicable to a range of IE tasks, such as named entity recognition, relation extraction, event extraction and sentiment analysis. Our approach converts text-based IE tasks into a token-pair problem, which uniformly decomposes all extraction targets into joint span detection, classification and association problems within a unified extractive framework, namely UniEX. UniEX can synchronously encode schema-based prompts and textual information, and collaboratively learn generalized knowledge from pre-defined information using auto-encoder LLMs. We develop a triaffine attention mechanism to integrate heterogeneous factors including tasks, labels and inside tokens, and obtain the extraction targets via a scoring matrix. Experimental results show that UniEX outperforms generative universal IE models in both performance and inference speed on 14 benchmark IE datasets in the supervised setting. The state-of-the-art performance in low-resource scenarios also verifies the transferability and effectiveness of UniEX.
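To make the scoring-matrix idea concrete, here is a minimal sketch of generic triaffine scoring over start-token, end-token and label vectors, followed by threshold decoding of spans. It is not the authors' implementation: the function names `triaffine_scores` and `decode_spans`, the toy vectors, the weight tensor and the threshold are all assumptions made for illustration, and the exact formulation in UniEX may differ.

```python
from itertools import product


def triaffine_scores(starts, ends, labels, weight):
    """Generic triaffine form (illustrative, not the paper's exact layer):
    score[i][j][l] = sum over a, b, c of
        starts[i][a] * weight[a][b][c] * ends[j][b] * labels[l][c]
    """
    dim_s, dim_e, dim_l = len(weight), len(weight[0]), len(weight[0][0])
    scores = [[[0.0] * len(labels) for _ in ends] for _ in starts]
    for i, s in enumerate(starts):
        for j, e in enumerate(ends):
            for l, lab in enumerate(labels):
                scores[i][j][l] = sum(
                    s[a] * weight[a][b][c] * e[b] * lab[c]
                    for a, b, c in product(range(dim_s), range(dim_e), range(dim_l))
                )
    return scores


def decode_spans(scores, threshold=0.0):
    """Return (start, end, label) triples with start <= end and score > threshold."""
    spans = []
    for i, row in enumerate(scores):
        for j, cell in enumerate(row):
            if j < i:
                continue  # a span cannot end before it starts
            for l, value in enumerate(cell):
                if value > threshold:
                    spans.append((i, j, l))
    return spans


# Hypothetical toy inputs: 3 tokens, 2 labels, hidden size 2.
starts = [[1, 0], [0, 1], [0, 0]]   # start-role vector per token
ends = [[0, 0], [1, 0], [0, 1]]     # end-role vector per token
labels = [[1, 0], [0, 1]]           # one vector per schema label
# Weight tensor that is 1 only where all three indices agree.
weight = [[[1.0 if a == b == c else 0.0 for c in range(2)]
           for b in range(2)] for a in range(2)]

scores = triaffine_scores(starts, ends, labels, weight)
spans = decode_spans(scores, threshold=0.5)
print(spans)  # [(0, 1, 0), (1, 2, 1)]
```

In this toy setup, the span covering tokens 0 to 1 is assigned label 0 and the span covering tokens 1 to 2 is assigned label 1. In a real model the three sets of vectors would come from an encoder over the schema prompt and the input text, and the weight tensor would be learned.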

Authors (7)
  1. Ping Yang (83 papers)
  2. Junyu Lu (31 papers)
  3. Ruyi Gan (14 papers)
  4. Junjie Wang (164 papers)
  5. Yuxiang Zhang (104 papers)
  6. Jiaxing Zhang (39 papers)
  7. Pingjian Zhang (9 papers)
Citations (7)