LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model (2304.06248v1)

Published 13 Apr 2023 in cs.CL

Abstract: Recent work has revealed the great potential of universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM), where various IE predictions are unified into a linearized hierarchical expression decoded by the GLM. Syntactic structure information, an effective feature that has been extensively exploited by the IE community, should also benefit UIE. In this work, we propose a novel structure-aware GLM that fully unleashes the power of syntactic knowledge for UIE. A heterogeneous structure inductor is explored to unsupervisedly induce rich heterogeneous structural representations by post-training an existing GLM. In particular, a structural broadcaster is devised to compact various latent trees into explicit high-order forests, helping to guide better generation during decoding. We finally introduce a task-oriented structure fine-tuning mechanism that further adjusts the learned structures to best fit the needs of the end task. Over 12 IE benchmarks across 7 tasks, our system shows significant improvements over the baseline UIE system. Further in-depth analyses show that our GLM learns rich task-adaptive structural bias that greatly resolves the central UIE difficulties: long-range dependency and boundary identification. Source code is available at https://github.com/ChocoWu/LasUIE.
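
To make the setup concrete, here is a minimal sketch of the generative-UIE pattern the abstract describes: a sequence-to-sequence GLM reads raw text plus a task prompt and decodes a linearized hierarchical expression encoding the extracted structure. The sketch uses a HuggingFace T5 backbone and an invented bracket schema purely for illustration; it is not the LasUIE code, its structure-aware modules, or its actual output format (see the linked repository for those).

```python
# Illustrative sketch of generative UIE: a seq2seq GLM maps input text to a
# linearized hierarchical expression representing the extracted structure.
# Stand-in backbone and bracket schema only; NOT the LasUIE implementation,
# and an off-the-shelf t5-base will not emit this schema without fine-tuning.
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_NAME = "t5-base"  # placeholder GLM backbone

tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

def extract(text: str, task_prompt: str) -> str:
    """Generate a linearized structure, e.g. for relation extraction:
    '( person: Steve Jobs ( founded: Apple ) )' (hypothetical schema)."""
    inputs = tokenizer(task_prompt + " " + text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(extract("Steve Jobs founded Apple in 1976.", "relation extraction:"))
```

Different IE tasks (entity, relation, event extraction) share this one text-to-structure interface and differ only in the task prompt and the target expressions used for fine-tuning, which is what allows a single GLM to cover all of them.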

Authors (9)
  1. Hao Fei (105 papers)
  2. Shengqiong Wu (36 papers)
  3. Jingye Li (15 papers)
  4. Bobo Li (23 papers)
  5. Fei Li (233 papers)
  6. Libo Qin (77 papers)
  7. Meishan Zhang (70 papers)
  8. Min Zhang (630 papers)
  9. Tat-Seng Chua (360 papers)
Citations (66)
