Instruct and Extract: Instruction Tuning for On-Demand Information Extraction (2310.16040v1)

Published 24 Oct 2023 in cs.CL and cs.AI

Abstract: Large language models (LLMs) with instruction-following capabilities open the door to a wider group of users. However, when it comes to information extraction, a classic task in natural language processing, most task-specific systems cannot align well with long-tail, ad hoc extraction use cases for non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, to fulfill the personalized demands of real-world users. Our task aims to follow the instructions to extract the desired content from the associated text and present it in a structured tabular format. The table headers can either be user-specified or inferred contextually by the model. To facilitate research in this emerging area, we present a benchmark named InstructIE, comprising automatically generated training data and a human-annotated test set. Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE. Comprehensive evaluations on our benchmark reveal that ODIE substantially outperforms the existing open-source models of similar size. Our code and dataset are available at https://github.com/yzjiao/On-Demand-IE.


Authors (7)

Yizhu Jiao (22 papers)
Ming Zhong (88 papers)
Sha Li (42 papers)
Ruining Zhao (8 papers)
Siru Ouyang (22 papers)
Heng Ji (266 papers)
Jiawei Han (263 papers)

Citations (20)


GitHub

GitHub - yzjiao/On-Demand-IE: Code and dataset for the EMNLP paper titled Instruct and Extract: Instruction Tuning for On-Demand Information Extraction (52 stars)
