
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration (2207.06717v1)

Published 14 Jul 2022 in cs.CL and cs.MM

Abstract: Building document-grounded dialogue systems has received growing interest, as documents convey a wealth of human knowledge and are common in enterprises. How to comprehend and retrieve information from documents remains a challenging research problem. Previous work ignores the visual properties of documents and treats them as plain text, which leaves the input modality incomplete. In this paper, we propose a Layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents (VRDs), so as to generate accurate responses in dialogue systems. LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents, making it, to the best of our knowledge, the largest VRD-based information extraction dataset. We also develop benchmark methods that extend the token-based LLM to take layout features into account, as humans do. Empirical results show that layout is critical for VRD-based extraction, and a system demonstration verifies that the extracted knowledge can help locate the answers that users care about.

Authors (9)
  1. Zhenyu Zhang (250 papers)
  2. Bowen Yu (89 papers)
  3. Haiyang Yu (109 papers)
  4. Tingwen Liu (45 papers)
  5. Cheng Fu (12 papers)
  6. Jingyang Li (27 papers)
  7. Chengguang Tang (10 papers)
  8. Jian Sun (415 papers)
  9. Yongbin Li (128 papers)
Citations (5)