
PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts (2307.10551v1)

Published 20 Jul 2023 in cs.AI

Abstract: Key Information Extraction (KIE) is a challenging multimodal task that aims to extract structured semantic entity values from visually rich documents. Although significant progress has been made, two major challenges remain. First, existing datasets have relatively fixed layouts and a limited number of semantic entity categories, creating a significant gap between these datasets and complex real-world scenarios. Second, existing methods follow a two-stage pipeline strategy, which may lead to error propagation, and they are difficult to apply when unseen semantic entity categories emerge. To address the first challenge, we propose a new large-scale human-annotated dataset named Complex Layout form for key information EXtraction (CLEX), which consists of 5,860 images with 1,162 semantic entity categories. To address the second challenge, we introduce the Parallel Pointer-based Network (PPN), an end-to-end model that can be applied in zero-shot and few-shot scenarios. PPN leverages the implicit clues between semantic entities to assist extraction, and its parallel extraction mechanism allows it to extract multiple results simultaneously and efficiently. Experiments on the CLEX dataset demonstrate that PPN outperforms existing state-of-the-art methods while also offering much faster inference.
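
The abstract does not give PPN's architecture, so the following is only a generic illustration of what pointer-style span extraction can look like, not the authors' model. It assumes each entity category has a hypothetical start-query and end-query vector, scores every token against them with a dot product, and points at the best start token and the best end token at or after it. Each category is handled independently of the others, which is what would make batched (parallel) extraction possible. All names, vectors, and the scoring rule are assumptions made for this sketch.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def argmax(values: List[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def point_spans(
    tokens: List[Vector],
    start_queries: List[Vector],
    end_queries: List[Vector],
) -> List[Tuple[int, int]]:
    """Return one (start, end) token span per entity category.

    Hypothetical scoring rule: a token's start (end) score for a category
    is its dot product with that category's start (end) query.
    """
    spans = []
    for start_q, end_q in zip(start_queries, end_queries):
        # Point at the token that best matches the start query.
        start = argmax([dot(start_q, t) for t in tokens])
        # Point at the best end token, restricted to positions >= start.
        end = start + argmax([dot(end_q, t) for t in tokens[start:]])
        spans.append((start, end))
    return spans


# Toy data: four tokens with 3-dimensional embeddings (made up).
TOKENS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.5, 1.0],
]
# Two entity categories, each with its own start and end query (made up).
START_QUERIES = [[1.0, 0.0, 0.0], [0.0, -1.0, 1.0]]
END_QUERIES = [[0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]

if __name__ == "__main__":
    print(point_spans(TOKENS, START_QUERIES, END_QUERIES))
```

Because no category's span depends on another's, the loop over categories could be replaced by a single batched matrix product in a tensor library; the plain-Python loop is used here only to keep the sketch dependency-free.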

Authors (9)
  1. Kaiwen Wei (6 papers)
  2. Jie Yao (27 papers)
  3. Jingyuan Zhang (50 papers)
  4. Yangyang Kang (32 papers)
  5. Fubang Zhao (9 papers)
  6. Yating Zhang (21 papers)
  7. Changlong Sun (37 papers)
  8. Xin Jin (285 papers)
  9. Xin Zhang (904 papers)
Citations (3)
