
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction (2103.10213v1)

Published 18 Mar 2021 in cs.AI

Abstract: Scanned receipts OCR and key information extraction (SROIE) represent the processes of recognizing text from scanned receipts, extracting key texts from them, and saving the extracted texts to structured documents. SROIE plays a critical role in many document analysis applications and holds great commercial potential, but very few research works and advances have been published in this area. In recognition of the technical challenges, importance and huge commercial potential of SROIE, we organized the ICDAR 2019 competition on SROIE. In this competition, we set up three tasks, namely, Scanned Receipt Text Localisation (Task 1), Scanned Receipt OCR (Task 2) and Key Information Extraction from Scanned Receipts (Task 3). A new dataset with 1000 whole scanned receipt images and annotations was created for the competition. In this report we present the motivation, competition datasets, task definition, evaluation protocol, submission statistics, performance of submitted methods and results analysis.

Citations (264)

Summary

  • The paper introduces a novel dataset of 1,000 scanned receipts and structures three distinct tasks to benchmark OCR and extraction methods.
  • The evaluation employs metrics like mAP and F1-score, demonstrating high accuracy in text localization and recognition despite challenging inputs.
  • The paper highlights the use of ensemble learning and synthetic data, paving the way for further research in robust key information extraction.

Overview of the ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction

The paper delineates the ICDAR2019 competition on Scanned Receipt OCR and Information Extraction (SROIE), addressing a niche yet commercially significant domain in document analysis. The competition sought to catalyze research and development advancements in SROIE, a field critical for document-intensive industries such as finance and taxation, where automatic processing of receipts can significantly enhance efficiency and accuracy. The paper highlights the development and deployment of a new dataset and the structuring of three distinct tasks that served as the focal point of the competition.

Dataset and Tasks

The dataset curated for the competition comprises 1,000 scanned receipt images, reflecting real-world challenges like poor print quality and complex layouts. These images were annotated to facilitate three primary tasks: Scanned Receipt Text Localization, Scanned Receipt OCR, and Key Information Extraction. Each task was designed to test a different aspect of the SROIE workflow and to track progress in OCR technologies.

  • Task 1 involves the spatial localization of receipt text.
  • Task 2 focuses on recognizing the receipt text without prior localization data.
  • Task 3 requires the extraction of key information to create structured outputs from unstructured receipt data.
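To make the Task 3 goal concrete, here is a minimal sketch of turning recognized receipt lines into a structured record with simple heuristics. It is not any participant's method: the field names `date` and `total`, the regular expressions, and the function `extract_fields` are all assumptions chosen for illustration.

```python
import re
from typing import Dict, List, Optional

# Hypothetical patterns: a day/month/year style date and a two-decimal amount.
DATE_RE = re.compile(r"\b(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})\b")
AMOUNT_RE = re.compile(r"(\d+\.\d{2})")


def extract_fields(lines: List[str]) -> Dict[str, Optional[str]]:
    """Build a structured record from OCR text lines (illustrative heuristic)."""
    record: Dict[str, Optional[str]] = {"date": None, "total": None}
    for line in lines:
        # Keep the first date-like string found on the receipt.
        if record["date"] is None:
            match = DATE_RE.search(line)
            if match:
                record["date"] = match.group(1)
        # Take the last amount on any line mentioning "total"; later lines
        # overwrite earlier ones, so a final TOTAL wins over a SUBTOTAL above it.
        if "total" in line.lower():
            amounts = AMOUNT_RE.findall(line)
            if amounts:
                record["total"] = amounts[-1]
    return record


ocr_lines = ["ACME STORE", "Date: 25/12/2018", "SUBTOTAL 10.00", "TOTAL 12.50"]
result = extract_fields(ocr_lines)
```

Rules like these break easily on noisy OCR output or unusual layouts, which is one reason key information extraction remains the hardest of the three tasks.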

Evaluation and Results

The competition participants developed various algorithms to tackle these tasks, which were assessed using standard measures such as mean Average Precision (mAP) and F1-score, ensuring consistent and objective evaluation across submissions. The results showed significant progress on Tasks 1 and 2, with multiple submissions achieving high accuracy. Task 3, key information extraction, proved harder: it exposed the persistent difficulty of identifying the relevant data fields with high precision in potentially noisy text.
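As a rough illustration of how an F1-score can be computed for extracted fields, the sketch below counts a predicted field as correct only when it exactly matches the ground truth. The exact-match criterion and the function name `field_f1` are assumptions for this example and may differ from the competition's official protocol.

```python
from typing import Dict, Optional


def field_f1(predicted: Dict[str, Optional[str]], truth: Dict[str, str]) -> float:
    """F1 over key/value fields, using exact string match (assumed criterion)."""
    # Fields the system actually produced a value for.
    n_predicted = sum(1 for value in predicted.values() if value is not None)
    # Produced fields whose value equals the ground truth.
    n_correct = sum(
        1
        for key, value in predicted.items()
        if value is not None and truth.get(key) == value
    )
    if n_predicted == 0 or len(truth) == 0 or n_correct == 0:
        return 0.0
    precision = n_correct / n_predicted
    recall = n_correct / len(truth)
    return 2 * precision * recall / (precision + recall)


pred = {"date": "25/12/2018", "total": "12.50", "company": None}
gold = {"date": "25/12/2018", "total": "12.60", "company": "ACME"}
score = field_f1(pred, gold)  # precision 1/2, recall 1/3
```

Here one of two predicted fields is right and one of three true fields is recovered, so a single misread digit in the total lowers both precision and recall.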

Methodological Insights

The top-ranking methods demonstrated the emergence of ensemble learning techniques, where different models are combined to enhance performance, particularly in OCR applications. Additionally, the use of synthetic data for training showcases an evolving paradigm among research teams to augment real-world datasets. Task 3 highlighted diverse methodological strategies, employing a mix of heuristics and machine learning models, indicating the nascent stage of this research area and its promising potential for novel solutions.
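One simple way to combine several recognizers is majority voting over their transcriptions of the same text line. The summary does not say how the top entries built their ensembles, so the sketch below is only an assumed, minimal variant; the function name `vote` and the tie-breaking rule are hypothetical.

```python
from collections import Counter
from typing import List


def vote(transcriptions: List[str]) -> str:
    """Return the most common transcription; ties go to the earliest model."""
    if not transcriptions:
        raise ValueError("need at least one transcription")
    counts = Counter(transcriptions)
    best = max(counts.values())
    # Scan in input order so ties are resolved by model order.
    for text in transcriptions:
        if counts[text] == best:
            return text
    raise AssertionError("unreachable: some transcription has the top count")


# Three hypothetical recognizers read the same line; one confuses O with 0.
winner = vote(["TOTAL 12.50", "T0TAL 12.50", "TOTAL 12.50"])
```

Voting on whole lines only helps when models make different mistakes on different lines; finer-grained schemes align the outputs and vote per character.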

Implications and Future Directions

The level of participation in the competition underscores the substantial interest in SROIE from both academia and industry. The outcomes show that while strides have been made in receipt text recognition, there is still ample room for innovation, particularly in reliably extracting structured information.

Looking forward, the paper suggests that continued development of more sophisticated datasets and exploration of hybrid models combining heuristic and machine learning approaches could be beneficial. Further research is encouraged to enhance current algorithms' performance, potentially leveraging advancements in other domains such as NLP and computer vision. Such efforts would not only refine existing applications within the receipt processing field but also bolster a variety of other document analysis applications where automation can significantly enhance operational efficiencies.

In summary, the ICDAR2019 SROIE competition has set a benchmark and provided a fertile ground for research, underscoring the pivotal role competitions play in propelling forward the capabilities of AI technologies in solving practical and large-scale problems.