
Detection Masking for Improved OCR on Noisy Documents (2205.08257v1)

Published 17 May 2022 in cs.CV

Abstract: Optical Character Recognition (OCR), the task of extracting textual information from scanned documents, is a vital and broadly used technology for digitizing and indexing physical documents. Existing technologies perform well for clean documents, but when the document is visually degraded, or when there are non-textual elements, OCR quality can be greatly impacted, specifically due to erroneous detections. In this paper, we present an improved detection network with a masking system to improve the quality of OCR performed on documents. By filtering non-textual elements from the image, we can utilize document-level OCR to incorporate contextual information to improve OCR results. We perform a unified evaluation on a publicly available dataset, demonstrating the usefulness and broad applicability of our method. Additionally, we present and make publicly available our synthetic dataset with a unique hard-negative component specifically tuned to improve detection results, and evaluate the benefits that can be gained from its usage.
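
The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a text detector has already produced axis-aligned boxes, and it represents a grayscale page as a list of pixel rows. The name `mask_non_text`, the `(x0, y0, x1, y1)` box format, and the white fill value are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x0, y0, x1, y1), with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def mask_non_text(
    image: Sequence[Sequence[int]],
    text_boxes: Sequence[Box],
    fill: int = 255,
) -> List[List[int]]:
    """Return a copy of `image` keeping only pixels inside `text_boxes`.

    Every pixel outside all boxes is replaced by `fill` (white by default),
    so a document-level OCR engine run afterwards sees no non-textual clutter.
    """
    height = len(image)
    width = len(image[0]) if height else 0

    # Start from a blank page, then copy back the detected text regions.
    masked = [[fill] * width for _ in range(height)]
    for x0, y0, x1, y1 in text_boxes:
        # Clip each box to the image bounds.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        for y in range(y0, y1):
            masked[y][x0:x1] = image[y][x0:x1]
    return masked


page = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
# Pretend the detector marked the top-right 2x2 region as text.
cleaned = mask_non_text(page, [(1, 0, 3, 2)])
```

The masked page, rather than individual crops, would then be passed to the OCR engine, which is what lets a document-level recognizer use the surrounding context.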

Authors (5)
  1. Daniel Rotman (6 papers)
  2. Ophir Azulai (6 papers)
  3. Inbar Shapira (2 papers)
  4. Yevgeny Burshtein (3 papers)
  5. Udi Barzelay (7 papers)
Citations (4)
