Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types (2402.05158v1)

Published 7 Feb 2024 in cs.CV, cs.AI, and cs.LG

Abstract: This paper presents a Bengali OCR system that combines several capabilities. It reconstructs document layouts while preserving structure, alignment, and images, and it incorporates image and signature detection for accurate extraction. Specialized word segmentation models cater to diverse document types, including computer-composed, letterpress, typewriter, and handwritten documents. The system handles static and dynamic handwritten inputs, recognizes various writing styles, and recognizes Bengali compound characters. Extensive data collection efforts provide a diverse corpus, while advanced technical components optimize character and word recognition. Additional contributions include image, logo, signature, and table recognition, perspective correction, layout reconstruction, and a queuing module for efficient and scalable processing. The system demonstrates outstanding performance in efficient and accurate text extraction and analysis.

Authors (5)
  1. AKM Shahariar Azad Rabby (3 papers)
  2. Hasmot Ali (3 papers)
  3. Md. Majedul Islam (1 paper)
  4. Sheikh Abujar (7 papers)
  5. Fuad Rahman (12 papers)