
TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content (2404.10305v2)

Published 16 Apr 2024 in cs.CV

Abstract: The automatic recognition of tabular data in document images is challenging because of the wide range of table styles and complex structures. Tables offer valuable content representation, enhancing the predictive capabilities of systems such as search engines and Knowledge Graphs. The two main problems, table detection (TD) and table structure recognition (TSR), have traditionally been addressed independently. In this research, we propose an end-to-end pipeline that integrates deep learning models, including DETR, CascadeTabNet, and PP-OCRv2, to achieve comprehensive image-based table recognition. This integrated approach handles diverse table styles, complex structures, and image distortions, resulting in improved accuracy and efficiency compared to existing methods such as Table Transformer. Our system performs TD, TSR, and table content recognition (TCR) together, preserving table structures and accurately extracting tabular data from document images. The integration of multiple models addresses the intricacies of table recognition, making our approach a promising solution for image-based table understanding, data extraction, and information retrieval applications. Our proposed approach achieves an IoU of 0.96 and an OCR accuracy of 78%, an improvement of approximately 25% in OCR accuracy over the previous Table Transformer approach.
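
For readers unfamiliar with the IoU figure quoted above, the following is a minimal sketch of intersection over union between a predicted and a ground-truth table bounding box. The abstract does not say how the authors compute or average IoU, so the function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box convention are illustrative assumptions, not the paper's implementation.

```python
from typing import Tuple

# Assumed box convention (not taken from the paper):
# (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def box_iou(pred: Box, truth: Box) -> float:
    """Return intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(pred[0], truth[0])
    iy_min = max(pred[1], truth[1])
    ix_max = min(pred[2], truth[2])
    iy_max = min(pred[3], truth[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - intersection

    # Degenerate (zero-area) inputs are treated as no overlap.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical boxes, for illustration only.
    predicted_table = (10.0, 20.0, 410.0, 220.0)
    annotated_table = (12.0, 18.0, 408.0, 224.0)
    print(f"IoU = {box_iou(predicted_table, annotated_table):.3f}")
```

An IoU of 1.0 means the predicted box coincides exactly with the annotation, and 0.0 means the boxes do not overlap, so a reported value of 0.96 indicates very tight localisation under whatever averaging the authors use.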
