
SEMv3: A Fast and Robust Approach to Table Separation Line Detection (2405.11862v1)

Published 20 May 2024 in cs.CV

Abstract: Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parsing table structure, in which table separation line detection is crucial. However, wireless and deformed tables make this detection difficult. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines. During the split stage, we introduce a Keypoint Offset Regression (KOR) module, which effectively detects table separation lines by directly regressing the offset of each line relative to its keypoint proposals. Moreover, in the merge stage, we define a series of merge actions to efficiently describe the table structure based on table grids. Extensive ablation studies demonstrate that our proposed KOR module can detect table separation lines quickly and accurately. Furthermore, on public datasets (e.g., WTW, ICDAR-2019 cTDaR Historical and iFLYTAB), SEMv3 achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/Chunchunwumu/SEMv3.
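
As a rough illustration of the idea of regressing a line's offsets relative to a keypoint, the sketch below rebuilds one row separation line from a single keypoint and a list of vertical offsets. It is not taken from the SEMv3 code: the function name `decode_row_line`, the choice of sampling the line at fixed x positions, and the sample numbers are all assumptions made for this example.

```python
from typing import List, Tuple


def decode_row_line(
    keypoint_y: float,
    xs: List[float],
    offsets: List[float],
) -> List[Tuple[float, float]]:
    """Rebuild a row separation line as a polyline (hypothetical helper).

    keypoint_y: y coordinate of the keypoint proposed for this line.
    xs:         x positions at which the line is sampled (assumed layout).
    offsets:    regressed vertical offset of the line from the keypoint
                at each sampled x position.
    """
    if len(xs) != len(offsets):
        raise ValueError("xs and offsets must have the same length")
    # Each point on the line is the keypoint height plus the local offset.
    return [(x, keypoint_y + dy) for x, dy in zip(xs, offsets)]


# A slightly curved separator, as might appear in a deformed table.
line = decode_row_line(100.0, [0.0, 50.0, 100.0], [0.0, 2.5, -1.0])
print(line)
```

With all offsets equal to zero the result is a straight horizontal line through the keypoint; non-zero offsets let the same representation follow a bent or skewed separator.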
