
Uncovering the Handwritten Text in the Margins: End-to-end Handwritten Text Detection and Recognition (2303.05929v2)

Published 10 Mar 2023 in cs.CV

Abstract: The pressing need to digitize historical documents has driven strong interest in computerised image processing methods for automatic handwritten text recognition. However, little attention has been paid to the handwritten text written in the margins, i.e. marginalia, which is also an important source of information. Training an accurate and robust recognition system for marginalia calls for data-efficient approaches, because sufficient amounts of annotated multi-writer text are not available. This work therefore presents an end-to-end framework for automatic detection and recognition of handwritten marginalia, and leverages data augmentation and transfer learning to overcome training data scarcity. The detection phase investigates R-CNN and Faster R-CNN networks. The recognition phase uses an attention-based sequence-to-sequence model, with ResNet feature extraction, bidirectional LSTM-based sequence modeling, and attention-based prediction of marginalia. The effectiveness of the proposed framework has been empirically evaluated on data from early book collections held in the Uppsala University Library in Sweden. Source code and pre-trained models are available on GitHub.
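As a rough illustration of the attention-based prediction step named in the abstract, the sketch below computes attention weights and a context vector for a single decoding step over a short sequence of encoder features. It is a minimal, standard-library-only sketch and not the authors' implementation: the dot-product scoring function, the `attend` and `softmax` helper names, and the toy vectors are all assumptions made for illustration. The abstract does not specify the scoring function used in the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """One attention step (hypothetical helper, dot-product scoring assumed).

    decoder_state:  list of floats, the current decoder hidden state.
    encoder_states: list of equally sized float lists, e.g. per-column
                    features that a CNN + BiLSTM encoder might produce.
    Returns (weights, context), where context is the weighted sum of
    the encoder states.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy data (assumed): three encoder positions with 2-dimensional features.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

In a full sequence-to-sequence recognizer, a context vector of this kind would typically be combined with the decoder state to predict the next character, and the step would repeat until an end-of-sequence symbol is produced.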

Authors (4)
  1. Liang Cheng
  2. Jonas Frankemölle
  3. Adam Axelsson
  4. Ekta Vats