A Probabilistic Framework for Lexicon-based Keyword Spotting in Handwritten Text Images

Published 9 Apr 2021 in cs.IR | (2104.04556v1)

Abstract: Query-by-String Keyword Spotting (KWS) is considered here as a key technology for indexing large collections of handwritten text images to allow fast textual access to their contents. From this perspective, a probabilistic framework for lexicon-based KWS in text images is presented. The presentation aims to provide a tutorial view that helps in understanding the relations between classical statements of KWS and the relative challenges these statements entail. More specifically, the development of the proposed framework makes it evident that word recognition or classification implicitly or explicitly underlies any formulation of KWS. Moreover, it suggests that the same statistical models and training methods successfully used for handwritten text recognition can also be used to advantage for KWS, even though KWS does not generally require or rely on any previously produced image transcripts. These ideas are developed into a specific, probabilistically sound approach for segmentation-free, lexicon-based, query-by-string KWS. Experiments carried out using this approach are presented, which support the consistency and general interest of the proposed framework. Several datasets traditionally used for KWS benchmarking are considered, with results significantly better than those previously published for these datasets. In addition, results on two new, larger handwritten text image datasets are reported, showing the great potential of the proposed methods for indexing and textual search in large collections of handwritten documents.
