Text Recognition in Real Scenarios with a Few Labeled Samples (2006.12209v1)

Published 22 Jun 2020 in cs.CV

Abstract: Scene text recognition (STR) remains an active research topic in computer vision due to its many applications. Existing works mainly focus on learning a general model from a huge number of synthetic text images to recognize unconstrained scene text, and have achieved substantial progress. However, these methods are not readily applicable in many real-world scenarios where 1) high recognition accuracy is required, while 2) labeled samples are scarce. To tackle this challenging problem, this paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach to build sequence adaptation between the synthetic source domain (with many synthetic labeled samples) and a specific target domain (with only a few real labeled samples). This is done by simultaneously learning each character's feature representation with an attention mechanism and establishing the corresponding character-level latent subspace with adversarial learning. Our approach maximizes the character-level confusion between the source domain and the target domain, thus achieving sequence-level adaptation with even a small number of labeled samples in the target domain. Extensive experiments on various datasets show that our method significantly outperforms the fine-tuning scheme and achieves performance comparable to state-of-the-art STR methods.


Authors (6)

Jinghuang Lin (1 paper)
Zhanzhan Cheng (28 papers)
Fan Bai (38 papers)
Yi Niu (38 papers)
Shiliang Pu (106 papers)
Shuigeng Zhou (81 papers)

Citations (3)



Related Papers