Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer (2202.05508v2)

Published 11 Feb 2022 in cs.CV, cs.CL, and cs.LG

Abstract: End-to-end text spotting methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually maintain a distinct separation between the detection and recognition branches, requiring exact annotations for both tasks. We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting and the first text spotting framework that can be trained in both fully- and weakly-supervised settings. By learning a single latent representation per word detection, and by using a novel loss function based on the Hungarian loss, our method alleviates the need for expensive localization annotations. Trained with only text transcription annotations on real data, our weakly-supervised method achieves performance competitive with previous state-of-the-art fully-supervised methods. When trained in a fully-supervised manner, TextTranSpotter achieves state-of-the-art results on multiple benchmarks.
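
The loss described in the abstract relies on a Hungarian (bipartite) matching between predicted word queries and ground-truth words before per-pair losses are computed. The sketch below illustrates this matching step only; the cost terms, their weights, and the function and variable names are illustrative assumptions, not the paper's actual formulation.

```python
# Minimal sketch of Hungarian matching between N predicted word queries and
# M ground-truth words, in the spirit of set-prediction losses.
# Cost terms and weights here are assumed placeholders.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_queries_to_words(text_nll, det_score, w_text=1.0, w_det=0.5):
    """text_nll: (N, M) negative log-likelihood of each of the M ground-truth
    transcriptions under each of the N query decoders (assumed precomputed).
    det_score: (N,) detection confidence per query.
    Returns index arrays (query_idx, gt_idx) for the minimum-cost pairing."""
    # Lower cost = better recognition likelihood and higher confidence.
    cost = w_text * text_nll - w_det * det_score[:, None]   # (N, M) cost matrix
    query_idx, gt_idx = linear_sum_assignment(cost)          # optimal assignment
    return query_idx, gt_idx

# Example: 5 queries, 2 ground-truth words.
rng = np.random.default_rng(0)
q_idx, g_idx = match_queries_to_words(rng.random((5, 2)), rng.random(5))
print(q_idx, g_idx)  # which query is paired with each ground-truth word
```

Once the assignment is fixed, recognition and detection losses can be computed only on the matched pairs, which is what allows training from transcription-only (weak) supervision.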

Authors (6)
  1. Yair Kittenplon (7 papers)
  2. Inbal Lavi (4 papers)
  3. Sharon Fogel (9 papers)
  4. Yarin Bar (2 papers)
  5. R. Manmatha (31 papers)
  6. Pietro Perona (78 papers)
Citations (47)