Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer (2202.05508v2)

Published 11 Feb 2022 in cs.CV, cs.CL, and cs.LG

Abstract: End-to-end text spotting methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually maintain a distinct separation between the detection and recognition branches, requiring exact annotations for both tasks. We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting and the first text spotting framework that can be trained in both fully- and weakly-supervised settings. By learning a single latent representation per word detection, and by using a novel loss function based on the Hungarian loss, our method alleviates the need for expensive localization annotations. Trained with only text transcription annotations on real data, our weakly-supervised method achieves performance competitive with previous state-of-the-art fully-supervised methods. When trained in a fully-supervised manner, TextTranSpotter achieves state-of-the-art results on multiple benchmarks.
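
The loss described in the abstract relies on a Hungarian (bipartite) matching between predicted word queries and ground-truth words before per-pair losses are computed. The sketch below illustrates this matching step only; the cost terms, their weights, and the function and variable names are illustrative assumptions, not the paper's actual formulation.

```python
# Minimal sketch of Hungarian matching between N predicted word queries and
# M ground-truth words, in the spirit of set-prediction losses.
# Cost terms and weights here are assumed placeholders.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_queries_to_words(text_nll, det_score, w_text=1.0, w_det=0.5):
    """text_nll: (N, M) negative log-likelihood of each of the M ground-truth
    transcriptions under each of the N query decoders (assumed precomputed).
    det_score: (N,) detection confidence per query.
    Returns index arrays (query_idx, gt_idx) for the minimum-cost pairing."""
    # Lower cost = better recognition likelihood and higher confidence.
    cost = w_text * text_nll - w_det * det_score[:, None]   # (N, M) cost matrix
    query_idx, gt_idx = linear_sum_assignment(cost)          # optimal assignment
    return query_idx, gt_idx

# Example: 5 queries, 2 ground-truth words.
rng = np.random.default_rng(0)
q_idx, g_idx = match_queries_to_words(rng.random((5, 2)), rng.random(5))
print(q_idx, g_idx)  # which query is paired with each ground-truth word
```

Once the assignment is fixed, recognition and detection losses can be computed only on the matched pairs, which is what allows training from transcription-only (weak) supervision.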

Authors (6)
  1. Yair Kittenplon (7 papers)
  2. Inbal Lavi (4 papers)
  3. Sharon Fogel (9 papers)
  4. Yarin Bar (2 papers)
  5. R. Manmatha (31 papers)
  6. Pietro Perona (78 papers)
Citations (47)