Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
38 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting (2008.00714v5)

Published 3 Aug 2020 in cs.CV

Abstract: Scene text spotting aims to detect and recognize the entire word or sentence with multiple characters in natural images. It is still challenging because ambiguity often occurs when the spacing between characters is large or the characters are evenly spread in multiple rows and columns, making many visually plausible groupings of the characters (e.g. "BERLIN" is incorrectly detected as "BERL" and "IN" in Fig. 1(c)). Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection. The proposed AE TextSpotter has three important benefits. 1) The linguistic representation is learned together with the visual representation in a framework. To our knowledge, it is the first time to improve text detection by using a LLM. 2) A carefully designed language module is utilized to reduce the detection confidence of incorrect text lines, making them easily pruned in the detection stage. 3) Extensive experiments show that AE TextSpotter outperforms other state-of-the-art methods by a large margin. For example, we carefully select a validation set of extremely ambiguous samples from the IC19-ReCTS dataset, where our approach surpasses other methods by more than 4%. The code has been released at https://github.com/whai362/AE_TextSpotter. The image list and evaluation scripts of the validation set have been released at https://github.com/whai362/TDA-ReCTS.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (9)
  1. Wenhai Wang (123 papers)
  2. Xuebo Liu (54 papers)
  3. Xiaozhong Ji (16 papers)
  4. Enze Xie (84 papers)
  5. Ding Liang (39 papers)
  6. Zhibo Yang (43 papers)
  7. Tong Lu (85 papers)
  8. Chunhua Shen (404 papers)
  9. Ping Luo (340 papers)
Citations (22)