Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes (2310.00558v3)

Published 1 Oct 2023 in cs.CV

Abstract: When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present to the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter, which achieves comparable or superior performance to existing text spotting architectures on both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance.

Authors (4)

Alloy Das (6 papers)
Sanket Biswas (31 papers)
Umapada Pal (80 papers)
Josep Lladós (40 papers)

Tweets

https://twitter.com/sanket10rony/status/1790864690479866315
