
Detecting Curve Text in the Wild: New Dataset and New Solution (1712.02170v1)

Published 6 Dec 2017 in cs.CV

Abstract: Scene text detection has made great progress in recent years. Detection approaches have evolved from axis-aligned rectangles to rotated rectangles and further to quadrangles. However, current datasets contain very little curve text, which is widely observed in scene images such as signboards and product names. To draw attention to the problem of reading curve text in the wild, in this paper we construct a curve text dataset named CTW1500, which includes over 10k text annotations in 1,500 images (1,000 for training and 500 for testing). Based on this dataset, we propose a pioneering polygon-based curve text detector (CTD) which can directly detect curve text without empirical combination. Moreover, by seamlessly integrating the recurrent transverse and longitudinal offset connection (TLOC), the proposed method can be trained end to end to learn the inherent connection among the position offsets. This allows the CTD to explore context information instead of predicting points independently, resulting in smoother and more accurate detection. We also propose two simple but effective post-processing methods, named non-polygon suppress (NPS) and polygonal non-maximum suppression (PNMS), to further improve detection accuracy. Furthermore, the proposed approach is designed in a universal manner, so it can also be trained with rectangular or quadrilateral bounding boxes without extra effort. Experimental results on CTW1500 demonstrate that our method, with only a light backbone, can outperform state-of-the-art methods by a large margin. When evaluated only on the curve or non-curve subset, CTD + TLOC still achieves the best results. Code is available at https://github.com/Yuliang-Liu/Curve-Text-Detector.


The paper "Detecting Curve Text in the Wild: New Dataset and New Solution" addresses the detection of curve text in scene images. Existing datasets and methods focus mainly on axis-aligned, rotated, or quadrilateral text regions; the authors respond with a polygon-based detector and a new dataset named CTW1500.

Dataset and Methodology

CTW1500 is built specifically for curve text and contains over 10,000 text annotations across 1,500 images (1,000 for training and 500 for testing). Curve text is common in real scenes, for example on signboards and product names, yet existing datasets contain very little of it. Each label is a 14-point polygon, which can follow a curved text outline more closely than a rectangular or quadrilateral bounding box.
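To make the benefit of a 14-point polygon concrete, the sketch below builds a synthetic arc-shaped text region and compares its area with that of its axis-aligned bounding box. The shape, the split of seven points per long side, and the helper names (`arc_polygon`, `shoelace_area`, `bounding_box`) are assumptions chosen for illustration; they are not taken from the CTW1500 annotation files.

```python
import math

def arc_polygon(cx, cy, r_inner, r_outer, start_deg, end_deg, per_side=7):
    """Outline a curved band with 2 * per_side points (hypothetical helper).

    Points run along the outer edge first, then back along the inner edge,
    so the result is a closed, non-self-intersecting polygon.
    """
    step = (end_deg - start_deg) / (per_side - 1)
    angles = [math.radians(start_deg + i * step) for i in range(per_side)]
    outer = [(cx + r_outer * math.cos(a), cy + r_outer * math.sin(a)) for a in angles]
    inner = [(cx + r_inner * math.cos(a), cy + r_inner * math.sin(a))
             for a in reversed(angles)]
    return outer + inner

def shoelace_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def bounding_box(points):
    """Axis-aligned box (xmin, ymin, xmax, ymax) enclosing the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

# A half-ring of text: inner radius 80, outer radius 100, spanning 180 degrees.
polygon = arc_polygon(0.0, 0.0, 80.0, 100.0, 0.0, 180.0)
xmin, ymin, xmax, ymax = bounding_box(polygon)
box_area = (xmax - xmin) * (ymax - ymin)
poly_area = shoelace_area(polygon)
fill_ratio = poly_area / box_area  # share of the box actually covered by text
print(len(polygon), round(poly_area, 1), round(box_area, 1), round(fill_ratio, 2))
```

For this synthetic shape the polygon covers well under a third of its bounding box, so a rectangular label would consist mostly of background.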

The proposed Curve Text Detector (CTD) is trained on this dataset and detects curve text directly, without relying on empirical combination. It integrates a recurrent transverse and longitudinal offset connection (TLOC), which lets the detector learn the relationships among the position offsets of the polygon points instead of predicting each point independently. This RNN-based connection yields smoother and more accurate localization of curve text regions.

Strong Results and Innovative Techniques

Experimental results on CTW1500 show that CTD outperforms state-of-the-art methods by a large margin while using only a light backbone, a reduced ResNet-50. CTD combined with TLOC also achieves the best results when evaluated separately on the curve and non-curve subsets, which indicates that the approach is not limited to curved instances. Two post-processing techniques, non-polygon suppression (NPS) and polygonal non-maximum suppression (PNMS), further improve detection accuracy by reducing false positives.
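The idea behind polygonal non-maximum suppression can be sketched as ordinary greedy NMS with the box overlap replaced by a polygon overlap. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 0.5 threshold, and the grid-sampled IoU (an approximation used here to avoid external dependencies) are all choices made for this example. An exact polygon-clipping routine would normally replace `polygon_iou`.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test: True if (x, y) lies inside the simple polygon."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def polygon_iou(a, b, step=1.0):
    """Approximate IoU by sampling cell centres on a regular grid."""
    pts = list(a) + list(b)
    xmin, xmax = min(p[0] for p in pts), max(p[0] for p in pts)
    ymin, ymax = min(p[1] for p in pts), max(p[1] for p in pts)
    inter = union = 0
    y = ymin + step / 2
    while y < ymax:
        x = xmin + step / 2
        while x < xmax:
            in_a = point_in_polygon(x, y, a)
            in_b = point_in_polygon(x, y, b)
            inter += in_a and in_b
            union += in_a or in_b
            x += step
        y += step
    return inter / union if union else 0.0

def polygon_nms(polys, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring polygon, drop heavy overlaps."""
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(polygon_iou(polys[i], polys[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections and one separate detection (toy data).
polys = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(1, 0), (11, 0), (11, 10), (1, 10)],
    [(20, 0), (30, 0), (30, 10), (20, 10)],
]
scores = [0.9, 0.8, 0.7]
keep = polygon_nms(polys, scores)
print(keep)
```

The toy polygons are squares only to keep the overlap easy to check by hand; the same functions accept 14-point outlines. The sampling step trades accuracy for speed, so a smaller `step` is advisable for small or thin text regions.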

Implications and Future Work

The research has both practical and theoretical significance. Practically, it supports applications that need accurate scene text detection, such as real-time translation and autonomous systems. Theoretically, it points toward polygon-based representations for scene text detection and invites further work in that direction.

Future work may extend CTW1500 into a comprehensive recognition dataset, as the authors suggest. Detection methods that balance speed and flexibility are another direction that could further improve curve text detection.

In summary, the work contributes a new dataset and a polygon-based detection method that together address a gap in detecting curve text in natural scenes.

Authors (4)
  1. Liu Yuliang (1 paper)
  2. Jin Lianwen (1 paper)
  3. Zhang Shuaitao (1 paper)
  4. Zhang Sheng (5 papers)
Citations (238)