Omnidirectional Scene Text Detection with Sequential-free Box Discretization (1906.02371v3)

Published 6 Jun 2019 in cs.CV

Abstract: Scene text in the wild is commonly presented with high variant characteristics. Using quadrilateral bounding box to localize the text instance is nearly indispensable for detection methods. However, recent researches reveal that introducing quadrilateral bounding box for scene text detection will bring a label confusion issue which is easily overlooked, and this issue may significantly undermine the detection performance. To address this issue, in this paper, we propose a novel method called Sequential-free Box Discretization (SBD) by discretizing the bounding box into key edges (KE) which can further derive more effective methods to improve detection performance. Experiments showed that the proposed method can outperform state-of-the-art methods in many popular scene text benchmarks, including ICDAR 2015, MLT, and MSRA-TD500. Ablation study also showed that simply integrating the SBD into Mask R-CNN framework, the detection performance can be substantially improved. Furthermore, an experiment on the general object dataset HRSC2016 (multi-oriented ships) showed that our method can outperform recent state-of-the-art methods by a large margin, demonstrating its powerful generalization ability. Source code: https://github.com/Yuliang-Liu/Box_Discretization_Network.

The paper "Omnidirectional Scene Text Detection with Sequential-free Box Discretization" introduces a method to improve the detection performance of scene text in images with varying orientations and characteristics. Scene text detection has traditionally relied on quadrilateral bounding boxes to localize text, but the authors identify a label confusion issue associated with these bounding boxes that affects detection accuracy.

The confusion arises because a quadrilateral label depends on the order in which its corner points are listed. To remove this sensitivity to label sequence, the authors propose Sequential-free Box Discretization (SBD). Instead of regressing the quadrilateral's points in a fixed order, SBD discretizes the bounding box into "key edges" (KE) that are independent of label order. This makes scene text detection frameworks, including popular ones such as Mask R-CNN, more robust by reducing label-induced confusion.
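The order-independence claim can be illustrated with a minimal sketch. It assumes that key edges can be modeled as the sorted x-coordinates and sorted y-coordinates of the four corner points; the helper name `key_edges` and the sample quadrilateral are hypothetical and are not taken from the authors' code.

```python
from itertools import permutations


def key_edges(quad):
    """Hypothetical helper: split a quadrilateral into sorted x and y values.

    Assumption: key edges are modeled here as the sorted coordinates
    along each axis, which discards the order of the labeled points.
    """
    xs = sorted(x for x, _ in quad)
    ys = sorted(y for _, y in quad)
    return xs, ys


# Illustrative quadrilateral, listed in one arbitrary corner order.
quad = [(10, 20), (110, 35), (105, 70), (5, 55)]
reference = key_edges(quad)

# Every reordering of the same four points yields identical key edges,
# whereas a point-by-point regression target would change each time.
all_same = all(key_edges(list(p)) == reference for p in permutations(quad))
print(reference, all_same)
```

Under this assumption the regression target no longer changes when an annotator starts from a different corner, which is the property the summary attributes to key edges.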

Methodology

SBD is integrated into a detection framework, where it learns to predict key edges that do not depend on the order of the labeled points, so bounding box coordinates can be recovered without sequence dependency. Because the box is split into x-KEs and y-KEs, match-type learning is applied to determine which x-KE pairs with which y-KE so that the box is reconstructed correctly. The authors further refine detection confidence with a rescoring mechanism that considers both classification and localization confidence, using the KE predictions to reduce false positives.
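The matching step can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes four x-KEs and four y-KEs with distinct values, and it represents a match type as a permutation that says which y-KE each x-KE pairs with. The names `encode` and `decode` are hypothetical.

```python
from itertools import permutations


def encode(quad):
    """Return (x_kes, y_kes, match_type) for a quad with distinct coordinates."""
    xs = sorted(x for x, _ in quad)
    ys = sorted(y for _, y in quad)
    y_of_x = {x: y for x, y in quad}
    # For each x-KE in ascending order, store the index of its paired y-KE.
    match_type = tuple(ys.index(y_of_x[x]) for x in xs)
    return xs, ys, match_type


def decode(xs, ys, match_type):
    """Rebuild the corner points from key edges and a match type."""
    return [(x, ys[j]) for x, j in zip(xs, match_type)]


quad = [(10, 20), (110, 35), (105, 70), (5, 55)]
xs, ys, match_type = encode(quad)
rebuilt = decode(xs, ys, match_type)

# With four x-KEs and four y-KEs there are 4! candidate pairings.
num_match_types = len(list(permutations(range(4))))
print(match_type, rebuilt, num_match_types)
```

In this sketch `decode` returns the corners ordered by x-coordinate rather than in drawing order; turning them into a polygon would need one more step, such as sorting the points by angle around their centroid.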

Experimental Results

The authors validate SBD on several prominent scene text detection benchmarks, including ICDAR 2015, MLT, and MSRA-TD500. The reported results show improvements in both recall and precision across these datasets. For instance, the method achieved a harmonic mean (Hmean) of 86.5% on the ICDAR 2015 dataset, outperforming other state-of-the-art methods. The generalization capability of SBD is illustrated through experiments on the HRSC2016 dataset of multi-oriented ships, where SBD achieved substantial improvements over existing detection methods for multi-oriented objects.
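For readers unfamiliar with the metric, Hmean is the harmonic mean of precision and recall (the same quantity as the F1 score). The sketch below shows the calculation; the function name `hmean` is hypothetical, and the input values are illustrative only and are not results from the paper.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not figures reported by the authors.
example = hmean(0.9, 0.8)
print(round(example, 4))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high Hmean by trading away recall for precision or the reverse.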

Implications and Future Directions

Sequential-free Box Discretization addresses label sequence sensitivity, which benefits scene text detection and may also help general object detection tasks where orientation and varying object characteristics pose challenges. Future work might focus on improving computational efficiency and on applying SBD to other object detection frameworks and newer architectures.

By offering a robust alternative to traditional bounding box regression, SBD could improve detection precision in complex image scenarios, supporting applications such as automated reading systems, augmented reality, and human-computer interaction.

Authors (6)
  1. Yuliang Liu (82 papers)
  2. Sheng Zhang (212 papers)
  3. Lianwen Jin (116 papers)
  4. Lele Xie (8 papers)
  5. Yaqiang Wu (12 papers)
  6. Zhepeng Wang (35 papers)
Citations (89)