
ShopSign: a Diverse Scene Text Dataset of Chinese Shop Signs in Street Views (1903.10412v1)

Published 25 Mar 2019 in cs.CV

Abstract: In this paper, we introduce the ShopSign dataset, which is a newly developed natural scene text dataset of Chinese shop signs in street views. Although a few scene text datasets are already publicly available (e.g. ICDAR2015, COCO-Text), there are few images in these datasets that contain Chinese texts/characters. Hence, we collect and annotate the ShopSign dataset to advance research in Chinese scene text detection and recognition. The new dataset has three distinctive characteristics: (1) large-scale: it contains 25,362 Chinese shop sign images, with a total number of 196,010 text-lines. (2) diversity: the images in ShopSign were captured in different scenes, from downtown to developing regions, using more than 50 different mobile phones. (3) difficulty: the dataset is very sparse and imbalanced. It also includes five categories of hard images (mirror, wooden, deformed, exposed and obscure). To illustrate the challenges in ShopSign, we run baseline experiments using state-of-the-art scene text detection methods (including CTPN, TextBoxes++ and EAST), and cross-dataset validation to compare their corresponding performance on the related datasets such as CTW, RCTW and ICPR 2018 MTWI challenge dataset. The sample images and detailed descriptions of our ShopSign dataset are publicly available at: https://github.com/chongshengzhang/shopsign.

Insights into the ShopSign Dataset: A Comprehensive Resource for Chinese Scene Text Detection and Recognition

The paper "ShopSign: a Diverse Scene Text Dataset of Chinese Shop Signs in Street Views" introduces a dataset aimed at advancing Chinese scene text detection and recognition, referred to here as Chinese Photo OCR. The contribution matters because few images in existing public scene text datasets, such as ICDAR2015 and COCO-Text, contain Chinese text. This summary examines the dataset's development, characteristics, and implications for future research.

The ShopSign dataset has three distinguishing characteristics. First, scale: it contains 25,362 Chinese shop sign images with a total of 196,010 text lines, which makes it comparable in scope to previously released Chinese scene text datasets. Second, diversity: the images were captured in varied scenes, from downtown areas to developing regions, using more than 50 different mobile phones, which introduces variation in environmental conditions, text orientations, and backgrounds. Third, difficulty: the dataset is sparse and imbalanced, and it includes five categories of hard images (mirror, wooden, deformed, exposed, and obscure) that reflect conditions found in real-world scenes.

To evaluate the dataset's utility, the authors ran baseline experiments with established scene text detection methods, including CTPN, TextBoxes++, and EAST, and used cross-dataset validation to compare performance on related datasets such as CTW, RCTW, and the ICPR 2018 MTWI challenge dataset. The methods were tested across the challenging categories within the dataset. The experiments indicate that models trained on datasets not specifically oriented towards Chinese text perform suboptimally on ShopSign, which underscores the dataset's value as a test of detector robustness on Chinese text. The results also suggest that language-specific factors influence detection performance, supporting the case for datasets tailored to the complexities of Chinese characters.
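As a rough illustration of how detection baselines of this kind are often scored, the sketch below computes precision, recall, and F-measure by matching predicted boxes to ground-truth boxes. It rests on assumptions that do not come from the paper or this summary: axis-aligned boxes, greedy one-to-one matching, and an IoU threshold of 0.5. The function names `iou` and `detection_scores` are hypothetical, and the paper's actual evaluation protocol may differ.

```python
from typing import List, Tuple

# Assumption: axis-aligned boxes as (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    pred: List[Box], gt: List[Box], thr: float = 0.5
) -> Tuple[float, float, float]:
    """Greedy one-to-one matching; thr=0.5 is an assumed threshold."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    for p in pred:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gt):
            if j in matched:
                continue
            value = iou(p, g)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gt) if gt else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_measure
```

Running the same scoring function on a detector's output for ShopSign and for another dataset would give a like-for-like comparison in the spirit of the cross-dataset validation described above, provided the annotations are first converted to a common box format.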

On the research side, ShopSign serves as a benchmark and draws attention to obstacles specific to Chinese text recognition, such as large character sets and imbalanced data. On the practical side, the dataset is relevant to applications that require accurate text recognition in natural scenes, including urban planning, autonomous navigation, and digital archiving in Chinese contexts.

Looking ahead, the creators of ShopSign suggest developing larger-scale synthetic datasets and applying generative techniques such as GANs to produce complex Chinese text scenes. Addressing the data sparsity and class imbalance within the dataset also remains important, and the paper presents synthetic data creation as one way to help machine learning models overcome these challenges and improve character recognition.
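To make the sparsity and imbalance concern concrete, the following sketch measures how unevenly characters are distributed across a set of text-line transcriptions. It is a minimal illustration, not the authors' analysis: the function name `char_imbalance`, the `rare_threshold` parameter, and the toy input are all hypothetical, and the ShopSign annotation format is not assumed here.

```python
from collections import Counter
from typing import Dict, Iterable


def char_imbalance(lines: Iterable[str], rare_threshold: int = 2) -> Dict[str, float]:
    """Summarize character-frequency imbalance in transcribed text lines.

    rare_threshold is an illustrative cutoff: characters seen fewer than
    this many times are counted as rare.
    """
    counts = Counter(ch for line in lines for ch in line if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return {"distinct": 0, "rare_fraction": 0.0, "top1_share": 0.0}
    rare = sum(1 for c in counts.values() if c < rare_threshold)
    return {
        "distinct": len(counts),                 # number of character classes
        "rare_fraction": rare / len(counts),     # share of classes that are rare
        "top1_share": counts.most_common(1)[0][1] / total,  # mass of the most frequent class
    }


# Toy transcriptions, invented for illustration only.
stats = char_imbalance(["超市", "超市", "超级市场"])
```

A high `rare_fraction` combined with a large `distinct` count is the kind of long-tailed distribution that synthetic data generation is meant to offset, since rare character classes can be rendered as often as needed.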

In conclusion, ShopSign is a resource built to advance research in Chinese scene text detection and recognition. It fills a gap in the field by providing extensive real-world data and by encouraging the generation and use of synthetic data to cope with the linguistic complexity of Chinese. The authors hope that the availability of ShopSign will drive further innovation and improved methods in Chinese Photo OCR.

Authors (7)
  1. Chongsheng Zhang (8 papers)
  2. Guowen Peng (2 papers)
  3. Yuefeng Tao (1 paper)
  4. Feifei Fu (1 paper)
  5. Wei Jiang (341 papers)
  6. George Almpanidis (3 papers)
  7. Ke Chen (241 papers)
Citations (5)