Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
72 tokens/sec
GPT-4o
61 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
8 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

A CNN Based Scene Chinese Text Recognition Algorithm With Synthetic Data Engine (1604.01891v1)

Published 7 Apr 2016 in cs.CV

Abstract: Scene text recognition plays an important role in many computer vision applications. The small size of available public available scene text datasets is the main challenge when training a text recognition CNN model. In this paper, we propose a CNN based Chinese text recognition algorithm. To enlarge the dataset for training the CNN model, we design a synthetic data engine for Chinese scene character generation, which generates representative character images according to the fonts use frequency of Chinese texts. As the Chinese text is more complex, the English text recognition CNN architecture is modified for Chinese text. To ensure the small size nature character dataset and the large size artificial character dataset are comparable in training, the CNN model are trained progressively. The proposed Chinese text recognition algorithm is evaluated with two Chinese text datasets. The algorithm achieves better recognize accuracy compared to the baseline methods.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Xiaohang Ren (4 papers)
  2. Kai Chen (512 papers)
  3. Jun Sun (210 papers)
Citations (17)