- The paper demonstrates a streamlined 19-layer HCCR-GoogLeNet that integrates directional feature maps to enhance offline handwritten Chinese character recognition.
- The methodology combines deep CNN architecture with traditional Gabor, gradient, and HoG features, significantly improving recognition accuracy.
- Experimental results on the ICDAR 2013 dataset show recognition accuracies up to 96.74%, setting new benchmarks for efficiency and performance in HCCR.
An Examination of Advanced Offline Handwritten Chinese Character Recognition via GoogLeNet and Directional Feature Maps
The paper "High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps" explores the advancement of Handwritten Chinese Character Recognition (HCCR) by leveraging deep Convolutional Neural Networks (CNNs), specifically a modified variant of GoogLeNet. The paper addresses the high complexity associated with the recognition task due to the vast number of classes, diverse handwriting styles, and the intrinsic similarity among many characters.
Methodological Contributions
The authors propose a streamlined version of the GoogLeNet architecture tailored to HCCR (denoted HCCR-GoogLeNet): a 19-layer deep model with only 7.26 million parameters. This represents a considerable reduction in model size without sacrificing depth or performance. The design builds on GoogLeNet's Inception module, which allows a deeper configuration while sustaining computational efficiency.
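The parameter savings come largely from the Inception module's 1x1 "bottleneck" convolutions, which shrink the channel count before expensive large-kernel convolutions. A minimal sketch of the arithmetic (the channel sizes below are illustrative, not the paper's exact HCCR-GoogLeNet configuration):

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a k x k convolution (ignoring biases)."""
    return in_ch * out_ch * k * k

# A naive 5x5 convolution mapping 192 input channels to 32 outputs:
naive = conv_params(192, 32, 5)           # 192 * 32 * 25 = 153,600

# The Inception trick: first reduce channels with a 1x1 convolution
# (192 -> 16), then apply the 5x5 convolution on the reduced volume:
bottleneck = conv_params(192, 16, 1) + conv_params(16, 32, 5)
# 192*16*1 + 16*32*25 = 3,072 + 12,800 = 15,872

print(naive, bottleneck)  # 153600 15872, roughly a 10x reduction
```

Stacking many such modules is how a 19-layer network can stay under 7.26 million parameters.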
Additionally, the paper emphasizes the effectiveness of traditional feature extraction techniques. Incorporating Gabor, gradient, and Histogram of Oriented Gradients (HoG) feature maps into the CNN's input layer enhances the recognition capability. The Gabor features particularly stand out, providing the most significant improvement in accuracy when integrated into the HCCR-GoogLeNet model.
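The gradient-based directional maps can be illustrated with a short numpy sketch: the image's gradient field is decomposed into a fixed number of directional planes, which are then stacked with the raw image as extra CNN input channels. This is a simplified illustration under my own assumptions (central-difference gradients, nearest-bin assignment), not the paper's exact extraction pipeline:

```python
import numpy as np

def gradient_direction_maps(img, n_dirs=8):
    """Split the gradient field of `img` into n_dirs directional planes:
    each pixel's gradient magnitude is assigned to the plane whose
    direction bin contains the gradient angle."""
    gy, gx = np.gradient(img.astype(np.float64))   # per-axis gradients
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)    # angle in [0, 2*pi)
    bins = np.floor(ang / (2 * np.pi / n_dirs)).astype(int) % n_dirs
    maps = np.zeros((n_dirs,) + img.shape)
    for d in range(n_dirs):
        maps[d][bins == d] = mag[bins == d]        # route magnitude to its bin
    return maps

# Stack the raw image with its directional planes as multi-channel input:
img = np.zeros((120, 120))
img[40:80, 55:65] = 1.0                            # a crude vertical stroke
feats = gradient_direction_maps(img)
cnn_input = np.concatenate([img[None], feats], axis=0)
print(cnn_input.shape)  # (9, 120, 120)
```

Gabor and HoG maps would be computed analogously and appended as further channels, letting the network see both raw pixels and hand-crafted directional evidence.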
Results and Evaluation
The experiments, conducted on the ICDAR 2013 offline HCCR competition dataset, show that ensembles of HCCR-GoogLeNet models achieve recognition accuracies of 96.64% and 96.74%. These results set new benchmarks in the domain, surpassing previous models such as the ATR-CNN by a notable margin. Integrating the traditional directional feature maps into these CNN architectures demonstrates a clear improvement over using raw pixel data alone.
Implications and Future Directions
The results established by HCCR-GoogLeNet underscore the potential for combining deep learning techniques with traditional feature extraction strategies to tackle complex pattern recognition challenges. The slim yet deep architecture highlights an essential trajectory for future model developments in domains requiring high accuracy and efficiency, such as mobile applications where computational resources might be constrained.
The paper's noteworthy contribution is its validation of hybrid approaches in domain-specific applications, urging further investigation into how classical methodologies can refine modern deep learning paradigms. Extending this research could drive advances in other tasks with similarly large numbers of classes and lead to more lightweight models with robust performance.
Looking forward, continued refinement of such hybrid models could capitalize on emerging training techniques, novel architectures, and efficient feature extraction methods. These developments could narrow the gap toward human-level accuracy in handwritten Chinese character recognition and open avenues for practical real-world deployment. As AI progresses, integrating domain-specific insights with deep learning architecture design may well be the key approach for complex tasks involving high variability and extensive class structures.