Persis: A Persian Font Recognition Pipeline Using Convolutional Neural Networks (2310.05255v2)
Abstract: What happens if we encounter a suitable font for our design work but do not know its name? Visual Font Recognition (VFR) systems identify the typeface used in an image. They can help graphic designers name the fonts they see, and they can also improve the speed and accuracy of Optical Character Recognition (OCR) systems. In this paper, we introduce the first publicly available datasets in the field of Persian font recognition and employ Convolutional Neural Networks (CNNs) to address this problem. The proposed pipeline obtains 78.0% top-1 accuracy on our new datasets, 89.1% on the IDPL-PFOD dataset, and 94.5% on the KAFD dataset. The average time spent in the entire pipeline for one sample of our proposed datasets is 0.54 seconds on CPU and 0.017 seconds on GPU. We conclude that CNN methods can recognize Persian fonts without additional pre-processing steps such as feature extraction, binarization, or normalization.
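The abstract reports two kinds of numbers: top-1 accuracy and average time per sample. As a rough illustration of how such figures are commonly computed, the sketch below assumes that predictions and labels are integer class indices and that the pipeline is any callable applied to one sample at a time. The function names `top1_accuracy` and `mean_seconds_per_sample` are hypothetical and are not taken from the paper.

```python
import time
from typing import Callable, Sequence


def top1_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of samples whose single best predicted class matches the label."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def mean_seconds_per_sample(
    pipeline: Callable[[object], object], samples: Sequence[object]
) -> float:
    """Average wall-clock time of one pipeline call, measured over all samples."""
    if len(samples) == 0:
        raise ValueError("need at least one sample")
    start = time.perf_counter()
    for sample in samples:
        pipeline(sample)  # assumed: the full pipeline applied to a single sample
    return (time.perf_counter() - start) / len(samples)


if __name__ == "__main__":
    # Toy labels only; these are not results from the paper.
    print(top1_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))
    print(mean_seconds_per_sample(lambda image: image, [1, 2, 3]))
```

Wall-clock timing like this depends on the hardware and on warm-up effects, so it is only comparable across runs made under the same conditions.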