DenseBAM-GI: Attention Augmented DenseNet with momentum aided GRU for HMER (2306.16482v1)
Abstract: Handwritten Mathematical Expression Recognition (HMER) is crucial in digital education and scholarly research. However, it is difficult to accurately determine the length of a handwritten mathematical expression and the complex spatial relationships among its symbols. In this study, we present a novel encoder-decoder architecture (DenseBAM-GI) for HMER, in which the encoder uses a Bottleneck Attention Module (BAM) to improve feature representation and the decoder uses a Gated Input-GRU (GI-GRU) unit with an extra gate to ease the decoding of long and complex expressions. The proposed model is an efficient and lightweight architecture whose performance is comparable to state-of-the-art models in terms of Expression Recognition Rate (ExpRate). It also performs better in terms of top 1, 2, and 3 error accuracy across the CROHME 2014, 2016, and 2019 datasets. DenseBAM-GI achieves the best ExpRate among all models on the CROHME 2019 dataset. Importantly, these results are achieved with lower computational complexity and a reduced need for GPU memory.
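The abstract does not define the evaluation metrics, so the following is only a minimal sketch of how the Expression Recognition Rate and its error-tolerant variants might be computed. It assumes that an expression counts as correct when its predicted token sequence is within a given number of token-level edits of the ground truth, with zero edits meaning an exact match. The function names `edit_distance` and `exprate`, the `max_errors` parameter, and the sample token sequences are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def exprate(preds: Sequence[Sequence[str]],
            targets: Sequence[Sequence[str]],
            max_errors: int = 0) -> float:
    """Percentage of expressions recognised with at most `max_errors` token errors.

    Assumption: max_errors=0 corresponds to exact-match accuracy, and
    max_errors=1, 2, 3 correspond to the error-tolerant variants.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(edit_distance(p, t) <= max_errors for p, t in zip(preds, targets))
    return 100.0 * hits / len(targets)


# Hypothetical LaTeX token sequences, for illustration only.
targets = [["x", "^", "2"],
           ["a", "+", "b"],
           ["\\frac", "{", "1", "}", "{", "2", "}"],
           ["y"]]
preds = [["x", "^", "2"],
         ["a", "-", "b"],
         ["\\frac", "{", "1", "}", "{", "3"],
         ["y"]]

for k in range(4):
    print(f"at most {k} errors: {exprate(preds, targets, max_errors=k):.1f}%")
```

Using token-level edit distance is a design choice made for this sketch; an evaluation tool may count errors differently, for example on symbol and relation labels, in which case only the body of `edit_distance` would need to change.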